Abstract
Language modeling studies the probability distributions over strings of text. It is one of the most fundamental tasks in natural language processing (NLP) and is widely used in text generation, speech recognition, machine translation, and other applications. Conventional language models (CLMs) aim to predict the probability of linguistic sequences in a causal manner. In contrast, pre-trained language models (PLMs) cover broader concepts and can be used in both causal sequential modeling and fine-tuning for downstream applications. PLMs have their own training paradigms (usually self-supervised) and serve as foundation models in modern NLP systems. This overview paper provides an introduction to both CLMs and PLMs from five aspects, i.e., linguistic units, structures, training methods, evaluation methods, and applications. Furthermore, we discuss the relationship between CLMs and PLMs and shed light on future directions of language modeling in the pre-trained era.