LERT: A Linguistically-motivated Pre-trained Language Model

Abstract

Pre-trained language models (PLMs) have become representative foundation models in the natural language processing field. Most PLMs are trained with linguistic-agnostic pre-training tasks on the surface form of the text, such as the masked language model (MLM). To further empower PLMs with richer linguistic features, in this paper we propose a simple but effective way to learn linguistic features for pre-trained language models. We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original MLM pre-training task, using a linguistically-informed pre-training (LIP) strategy. We carry out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT brings significant improvements over various comparable baselines. Furthermore, we also conduct analytical experiments on various linguistic aspects, and the results show that the design of LERT is valid and effective. Resources are available at https://github.com/ymcui/LERT
