SentiLR: Linguistic Knowledge Enhanced Language Representation for Sentiment Analysis

Abstract

Most existing pre-trained language representation models neglect the linguistic knowledge of texts, whereas we argue that such knowledge can promote language understanding in various NLP tasks. In this paper, we propose a novel language representation model called SentiLR, which introduces word-level linguistic knowledge, including part-of-speech tags and prior sentiment polarity from SentiWordNet, to benefit downstream sentiment analysis tasks. During pre-training, we first acquire the prior sentiment polarity of each word by querying the SentiWordNet dictionary with its part-of-speech tag. Then, we devise a new pre-training task called label-aware masked language model (LA-MLM) consisting of two subtasks: 1) word knowledge recovery given the sentence-level label; 2) sentence-level label prediction with linguistic-knowledge-enhanced context. Experiments show that SentiLR achieves state-of-the-art performance on several sentence-level and aspect-level sentiment analysis tasks after fine-tuning, and also obtains comparable results on general language understanding tasks.
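
To illustrate the lookup step described in the abstract (querying a sentiment lexicon with a word and its part-of-speech tag), here is a minimal sketch. It is not the authors' implementation: the lexicon `TOY_LEXICON`, its scores, the function names, and the rule of comparing positive against negative scores are all assumptions made for illustration. A real setup would read scores from SentiWordNet itself.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon keyed by (lemma, POS tag) -> (positive score, negative score).
# The entries and scores are invented for illustration; they are not SentiWordNet values.
TOY_LEXICON: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("good", "a"): (0.75, 0.0),
    ("terrible", "a"): (0.0, 0.625),
    ("movie", "n"): (0.0, 0.0),
}


def prior_polarity(word: str, pos: str) -> str:
    """Return a coarse prior polarity for a (word, POS) pair.

    Assumption: polarity is decided by comparing the positive and negative
    scores; words missing from the lexicon are treated as neutral.
    """
    pos_score, neg_score = TOY_LEXICON.get((word.lower(), pos), (0.0, 0.0))
    if pos_score > neg_score:
        return "positive"
    if neg_score > pos_score:
        return "negative"
    return "neutral"


def tag_sentence(tokens: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Attach a prior polarity to each (word, POS) token of a sentence."""
    return [(word, pos, prior_polarity(word, pos)) for word, pos in tokens]


if __name__ == "__main__":
    sentence = [("Good", "a"), ("movie", "n"), ("unknownword", "n")]
    for word, pos, polarity in tag_sentence(sentence):
        print(word, pos, polarity)
```

The part-of-speech tag is part of the lookup key because the same surface form can carry different polarity in different grammatical roles.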
