DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning

Abstract

Although pre-trained language models (PLMs) have achieved state-of-the-art performance on various natural language processing (NLP) tasks, they have been shown to lack knowledge when dealing with knowledge-driven tasks. Despite the many efforts made to inject knowledge into PLMs, this problem remains open. To address the challenge, we propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge, which is easier to acquire than a knowledge graph (KG). During pre-training, we present two novel pre-training tasks to inject dictionary knowledge into PLMs via contrastive learning: dictionary entry prediction and entry description discrimination. In fine-tuning, we use the pre-trained DictBERT as a plugin knowledge base (KB) to retrieve implicit knowledge for identified entries in an input sequence, and infuse the retrieved knowledge into the input to enhance its representation via a novel extra-hop attention mechanism. We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE. Experimental results demonstrate that our model can significantly improve typical PLMs: it gains a substantial improvement of 0.5%, 2.9%, 9.0%, 7.1% and 3.3% over BERT-large on these tasks, respectively, and is also effective on RoBERTa-large.
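
To make the contrastive idea concrete, the following is a minimal sketch of how an entry representation could be pulled toward its own description and pushed away from other descriptions. The abstract does not state which loss is used, so an InfoNCE-style objective is assumed here purely for illustration; the function names, the temperature value, and the toy two-dimensional vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed objective, not confirmed by the paper).

    Returns the negative log-probability of picking `positive` out of
    `positive` plus `negatives`, scored by temperature-scaled cosine
    similarity to `anchor`.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


# Toy vectors standing in for encoder outputs (hypothetical values).
entry = [1.0, 0.0]                           # representation of a dictionary entry
description = [0.9, 0.1]                     # representation of its own description
other_descriptions = [[0.0, 1.0], [-1.0, 0.2]]

matched_loss = info_nce_loss(entry, description, other_descriptions)
mismatched_loss = info_nce_loss(
    entry, other_descriptions[0], [description, other_descriptions[1]]
)
```

With these toy values the matched pairing yields a much smaller loss than the mismatched one, which is the behaviour a contrastive objective of this kind is meant to reward.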
