DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding

Abstract

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models into which relation triples from knowledge graphs are injected to improve language understanding abilities. To guarantee effective knowledge injection, previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs. The operations for knowledge retrieval and encoding bring significant computational burdens, restricting the usage of such models in real-world applications that require high inference speed. In this paper, we propose a novel KEPLM named DKPLM that Decomposes the Knowledge injection process of Pre-trained Language Models in the pre-training, fine-tuning and inference stages, which facilitates the application of KEPLMs in real-world scenarios. Specifically, we first detect knowledge-aware long-tail entities as the targets for knowledge injection, enhancing the KEPLMs' semantic understanding abilities and avoiding the injection of redundant information. The embeddings of long-tail entities are replaced by "pseudo token representations" formed from relevant knowledge triples. We further design a relational knowledge decoding task for pre-training that forces the model to truly understand the injected knowledge through relation triple reconstruction. Experiments show that our model significantly outperforms other KEPLMs on zero-shot knowledge probing tasks and multiple knowledge-aware language understanding tasks. We further show that DKPLM has a higher inference speed than other competing models due to the decomposing mechanism.
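
The replacement of long-tail entity embeddings by "pseudo token representations" can be illustrated with a minimal sketch. The abstract does not specify how long-tail entities are detected or how the triples are combined, so everything below is an assumption made for illustration: the frequency threshold in `find_long_tail`, the TransE-style translation (head + relation ≈ tail) in `pseudo_token_embedding`, and all function and variable names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def find_long_tail(entity_mentions, max_count=1):
    """Return entities mentioned at most `max_count` times.

    ASSUMPTION: a plain frequency cut-off stands in for the paper's
    "knowledge-aware" long-tail detection, which the abstract does not define.
    """
    counts = Counter(entity_mentions)
    return {entity for entity, count in counts.items() if count <= max_count}


def pseudo_token_embedding(entity, triples, ent_emb, rel_emb):
    """Estimate an embedding for `entity` from its (head, relation, tail) triples.

    ASSUMPTION: a TransE-style relation head + relation ≈ tail, so that
      head ≈ tail - relation   and   tail ≈ head + relation.
    The estimates from all matching triples are averaged.
    Returns None if the entity appears in no triple.
    """
    estimates = []
    for head, relation, tail in triples:
        if head == entity:
            estimates.append(
                [t - r for t, r in zip(ent_emb[tail], rel_emb[relation])]
            )
        elif tail == entity:
            estimates.append(
                [h + r for h, r in zip(ent_emb[head], rel_emb[relation])]
            )
    if not estimates:
        return None
    n = len(estimates)
    # Average the estimates dimension by dimension.
    return [sum(column) / n for column in zip(*estimates)]


# Toy data (hypothetical): one frequent entity and one long-tail entity.
mentions = ["Paris", "Paris", "Paris", "Timbuktu"]
triples = [("Timbuktu", "located_in", "Mali")]
ent_emb = {"Mali": [1.0, 2.0]}
rel_emb = {"located_in": [0.5, 0.5]}

long_tail = find_long_tail(mentions)
replacements = {
    entity: pseudo_token_embedding(entity, triples, ent_emb, rel_emb)
    for entity in long_tail
}
```

In this toy setup only "Timbuktu" is treated as long-tail, and its replacement vector is built purely from the embeddings of its neighbouring relation and entity. No separate knowledge encoder is involved in the sketch, which is in the spirit of the efficiency argument made in the abstract.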
