KALA: Knowledge-Augmented Language Model Adaptation

Abstract

Pre-trained language models (PLMs) have achieved remarkable success on various natural language understanding tasks. Simple fine-tuning of PLMs, on the other hand, might be suboptimal for domain-specific tasks because they cannot possibly cover knowledge from all domains. While adaptive pre-training of PLMs can help them obtain domain-specific knowledge, it incurs a large training cost. Moreover, adaptive pre-training can harm the PLM's performance on the downstream task by causing catastrophic forgetting of its general knowledge. To overcome such limitations of adaptive pre-training for PLM adaptation, we propose a novel domain adaptation framework for PLMs, called Knowledge-Augmented Language model Adaptation (KALA), which modulates the intermediate hidden representations of PLMs with domain knowledge, consisting of entities and their relational facts. We validate the performance of KALA on question answering and named entity recognition tasks on multiple datasets across various domains. The results show that, despite being computationally efficient, KALA largely outperforms adaptive pre-training. Code is available at: https://github.com/Nardien/KALA/.
