Interpreting Language Models Through Knowledge Graph Extraction

Abstract

Transformer-based language models trained on large text corpora have enjoyed immense popularity in the natural language processing community and are commonly used as a starting point for downstream tasks. While these models are undeniably useful, it is a challenge to quantify their performance beyond traditional accuracy metrics. In this paper, we compare BERT-based language models through snapshots of acquired knowledge at sequential stages of the training process. Structured relationships from training corpora may be uncovered through querying a masked language model with probing tasks. We present a methodology to unveil a knowledge acquisition timeline by generating knowledge graph extracts from cloze "fill-in-the-blank" statements at various stages of RoBERTa's early training. We extend this analysis to a comparison of pretrained variations of BERT models (DistilBERT, BERT-base, RoBERTa). This work proposes a quantitative framework to compare language models through knowledge graph extraction (GED, Graph2Vec) and showcases a part-of-speech analysis (POSOR) to identify the linguistic strengths of each model variant. Using these metrics, machine learning practitioners can compare models, diagnose their models' behavioral strengths and weaknesses, and identify new targeted datasets to improve model performance.
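The idea of turning cloze predictions into a knowledge graph and then comparing graphs across training snapshots can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's implementation: the names `extract_graph` and `labeled_edit_distance` are hypothetical, the two lookup tables stand in for masked language model checkpoints, and the distance is a simplified edit count over uniquely labeled nodes and edges, not the GED or Graph2Vec procedure the paper uses.

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]            # (subject, relation, object)
Cloze = Tuple[str, str, str]             # (subject, relation, cloze statement)


def extract_graph(statements: List[Cloze],
                  fill_mask: Callable[[str], str]) -> Set[Triple]:
    """Build a set of triples by asking a model to fill each blank.

    `fill_mask` is a hypothetical stand-in for a masked language model:
    it takes a cloze statement and returns the predicted token.
    """
    return {(subj, rel, fill_mask(text)) for subj, rel, text in statements}


def graph_nodes(graph: Set[Triple]) -> Set[str]:
    """Collect every subject and object appearing in the triples."""
    return {node for subj, _, obj in graph for node in (subj, obj)}


def labeled_edit_distance(g1: Set[Triple], g2: Set[Triple]) -> int:
    """Count node and edge insertions/deletions needed to turn g1 into g2.

    Assumes nodes are identified by their labels and every edit costs 1,
    so the count reduces to two symmetric differences.
    """
    return len(graph_nodes(g1) ^ graph_nodes(g2)) + len(g1 ^ g2)


# Toy cloze statements (illustrative, not taken from the paper).
statements: List[Cloze] = [
    ("Paris", "capital_of", "Paris is the capital of <mask>."),
    ("dog", "is_a", "A dog is a kind of <mask>."),
]

# Lookup tables standing in for an early and a late training checkpoint.
early_model = {
    "Paris is the capital of <mask>.": "city",
    "A dog is a kind of <mask>.": "animal",
}
late_model = {
    "Paris is the capital of <mask>.": "France",
    "A dog is a kind of <mask>.": "animal",
}

early_graph = extract_graph(statements, early_model.__getitem__)
late_graph = extract_graph(statements, late_model.__getitem__)
distance = labeled_edit_distance(early_graph, late_graph)
print(distance)
```

In this toy setup the two snapshots differ only in the object predicted for the first statement, so the distance reflects one swapped node and one swapped edge. With a real model, `fill_mask` would wrap a masked language model's top prediction for each statement.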
