Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

  • 2020-09-17 13:00:13
  • Ieva Staliūnaitė, Ignacio Iacobacci

Abstract

Many NLP tasks have benefited from transferring knowledge from contextualized word embeddings; however, the picture of what type of knowledge is transferred is incomplete. This paper studies the types of linguistic phenomena accounted for by language models in the context of a Conversational Question Answering (CoQA) task. We identify the problematic areas for the finetuned RoBERTa, BERT and DistilBERT models through systematic error analysis: basic arithmetic (counting phrases), compositional semantics (negation and Semantic Role Labeling), and lexical semantics (surprisal and antonymy). When enhanced with the relevant linguistic knowledge through multitask learning, the models improve in performance. Ensembles of the enhanced models yield a boost between 2.2 and 2.7 points in F1 score overall, and up to 42.1 points in F1 on the hardest question classes. The results show differences in ability to represent compositional and lexical information between RoBERTa, BERT and DistilBERT.

 
