Is Your Language Model Ready for Dense Representation Fine-tuning?

  • 2021-04-16 17:36:44
  • Luyu Gao, Jamie Callan

Abstract

Pre-trained language models (LM) have become go-to text representation encoders. Prior research used deep LMs to encode text sequences such as sentences and passages into single dense vector representations. These dense representations have been used in efficient text comparison and embedding-based retrieval. However, dense encoders suffer in low resource situations. Many techniques have been developed to solve this problem. Despite their success, not much is known about why this happens. This paper shows that one cause lies in the readiness of the LM to expose its knowledge through dense representation in fine-tuning, which we term Optimization Readiness. To validate the theory, we present Condenser, a general pre-training architecture based on Transformer LMs, to improve dense optimization readiness. We show that fine-tuning from Condenser significantly improves performance for small and/or noisy training sets.

 
