CERT: Contrastive Self-supervised Learning for Language Understanding

Abstract

Pretrained language models such as BERT and GPT have shown great effectiveness in language understanding. The auxiliary predictive tasks in existing pretraining approaches are mostly defined on tokens, and thus may not capture sentence-level semantics very well. To address this issue, we propose CERT: Contrastive self-supervised Encoder Representations from Transformers, which pretrains language representation models using contrastive self-supervised learning at the sentence level. CERT creates augmentations of original sentences using back-translation. Then it finetunes a pretrained language encoder (e.g., BERT) by predicting whether two augmented sentences originate from the same sentence. CERT is simple to use and can be flexibly plugged into any pretraining-finetuning NLP pipeline. We evaluate CERT on 11 natural language understanding tasks in the GLUE benchmark, where CERT outperforms BERT on 7 tasks, achieves the same performance as BERT on 2 tasks, and performs worse than BERT on 2 tasks. On the averaged score of the 11 tasks, CERT outperforms BERT. The data and code are available at https://github.com/UCSD-AI4H/CERT
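
As an illustration of the pair-prediction objective described in the abstract, the following is a minimal sketch of how augmented sentences might be labeled by whether they share an original sentence. It covers only the labeling step, not the encoder or the contrastive loss. The function name `build_contrastive_pairs`, the source ids, and the example sentences are assumptions made for illustration, and the augmentations are hard-coded stand-ins for back-translation output.

```python
from itertools import combinations


def build_contrastive_pairs(augmented_by_source):
    """Return (sentence_a, sentence_b, label) triples.

    label is 1 when both augmentations come from the same original
    sentence and 0 otherwise. Hypothetical helper, not taken from the
    CERT code base.
    """
    # Flatten to (source_id, augmented_sentence) records.
    records = [
        (source_id, sentence)
        for source_id, sentences in augmented_by_source.items()
        for sentence in sentences
    ]
    pairs = []
    # Every unordered pair of augmentations becomes one training example.
    for (src_a, sent_a), (src_b, sent_b) in combinations(records, 2):
        pairs.append((sent_a, sent_b, 1 if src_a == src_b else 0))
    return pairs


# Assumed example data: stand-ins for back-translated augmentations.
augmented = {
    "s1": ["The film was great.", "The movie was excellent."],
    "s2": ["It rained all day.", "Rain fell the whole day."],
}
pairs = build_contrastive_pairs(augmented)
positives = [p for p in pairs if p[2] == 1]
negatives = [p for p in pairs if p[2] == 0]
```

With two original sentences and two augmentations each, this yields six pairs in total, two positive and four negative. Enumerating all pairs grows quadratically with the number of augmentations, so a real pipeline would likely sample negatives rather than list them exhaustively.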
