Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach

Abstract

Fine-tuned pre-trained language models (LMs) achieve enormous success in many natural language processing (NLP) tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision. Underpinned by contrastive regularization and confidence-based reweighting, this contrastive self-training framework can gradually improve model fitting while effectively suppressing error propagation. Experiments on sequence, token, and sentence pair classification tasks show that our model outperforms the strongest baseline by large margins on 7 benchmarks in 6 tasks, and achieves competitive performance with fully-supervised fine-tuning methods.
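
To make the idea of confidence-based reweighting concrete, here is a minimal Python sketch. The abstract does not say how confidence is measured, so the normalized-entropy weight, the `threshold` parameter, and the function names `confidence_weight` and `reweighted_pseudo_labels` are all assumptions made for illustration, not a description of the COSINE implementation. The sketch shows how low-confidence pseudo-labels could be down-weighted or dropped during self-training so that noisy predictions contribute less to later updates.

```python
import math


def confidence_weight(probs):
    """Assumed confidence score: 1 - H(p) / log(C).

    Equals 1.0 for a one-hot prediction and 0.0 for a uniform one.
    """
    num_classes = len(probs)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(num_classes)


def reweighted_pseudo_labels(batch_probs, threshold=0.5):
    """Keep predictions whose weight exceeds `threshold` (a hypothetical cutoff).

    Returns a list of (sample_index, pseudo_label, weight) tuples.
    """
    selected = []
    for i, probs in enumerate(batch_probs):
        weight = confidence_weight(probs)
        if weight > threshold:
            # The pseudo-label is the class with the highest predicted probability.
            label = max(range(len(probs)), key=probs.__getitem__)
            selected.append((i, label, weight))
    return selected


# Illustrative (made-up) model outputs for three unlabeled samples.
batch = [
    [0.98, 0.01, 0.01],  # confident, kept
    [0.40, 0.30, 0.30],  # nearly uniform, dropped
    [0.02, 0.97, 0.01],  # confident, kept
]
selected = reweighted_pseudo_labels(batch)
```

In a training loop, each returned weight would typically multiply the per-sample loss, so that confident pseudo-labels dominate the update while uncertain ones are ignored.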
