Neural Machine Translation Models Can Learn to be Few-shot Learners

Abstract

Large Language Models exhibit an emergent ability to use a small number of examples to learn to perform in novel domains and tasks, also called in-context learning (ICL). In this work, we show that a much smaller model can be trained to perform ICL by fine-tuning towards a specialized training objective, exemplified on the task of domain adaptation for neural machine translation. With this capacity for ICL, the model can take advantage of relevant few-shot examples to adapt its output towards the domain. We compare the quality of this domain adaptation to traditional supervised techniques and ICL with a 40B-parameter Large Language Model. Our approach allows efficient batch inference on a mix of domains and outperforms state-of-the-art baselines in terms of both translation quality and immediate adaptation rate, i.e. the ability to reproduce a specific term after being shown a single example.
