The broader spectrum of in-context learning

Abstract

The ability of language models to learn a task from a few examples in context has generated substantial interest. Here, we provide a perspective that situates this type of supervised few-shot learning within a much broader spectrum of meta-learned in-context learning. Indeed, we suggest that any distribution of sequences in which context non-trivially decreases loss on subsequent predictions can be interpreted as eliciting a kind of in-context learning. We suggest that this perspective helps to unify the broad set of in-context abilities that language models exhibit -- such as adapting to tasks from instructions or role play, or extrapolating time series. This perspective also sheds light on potential roots of in-context learning in lower-level processing of linguistic dependencies (e.g., coreference or parallel structures). Finally, taking this perspective highlights the importance of generalization, which we suggest can be studied along several dimensions: not only the ability to learn something novel, but also flexibility in learning from different presentations, and in applying what is learned. We discuss broader connections to past literature in meta-learning and goal-conditioned agents, and other perspectives on learning and adaptation. We close by suggesting that research on in-context learning should consider this broader spectrum of in-context capabilities and types of generalization.
