Abstract
In recent years, pre-trained large language models have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. However, existing literature has highlighted the sensitivity of this capability to the selection of few-shot demonstrations. The underlying mechanisms by which this capability arises from regular language model pretraining objectives remain poorly understood. In this study, we aim to examine the in-context learning phenomenon through a Bayesian lens, viewing large language models as topic models that implicitly infer task-related information from demonstrations. On this premise, we propose an algorithm for selecting optimal demonstrations from a set of annotated data and demonstrate a significant 12.5% improvement relative to the random selection baseline, averaged over eight GPT2 and GPT3 models on eight different real-world text classification datasets. Our empirical findings support our hypothesis that large language models implicitly infer a latent concept variable.