In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

Abstract

Large language models (LLMs) demonstrate emergent in-context learning capabilities, where they adapt to new tasks based on example demonstrations. However, in-context learning has seen limited effectiveness in many settings, is difficult to control quantitatively, and takes up context window space. To overcome these limitations, we propose an alternative approach that recasts in-context learning as in-context vectors (ICV). Using ICV involves two steps. We first use a forward pass on demonstration examples to create the in-context vector from the latent embedding of the LLM. This vector captures essential information about the intended task. On a new query, instead of adding demonstrations to the prompt, we shift the latent states of the LLM using the ICV. The ICV approach has several benefits: 1) it enables the LLM to follow the demonstration examples more effectively; 2) it is easy to control by adjusting the magnitude of the ICV; 3) it reduces the length of the prompt by removing the in-context demonstrations; 4) it is computationally much more efficient than fine-tuning. We demonstrate that ICV achieves better performance than standard in-context learning and fine-tuning on diverse tasks including safety, style transfer, role-playing, and formatting. Moreover, we show that we can flexibly teach an LLM to follow different types of instructions simultaneously through simple vector arithmetic on the corresponding ICVs.
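
The two steps described above can be illustrated with a minimal sketch on toy arrays. It is not the paper's implementation: it assumes, purely for illustration, that the vector is the mean difference between latent states of demonstration targets and sources, and that steering is a plain additive shift scaled by a strength parameter. The names `build_icv`, `steer`, and `strength`, as well as all numbers, are hypothetical.

```python
import numpy as np

def build_icv(source_states, target_states):
    """Average the per-demonstration shift (target - source) in latent space.

    Assumption: a mean difference stands in for however the in-context
    vector is actually extracted from the model's latent embedding.
    """
    source_states = np.asarray(source_states, dtype=float)
    target_states = np.asarray(target_states, dtype=float)
    return (target_states - source_states).mean(axis=0)

def steer(hidden_states, icv, strength=1.0):
    """Shift every latent state by strength * icv (broadcast over positions)."""
    return np.asarray(hidden_states, dtype=float) + strength * icv

# Toy latent states: 3 demonstrations, hidden size 4 (made-up numbers).
source = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
target = source + np.array([0.0, 0.0, 0.0, 2.0])

# Step 1: build the vector from the demonstrations.
icv_style = build_icv(source, target)

# Step 2: shift the latent states of a new query instead of extending the prompt.
query_states = np.zeros((2, 4))  # 2 token positions of a new query
steered = steer(query_states, icv_style, strength=0.5)

# Combining two behaviours by plain vector arithmetic on their ICVs.
icv_other = np.array([1.0, 0.0, 0.0, 0.0])  # hypothetical second ICV
combined = icv_style + icv_other
```

In this sketch, changing `strength` scales how far the latent states move, which mirrors the abstract's point that the effect is controlled by the magnitude of the ICV. With a real model, the arrays would be replaced by hidden states read from and written back to the network's layers.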
