LIMA: Less Is More for Alignment

Abstract

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases; this statistic is as high as 58% when compared to Bard and 65% versus DaVinci003, which was trained with human feedback. Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output.
