Small Models are Valuable Plug-ins for Large Language Models

  • 2023-05-15 18:59:01
  • Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, Julian McAuley
  • 20

Abstract

Large language models (LLMs) such as GPT-3 and GPT-4 are powerful but theirweights are often publicly unavailable and their immense sizes make the modelsdifficult to be tuned with common hardware. As a result, effectively tuningthese models with large-scale supervised data can be challenging. As analternative, In-Context Learning (ICL) can only use a small number ofsupervised examples due to context length limits. In this paper, we proposeSuper In-Context Learning (SuperICL) which allows black-box LLMs to work withlocally fine-tuned smaller models, resulting in superior performance onsupervised tasks. Our experiments demonstrate that SuperICL can improveperformance beyond state-of-the-art fine-tuned models while addressing theinstability problem of in-context learning. Furthermore, SuperICL can enhancethe capabilities of smaller models, such as multilinguality andinterpretability.

 

Quick Read (beta)

loading the full paper ...