LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

Abstract

We present LLaMA-Adapter, a lightweight adaption method to efficiently fine-tune LLaMA into an instruction-following model. Using 52K self-instruct demonstrations, LLaMA-Adapter introduces only 1.2M learnable parameters on top of the frozen LLaMA 7B model, and takes less than one hour to fine-tune on 8 A100 GPUs. Specifically, we adopt a set of learnable adaption prompts and prepend them to the input text tokens at the higher transformer layers. We then propose a zero-init attention mechanism with zero gating, which adaptively injects the new instructional cues into LLaMA while effectively preserving its pre-trained knowledge. With efficient training, LLaMA-Adapter generates high-quality responses, comparable to Alpaca with its fully fine-tuned 7B parameters. Furthermore, our approach can be simply extended to multi-modal input, e.g., images, for image-conditioned LLaMA, which achieves superior reasoning capacity on ScienceQA. We release our code at https://github.com/ZrrSkywalker/LLaMA-Adapter.
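
The abstract does not spell out the gating formula, so the following is only a minimal NumPy sketch of one way a zero-initialized gate over prepended adaption prompts could work. It assumes a single attention head, a scalar gate, and separate softmaxes over the prompt and text positions; the function name `gated_prompt_attention` and all argument names are hypothetical and are not taken from the released code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf map to weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_prompt_attention(q, k_text, v_text, k_prompt, v_prompt, gate):
    """Single-head attention over text tokens plus gated adaption prompts.

    q, k_text, v_text: (T, d) projections of the T text tokens.
    k_prompt, v_prompt: (P, d) projections of the P learnable prompts.
    gate: scalar, assumed to be initialized to 0.0 before training.
    """
    T, d = q.shape
    # Causal scores among the text tokens (the frozen model's usual path).
    scores_text = q @ k_text.T / np.sqrt(d)
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores_text = np.where(future, -np.inf, scores_text)
    # Scores from each text token to the prompts (visible to every token).
    scores_prompt = q @ k_prompt.T / np.sqrt(d)
    # Assumption: the two parts are normalized separately, and only the
    # prompt part is scaled by the gate.
    attn_text = softmax(scores_text)
    attn_prompt = gate * softmax(scores_prompt)
    return attn_text @ v_text + attn_prompt @ v_prompt
```

Under these assumptions, a gate of exactly zero makes the layer's output identical to plain causal attention over the text tokens, which is the sense in which pre-trained behavior is preserved at the start of training; as the gate moves away from zero, the prompts contribute progressively more.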