R-LoRA: Randomized Multi-Head LoRA for Efficient Multi-Task Learning

Abstract

Fine-tuning large language models (LLMs) is computationally expensive, and Low-Rank Adaptation (LoRA) provides a cost-effective solution by approximating weight updates through low-rank matrices. In real-world scenarios, LLMs are fine-tuned on data from multiple domains to perform tasks across various fields, embodying multi-task learning (MTL). LoRA often underperforms in such complex scenarios. To enhance LoRA's capability in multi-task learning, we propose R-LoRA, which incorporates Multi-Head Randomization. Multi-Head Randomization diversifies the head matrices through Multi-Head Dropout and Multi-Head Random Initialization, enabling more efficient learning of task-specific features while maintaining shared knowledge representation. Our approach not only improves performance in MTL but also reduces GPU memory usage and training time. Experiments show that R-LoRA's gains stem from increased diversity in the head matrices, demonstrating its effectiveness for multi-task learning. The code is available at https://github.com/jinda-liu/R-LoRA
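
To make the idea of Multi-Head Randomization concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not specify the architecture, so several details are assumptions made for illustration: a single shared down-projection `A`, several head matrices `B`, an independent dropout mask per head, uniform averaging of head outputs, and the hypothetical names `MultiHeadLoRA`, `delta`, and `p_drop`. The released code at the URL above may differ in all of these respects.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rand_matrix(rows, cols, scale, rng):
    """Matrix with i.i.d. Gaussian entries of standard deviation `scale`."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def dropout(v, p, rng):
    """Inverted dropout: zero each entry with probability p, rescale the rest."""
    if p <= 0.0:
        return list(v)
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in v]


class MultiHeadLoRA:
    """Illustrative low-rank update with one shared A and several heads B_i."""

    def __init__(self, d_in, d_out, rank, n_heads, p_drop=0.1, seed=0):
        rng = random.Random(seed)
        # Assumption: one down-projection shared by all heads.
        self.A = rand_matrix(rank, d_in, 0.02, rng)
        # Multi-head random initialization (as assumed here): each head gets
        # its own random draw, so the heads start out different from each
        # other. Standard LoRA instead zero-initializes its single B matrix.
        self.heads = [rand_matrix(d_out, rank, 0.02, rng) for _ in range(n_heads)]
        self.p_drop = p_drop
        self.rng = rng

    def delta(self, x, train=True):
        """Return the low-rank update applied to input vector x."""
        z = matvec(self.A, x)  # shared intermediate of size `rank`
        out = [0.0] * len(self.heads[0])
        for B in self.heads:
            # Multi-head dropout (as assumed here): every head sees its own
            # independently masked copy of the shared intermediate.
            z_h = dropout(z, self.p_drop, self.rng) if train else z
            y = matvec(B, z_h)
            # Assumption: heads are combined by a uniform average.
            out = [o + yi / len(self.heads) for o, yi in zip(out, y)]
        return out


layer = MultiHeadLoRA(d_in=8, d_out=4, rank=2, n_heads=3, seed=0)
update = layer.delta([1.0] * 8, train=False)
print(len(update))
```

One consequence of randomly initialized heads is that the update is non-zero before any training step, unlike standard LoRA. The abstract does not say how R-LoRA treats this, so the sketch leaves it unaddressed.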
