SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models

Abstract

With the rapid development of large language models (LLMs), fully fine-tuning (FT) these models has become increasingly impractical due to the high computational demands. Additionally, FT can lead to catastrophic forgetting. As an alternative, Low-Rank Adaptation (LoRA) has been proposed, which fine-tunes only a small subset of parameters, achieving similar performance to FT while significantly reducing resource requirements. However, since LoRA inherits FT's design, the issue of catastrophic forgetting remains.

To address these challenges, we propose SECURA: Sigmoid-Enhanced CUR Decomposition LoRA, a novel parameter-efficient fine-tuning (PEFT) variant that mitigates catastrophic forgetting while improving fine-tuning performance. Our method introduces a new normalization technique, SigNorm, to enhance parameter retention and overall performance.

SECURA has been evaluated on a variety of tasks, including mathematical problem-solving (GSM8K), challenging question-answering (CNNDM), translation (NewsDE), and complex multiple-choice reasoning (LogiQA). Experimental results show that SECURA achieves an average fine-tuning improvement of 3.59% across four multiple-choice question (MCQ) tasks and a 2.51% improvement across five question-answering (QA) tasks on models such as Gemma2 2b, Qwen2 1.5b, Qwen2 7b, Llama3 8b, and Llama3.1 8b, compared to DoRA. Moreover, SECURA demonstrates superior knowledge retention capabilities, maintaining more than 70% accuracy on basic LLM knowledge across 16 continual learning tests, outperforming Experience Replay (ER), Sequential Learning (SEQ), EWC, I-LoRA, and CUR-LoRA.
