Parameter-Efficient Fine-Tuning of State Space Models

Abstract

Deep State Space Models (SSMs), such as Mamba (Gu & Dao, 2024), have emerged as powerful tools for language modeling, offering high performance with efficient inference and linear scaling in sequence length. However, the application of parameter-efficient fine-tuning (PEFT) methods to SSM-based models remains largely unexplored. This paper aims to systematically study two key questions: (i) How do existing PEFT methods perform on SSM-based models? (ii) Which modules are most effective for fine-tuning? We conduct an empirical benchmark of four basic PEFT methods on SSM-based models. Our findings reveal that prompt-based methods (e.g., prefix-tuning) are no longer effective, an empirical result further supported by theoretical analysis. In contrast, LoRA remains effective for SSM-based models. We further investigate the optimal application of LoRA within these models, demonstrating both theoretically and experimentally that applying LoRA to linear projection matrices without modifying SSM modules yields the best results, as LoRA is not effective at tuning SSM modules. To further improve performance, we introduce LoRA with Selective Dimension tuning (SDLoRA), which selectively updates certain channels and states on SSM modules while applying LoRA to linear projection matrices. Extensive experimental results show that this approach outperforms standard LoRA.
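
For readers unfamiliar with LoRA, the following is a minimal sketch of the standard low-rank update applied to a single frozen linear projection matrix, W_eff = W + (alpha / r) * B A. It is not the implementation used in this paper: the dimensions, the scaling factor alpha, the initialization, and the helper names `matmul` and `lora_weight` are assumptions chosen for illustration, and the SSM modules and the SDLoRA selection step are not modeled.

```python
import random

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, the effective weight under a LoRA update."""
    r = len(A)                 # rank = number of rows of A
    delta = matmul(B, A)       # low-rank update with the same shape as W
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

random.seed(0)
d_out, d_in, r = 8, 8, 2       # illustrative sizes, not taken from the paper

# Frozen pretrained projection matrix (d_out x d_in); never updated.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A is r x d_in, B is d_out x r.
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]   # zero init, so the update starts at zero

W_eff = lora_weight(W, A, B, alpha=8.0)

trainable = r * (d_in + d_out)  # parameters in A and B
full = d_in * d_out             # parameters in W
```

Because B starts at zero, the effective weight initially equals the pretrained W, and only the r * (d_in + d_out) entries of A and B would be trained rather than all d_in * d_out entries of W.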
