Abstract
Large pre-trained models achieve remarkable performance in vision tasks but are impractical for fine-tuning due to high computational and storage costs. Parameter-Efficient Fine-Tuning (PEFT) methods mitigate this issue by updating only a subset of parameters; however, most existing approaches are task-agnostic and fail to fully exploit task-specific adaptations, which leads to suboptimal efficiency and performance. To address this limitation, we propose Task-Relevant Parameter and Token Selection (TR-PTS), a task-driven framework that enhances both computational efficiency and accuracy. Specifically, we introduce Task-Relevant Parameter Selection, which utilizes the Fisher Information Matrix (FIM) to identify and fine-tune only the most informative parameters in a layer-wise manner, while keeping the remaining parameters frozen. Simultaneously, Task-Relevant Token Selection dynamically preserves the most informative tokens and merges redundant ones, reducing computational overhead. By jointly optimizing parameters and tokens, TR-PTS enables the model to concentrate on task-discriminative information. We evaluate TR-PTS on benchmarks including FGVC and VTAB-1k, where it achieves state-of-the-art performance, surpassing full fine-tuning by 3.40% and 10.35%, respectively. The code is available at https://github.com/synbol/TR-PTS.
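
As a rough illustration of the layer-wise, Fisher-based parameter selection described above, the following minimal sketch scores each parameter by a diagonal empirical Fisher estimate (the mean of squared per-sample gradients) and marks the top-k parameters in each layer as trainable. The diagonal approximation, the fixed per-layer budget `k`, and all function names are assumptions made for illustration; the abstract does not specify these details, and this is not the released TR-PTS implementation.

```python
from typing import Dict, List


def diagonal_fisher(per_sample_grads: List[List[float]]) -> List[float]:
    """Diagonal empirical Fisher estimate: mean of squared per-sample gradients.

    Assumption: a diagonal approximation stands in for the full FIM.
    """
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def select_top_k(scores: List[float], k: int) -> List[bool]:
    """Return a mask that is True for the k highest-scoring parameters."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(len(scores))]


def layerwise_masks(
    grads_by_layer: Dict[str, List[List[float]]], k: int
) -> Dict[str, List[bool]]:
    """Select trainable parameters independently within each layer.

    Assumption: every layer receives the same hypothetical budget k.
    """
    return {
        name: select_top_k(diagonal_fisher(grads), k)
        for name, grads in grads_by_layer.items()
    }


# Toy per-sample gradients: two samples, three parameters per layer.
toy_grads = {
    "layer1": [[0.1, -2.0, 0.3], [0.2, 1.0, -0.1]],
    "layer2": [[3.0, 0.0, 0.5], [-1.0, 0.1, 0.4]],
}
masks = layerwise_masks(toy_grads, k=1)
print(masks)
```

Parameters whose mask entry is False would stay frozen during fine-tuning; only those marked True would receive gradient updates.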