Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models

Abstract

Pre-trained language models (PLMs) are known to be over-parameterized and highly redundant, which suggests that they have only a small number of effective degrees of freedom. Motivated by this observation, we study the problem of re-parameterizing and fine-tuning PLMs from a new perspective: the discovery of intrinsic task-specific subspaces. Specifically, we exploit the dynamics of the fine-tuning process for a given task and learn from the parameter optimization trajectory to uncover the task's intrinsic subspace. A key finding is that PLMs can be fine-tuned effectively in this subspace with a small number of free parameters. In addition, we observe that some outlier dimensions emerge during fine-tuning in the subspace. Disabling these dimensions degrades model performance significantly, which suggests that they are crucial for inducing task-specific knowledge for downstream tasks.
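
The idea of learning a subspace from the optimization trajectory and then fine-tuning inside it can be illustrated with a toy sketch. The abstract does not say how the subspace is extracted, so everything here is an assumption made for illustration: a least-squares loss stands in for the fine-tuning objective, the subspace is taken from an SVD of parameter snapshots, and the names `D`, `d`, `P`, and `z` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, n, steps = 200, 5, 50, 100  # full size, assumed subspace size (illustrative)

# Toy stand-in for a fine-tuning loss: least squares with a rank-d design
# matrix, so every gradient lies in a d-dimensional subspace of R^D.
A = rng.normal(size=(n, d)) @ rng.normal(size=(d, D))
y = rng.normal(size=n)

def loss(theta):
    return 0.5 * np.mean((A @ theta - y) ** 2)

def grad(theta):
    return A.T @ (A @ theta - y) / n

lr = n / np.linalg.norm(A, 2) ** 2   # 1 / (largest Hessian eigenvalue)
theta0 = rng.normal(size=D)          # plays the role of pre-trained weights

# 1) Ordinary fine-tuning in the full space, recording the trajectory.
theta = theta0.copy()
snapshots = []
for _ in range(steps):
    theta = theta - lr * grad(theta)
    snapshots.append(theta - theta0)
loss_full = loss(theta)

# 2) Assumed extraction step: top-d right singular vectors of the
#    trajectory matrix form an orthonormal basis P of shape (D, d).
W = np.stack(snapshots)              # shape (steps, D)
_, _, Vt = np.linalg.svd(W, full_matrices=False)
P = Vt[:d].T

# 3) Re-parameterize theta = theta0 + P @ z and train only z (d numbers).
z = np.zeros(d)
for _ in range(steps):
    z = z - lr * (P.T @ grad(theta0 + P @ z))
loss_sub = loss(theta0 + P @ z)

print(f"full-space loss: {loss_full:.6f}  ({D} free parameters)")
print(f"subspace loss:   {loss_sub:.6f}  ({d} free parameters)")
```

In this toy setting the subspace is exact by construction, so training `z` alone matches full-space training; for a real PLM the subspace would only be approximate and its dimension would be a tuning choice.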
