SubTuning: Efficient Finetuning for Multi-Task Learning

Abstract

Finetuning a pretrained model has become a standard approach for training neural networks on novel tasks, resulting in fast convergence and improved performance. In this work, we study an alternative finetuning method, where instead of finetuning all the weights of the network, we only train a carefully chosen subset of layers, keeping the rest of the weights frozen at their initial (pretrained) values. We demonstrate that subset finetuning (or SubTuning) often achieves accuracy comparable to full finetuning of the model, and even surpasses the performance of full finetuning when training data is scarce. Therefore, SubTuning allows deploying new tasks at minimal computational cost, while enjoying the benefits of finetuning the entire model. This yields a simple and effective method for multi-task learning, where different tasks do not interfere with one another, and yet share most of the resources at inference time. We demonstrate the efficiency of SubTuning across multiple tasks, using different network architectures and pretraining methods.
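
The core idea of the abstract (update only a chosen subset of layers and leave the rest at their pretrained values) can be illustrated with a small, framework-free sketch. Everything named here is hypothetical and for illustration only: the layer names, the weight values, the placeholder gradients, and the helpers `subtune_step` and `trainable_fraction`. The abstract does not say how the subset of layers is chosen, so the selection below is simply assumed.

```python
from copy import deepcopy

# Hypothetical toy "network": layer name -> list of weights.
# These values stand in for pretrained weights; they are not real data.
pretrained = {
    "block1": [0.5, -0.2],
    "block2": [0.1, 0.3],
    "block3": [-0.4, 0.8],
    "head": [0.0, 0.0],
}


def subtune_step(weights, grads, trainable, lr=0.1):
    """One SGD step applied only to layers in `trainable`; others stay frozen."""
    updated = deepcopy(weights)
    for name in trainable:
        updated[name] = [w - lr * g for w, g in zip(weights[name], grads[name])]
    return updated


def trainable_fraction(weights, trainable):
    """Fraction of all parameters that belong to the trainable layers."""
    total = sum(len(w) for w in weights.values())
    chosen = sum(len(weights[name]) for name in trainable)
    return chosen / total


# Placeholder gradients for every layer (assumed, not computed from a loss).
grads = {name: [1.0] * len(w) for name, w in pretrained.items()}

# Assumed task-specific subset; the paper's selection procedure is not shown.
trainable = {"block3", "head"}
finetuned = subtune_step(pretrained, grads, trainable)

# Layers outside the subset keep their pretrained weights, so a second task
# with its own subset could reuse them unchanged at inference time.
frozen = [name for name in pretrained if name not in trainable]
fraction = trainable_fraction(pretrained, trainable)
```

In this toy setup, `frozen` lists the layers that remain identical to the pretrained model, which is what would let several tasks share them, and `fraction` gives the share of parameters that each new task adds.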
