Task Arithmetic Through The Lens Of One-Shot Federated Learning

Abstract

Task Arithmetic is a model merging technique that enables the combination of multiple models' capabilities into a single model through simple arithmetic in the weight space, without the need for additional fine-tuning or access to the original training data. However, the factors that determine the success of Task Arithmetic remain unclear. In this paper, we examine Task Arithmetic for multi-task learning by framing it as a one-shot Federated Learning problem. We demonstrate that Task Arithmetic is mathematically equivalent to the commonly used algorithm in Federated Learning, called Federated Averaging (FedAvg). By leveraging well-established theoretical results from FedAvg, we identify two key factors that impact the performance of Task Arithmetic: data heterogeneity and training heterogeneity. To mitigate these challenges, we adapt several algorithms from Federated Learning to improve the effectiveness of Task Arithmetic. Our experiments demonstrate that applying these algorithms can often significantly boost performance of the merged model compared to the original Task Arithmetic approach. This work bridges Task Arithmetic and Federated Learning, offering new theoretical perspectives on Task Arithmetic and improved practical methodologies for model merging.
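
The equivalence claimed in the abstract can be illustrated with a small numerical sketch. It is not the paper's code: it assumes the standard formulations, in which Task Arithmetic adds scaled task vectors (fine-tuned weights minus pretrained weights) to the pretrained model, and one round of FedAvg with equal client weights averages the locally trained models. The names `theta_0`, `finetuned`, and `lam` are hypothetical, and the toy weights are random vectors rather than real models. Under these assumptions, choosing the scaling coefficient as 1/T for T tasks makes the two merges coincide.

```python
import numpy as np

def task_arithmetic(theta_0, finetuned, lam):
    """Add the scaled sum of task vectors (theta_t - theta_0) to the pretrained weights."""
    task_vectors = [theta_t - theta_0 for theta_t in finetuned]
    return theta_0 + lam * np.sum(task_vectors, axis=0)

def one_shot_fedavg(finetuned):
    """One communication round of FedAvg with equal client weights (assumed):
    the server simply averages the locally trained models."""
    return np.mean(finetuned, axis=0)

# Toy setup: a shared "pretrained" vector and three "fine-tuned" variants of it.
rng = np.random.default_rng(0)
theta_0 = rng.normal(size=8)
finetuned = [theta_0 + rng.normal(scale=0.1, size=8) for _ in range(3)]

# With lam = 1/T the task-arithmetic merge equals the one-shot FedAvg average.
merged_ta = task_arithmetic(theta_0, finetuned, lam=1.0 / len(finetuned))
merged_fa = one_shot_fedavg(finetuned)
print(np.allclose(merged_ta, merged_fa))
```

Other values of `lam` change how far the merged model moves from the pretrained weights along the same averaged direction, which in this sketch plays a role similar to a server-side step size.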
