Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Abstract

Model merging is an efficient empowerment technique in the machine learning community that does not require the collection of raw training data and does not require expensive computation. As model merging becomes increasingly prevalent across various fields, it is crucial to understand the available model merging techniques comprehensively. However, there is a significant gap in the literature regarding a systematic and thorough review of these techniques. This survey provides a comprehensive overview of model merging methods and theories, their applications in various domains and settings, and future research directions. Specifically, we first propose a new taxonomic approach that exhaustively discusses existing model merging methods. Secondly, we discuss the application of model merging techniques in large language models, multimodal large language models, and 10+ machine learning subfields, including continual learning, multi-task learning, few-shot learning, etc. Finally, we highlight the remaining challenges of model merging and discuss future research directions. A comprehensive list of papers about model merging is available at https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications.
