Abstract
Unlike existing federated fine-tuning (FFT) methods for foundation models, hybrid heterogeneous federated fine-tuning (HHFFT) is an under-explored scenario in which clients exhibit double heterogeneity, in both model architectures and downstream tasks. This hybrid heterogeneity introduces two significant challenges: 1) heterogeneous matrix aggregation, where clients adopt different large-scale foundation models based on their task requirements and resource limitations, leading to dimensional mismatches during LoRA parameter aggregation; and 2) multi-task knowledge interference, where local shared parameters, trained with both task-shared and task-specific knowledge, cannot ensure that only task-shared knowledge is transferred between clients. To address these challenges, we propose H2Tune, a federated foundation model fine-tuning framework for hybrid heterogeneity. H2Tune consists of three key components: (i) sparsified triple matrix decomposition, which aligns hidden dimensions across clients by constructing rank-consistent middle matrices, with adaptive sparsification based on client resources; (ii) relation-guided matrix layer alignment, which handles heterogeneous layer structures and representation capabilities; and (iii) an alternating task-knowledge disentanglement mechanism, which decouples the shared and specific knowledge in local model parameters through alternating optimization. Theoretical analysis proves a convergence rate of O(1/\sqrt{T}). Extensive experiments show that our method achieves up to 15.4% accuracy improvement compared to state-of-the-art baselines. Our code is available at https://anonymous.4open.science/r/H2Tune-1407.