No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces
Abstract
Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper, we investigate the key characteristics of task matrices -- weight update matrices applied to a pre-trained model -- that enable effective merging. We show that alignment between singular components of task-specific and merged matrices strongly correlates with performance improvement over the pre-trained model. Based on this, we propose an isotropic merging framework that flattens the singular value spectrum of task matrices, enhances alignment, and reduces the performance gap. Additionally, we incorporate both common and task-specific subspaces to further improve alignment and performance. Our proposed approach achieves state-of-the-art performance across multiple scenarios, including various sets of tasks and model scales. This work advances the understanding of model merging dynamics, offering an effective methodology to merge models without requiring additional training. Code is available at https://github.com/danielm1405/iso-merging.
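The core idea in the abstract -- summing task matrices and flattening the singular value spectrum of the result -- can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' reference implementation; the function name `isotropic_merge` and the choice of the mean singular value as the flattened spectrum are assumptions for the sketch, and details such as per-layer handling or the common/task-specific subspace decomposition are omitted.

```python
import numpy as np

def isotropic_merge(task_matrices):
    """Merge task matrices (weight deltas w.r.t. a pre-trained model)
    and flatten the singular value spectrum of the merged update.

    Note: a hypothetical sketch of isotropic merging, not the paper's
    reference code."""
    # Sum the task-specific weight updates into one merged update.
    merged = np.sum(task_matrices, axis=0)
    # Decompose the merged update into its singular components.
    U, S, Vt = np.linalg.svd(merged, full_matrices=False)
    # Flatten the spectrum: replace every singular value with the mean,
    # so the update is isotropic across its singular directions.
    S_iso = np.full_like(S, S.mean())
    return U @ np.diag(S_iso) @ Vt
```

Because `U` and `Vt` have orthonormal columns and rows, the reconstructed matrix has a uniform singular value spectrum by construction, while preserving the singular directions of the summed task updates.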
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Tint Your Models Task-wise for Improved Multi-task Model Merging (2024)
- Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent (2025)
- Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging (2025)
- Parameter-Efficient Interventions for Enhanced Model Merging (2024)
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment (2024)
- Adapter Merging with Centroid Prototype Mapping for Scalable Class-Incremental Learning (2024)
- Training-free Heterogeneous Model Merging (2024)