Training large-scale models is computationally intensive and often constrained by the availability of labeled data. Model merging offers a compelling alternative by directly integrating the weights of multiple source models, without requiring additional data or extensive training. However, conventional merging techniques such as parameter averaging suffer from the unintentional merging of non-generalizable features, especially in non-IID scenarios where source models exhibit significant weight disparities. Model ensembling, which aggregates multiple models by averaging their outputs, typically delivers more stable and superior performance, but it incurs higher inference costs and increased storage requirements. Previous studies have demonstrated similarities between model merging and ensembling experimentally, yet theoretical evidence and evaluation metrics are still lacking. To bridge this gap, we introduce M-loss, a novel evaluation metric that quantifies the compatibility of merging source models using only unlabeled data. By measuring the discrepancy between parameter averaging and model ensembling at both the layer and node levels, M-loss facilitates more effective merging strategies. Specifically, M-loss serves both as a quantitative criterion for the theoretical feasibility of model merging and as a guide to parameter significance in model pruning strategies. Our theoretical analysis and empirical evaluations demonstrate that incorporating M-loss into the merging process significantly improves the alignment between merged models and model ensembles, offering a scalable and efficient framework for accurate model consolidation.
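To make the core idea concrete, here is a minimal NumPy sketch of the layer- and node-level discrepancy between parameter averaging and output averaging for a single ReLU layer. The function name `m_loss` and the mean-squared form of the discrepancy are illustrative assumptions, not the paper's exact definition; the sketch only assumes that M-loss compares the merged layer's outputs against the ensembled outputs on unlabeled inputs.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def m_loss(W1, W2, X):
    """Illustrative M-loss sketch for one ReLU layer (exact form assumed).

    Compares the output of the parameter-averaged layer against the
    average of the two source layers' outputs, on unlabeled inputs X
    of shape (n_samples, d_in). Returns a layer-level scalar and a
    per-node (per-output-unit) breakdown.
    """
    merged = relu(X @ ((W1 + W2) / 2).T)                # model merging
    ensemble = (relu(X @ W1.T) + relu(X @ W2.T)) / 2    # model ensembling
    per_node = np.mean((merged - ensemble) ** 2, axis=0)  # node level
    return per_node.sum(), per_node                       # layer level

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))     # unlabeled data: no labels needed
W1 = rng.normal(size=(4, 8))      # two source models' layer weights
W2 = rng.normal(size=(4, 8))

layer_loss, node_losses = m_loss(W1, W2, X)   # > 0: ReLU is nonlinear
zero_loss, _ = m_loss(W1, W1, X)              # identical models merge losslessly
```

Note that for a purely linear layer the two quantities coincide, so the discrepancy is driven entirely by the nonlinearity interacting with weight disparities; identical source models always yield zero loss, matching the intuition that merging is exactly feasible when the models agree.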