Papers
arxiv:2505.14009

Activation-Guided Consensus Merging for Large Language Models

Published on Nov 14, 2025
Abstract

Activation-Guided Consensus Merging (ACM) is a plug-and-play framework that determines layer-specific merging coefficients based on activation mutual information, effectively preserving task-specific capabilities without requiring additional training or gradient computations.

AI-generated summary

Recent research has increasingly focused on reconciling the reasoning capabilities of System 2 with the efficiency of System 1. While existing training-based and prompt-based approaches face significant challenges in terms of efficiency and stability, model merging emerges as a promising strategy to integrate the diverse capabilities of different Large Language Models (LLMs) into a unified model. However, conventional model merging methods often assume uniform importance across layers, overlooking the functional heterogeneity inherent in neural components. To address this limitation, we propose Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients based on mutual information between activations of pre-trained and fine-tuned models. ACM effectively preserves task-specific capabilities without requiring gradient computations or additional training. Extensive experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods. For instance, in the case of Qwen-7B models, TIES-Merging equipped with ACM achieves a 55.3% reduction in response length while simultaneously improving reasoning accuracy by 1.3 points.
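To illustrate the core idea, here is a minimal sketch of activation-guided layer-wise merging: estimate mutual information (MI) between each layer's activations in the pre-trained and fine-tuned models, map it to a per-layer coefficient, and take a weighted average of the weights. This is NOT the paper's exact formulation; the histogram-based MI estimator, the `sigmoid` mapping from MI to a coefficient, and all function names are illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in (histogram) MI estimate between two flattened activation samples.

    This simple estimator is an assumption for illustration; ACM may use a
    different MI estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                      # joint distribution
    px = pxy.sum(axis=1, keepdims=True)            # marginal of x
    py = pxy.sum(axis=0, keepdims=True)            # marginal of y
    nz = pxy > 0                                   # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def acm_style_merge(base_weights, ft_weights, base_acts, ft_acts, lam=1.0):
    """Merge per-layer weights using an activation-MI-derived coefficient.

    base_weights / ft_weights: dict of layer name -> weight array.
    base_acts / ft_acts: dict of layer name -> activation samples on shared
    calibration inputs. The sigmoid mapping below is a hypothetical choice,
    not the paper's formula: it keeps each coefficient in (0, 1) with no
    gradient computation or extra training.
    """
    merged = {}
    for name in base_weights:
        mi = mutual_information(base_acts[name].ravel(), ft_acts[name].ravel())
        coef = 1.0 / (1.0 + np.exp(-lam * mi))     # layer-specific coefficient
        merged[name] = coef * ft_weights[name] + (1.0 - coef) * base_weights[name]
    return merged
```

Because the coefficient is computed per layer, layers whose activations diverge strongly between the two models can be weighted differently from layers that stayed close to the pre-trained behavior, in contrast to merging methods that apply one uniform coefficient everywhere.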

Get this paper in your agent:

hf papers read 2505.14009
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
