A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models
Paper • 2605.08504 • Published • 3
This page hosts the WeMask model; the accompanying GitHub repository is ME-Layer. We use Qwen-3-VL-4B as the foundation model for SFT and RL training. You can follow the guidelines in that repository to get started with testing and training.
If you find our research helpful, please cite:
@inproceedings{me_layer_2026,
  title={A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models},
  author={Your Name and Co-authors},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year={2026}
}