A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models
Paper • 2605.08504 • Published • 3
This page hosts the WeMask model; the accompanying GitHub repository is ME-Layer. We use Qwen-3-VL-4B as the foundation model for SFT and RL training. You can follow the guidelines in that repository to get started with testing and training.
If you find our research helpful, please cite:
@inproceedings{me_layer_2026,
  title={A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models},
  author={Your Name and Co-authors},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year={2026}
}