arxiv:2504.04823
SunYuxuan
snowdusky
AI & ML interests
None yet
Recent Activity
upvoted a paper 29 days ago
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
upvoted a paper about 1 month ago
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression
upvoted a paper 3 months ago
Scaling Embeddings Outperforms Scaling Experts in Language Models
Organizations
None yet