view article Article Exploring Direct Tensor Manipulation in Language Models: A Case Study in Binary-Level Model Enhancement TensorSlay • Nov 7, 2025 • 4
view article Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.12k
view article Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch AviSoori1x • May 7, 2024 • 119