EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation Paper • 2603.18739 • Published Mar 19 • 11
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 221
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 109
view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 258
view article Article SmolVLM Grows Smaller – Introducing the 256M & 500M Models! +1 andito, mfarre, merve • Jan 23, 2025 • 192
view article Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 212
MUAD: Multiple Uncertainties for Autonomous Driving, a benchmark for multiple uncertainty types and tasks Paper • 2203.01437 • Published Mar 2, 2022 • 1