GigaWorld-Policy: An Efficient Action-Centered World-Action Model Paper • 2603.17240 • Published Mar 18 • 26
VisionThink Collection Efficient Reasoning Vision Language Model • 7 items • Updated Jul 18, 2025 • 7
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13, 2025 • 216
Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 213