Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling • Paper • 2605.13301 • Published 4 days ago • 137
MinT: Managed Infrastructure for Training and Serving Millions of LLMs • Paper • 2605.13779 • Published 4 days ago • 205
Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning • Paper • 2605.06326 • Published 10 days ago • 24
Diversity-Incentivized Exploration for Versatile Reasoning • Paper • 2509.26209 • Published Sep 30, 2025 • 17
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning • Paper • 2509.20712 • Published Sep 25, 2025 • 20