\$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published Mar 9 • 27 • 4
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Paper • 2506.08672 • Published Jun 10, 2025 • 30 • 4
LM-Lexicon: Improving Definition Modeling via Harmonizing Semantic Experts Paper • 2602.14060 • Published Feb 15 • 2 • 3
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 79 • 4