LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling • Paper • 2606.18023 • Published 4 days ago • 142
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection • Paper • 2605.30288 • Published 22 days ago • 23
InCoder-32B: Code Foundation Model for Industrial Scenarios • Paper • 2603.16790 • Published Mar 17 • 312
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem • Paper • 2512.24873 • Published Dec 31, 2025 • 109
Part II: ROLL Flash -- Accelerating RLVR and Agentic Training with Asynchrony • Paper • 2510.11345 • Published Oct 13, 2025 • 17
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published Feb 20, 2025 • 110
ProgCo: Program Helps Self-Correction of Large Language Models • Paper • 2501.01264 • Published Jan 2, 2025 • 26
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models • Paper • 2410.11710 • Published Oct 15, 2024 • 20