Rethinking OPD Collection This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 5 items • Updated 8 days ago • 3
Draft-OPD: On-Policy Distillation for Speculative Draft Models Paper • 2605.29343 • Published 14 days ago • 33
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe Paper • 2509.18154 • Published Sep 16, 2025 • 62
Post-Trained MoE Can Skip Half Experts via Self-Distillation Paper • 2605.18643 • Published 24 days ago • 30
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published 29 days ago • 159
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 29 days ago • 219
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling Paper • 2605.08083 • Published May 8 • 69
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL Paper • 2604.28123 • Published May 1 • 49
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper • 2604.27393 • Published Apr 30 • 78
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published May 7 • 112