MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling Paper • 2606.13473 • Published 7 days ago • 88
Running on CPU Upgrade Featured 3.21k The Smol Training Playbook 📚 3.21k The secrets to building world-class LLMs
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 87
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding Paper • 2505.16990 • Published May 22, 2025 • 22
REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration Paper • 2510.01879 • Published Oct 2, 2025 • 8
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning? Paper • 2510.06036 • Published Oct 7, 2025 • 7
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning? Paper • 2510.06036 • Published Oct 7, 2025 • 7 • 2