InterleaveThinker: Reinforcing Agentic Interleaved Generation Paper • 2606.13679 • Published 7 days ago • 79
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling Paper • 2606.13473 • Published 7 days ago • 89
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 9 days ago • 41
Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents Paper • 2605.10832 • Published May 11 • 22
4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding Paper • 2605.05997 • Published May 7 • 18
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published May 6 • 103