SaaSBench: Exploring the Boundaries of Coding Agents in Long-Horizon Enterprise SaaS Engineering Paper • 2605.17526 • Published 22 days ago • 7
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation Paper • 2605.22355 • Published 18 days ago • 177
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos Paper • 2605.18233 • Published 21 days ago • 92
MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation Paper • 2512.18181 • Published May 7 • 86
Visually-Guided Policy Optimization for Multimodal Reasoning Paper • 2604.09349 • Published Apr 10 • 2
SpatialGenEval Collection [ICLR 2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models • 1 item • Updated May 7
VGPO-RL Collection [ACL 2026] Visually-Guided Policy Optimization for Multimodal Reasoning • 3 items • Updated May 7
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 189
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics Paper • 2604.17295 • Published Apr 19 • 84
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published Apr 17 • 73