Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games Paper • 2606.19338 • Published 2 days ago • 42
JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence Paper • 2606.14777 • Published 9 days ago • 189
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 8 days ago • 101
CoVEBench: Can Video Editing Models Handle Complex Instructions? Paper • 2606.08415 • Published 12 days ago • 48
Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories Paper • 2606.02060 • Published 18 days ago • 54
SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing Paper • 2604.19587 • Published Apr 21 • 48
LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV Paper • 2605.26244 • Published 25 days ago • 38
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo Paper • 2605.16257 • Published May 15 • 54
MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents Paper • 2605.09530 • Published May 10 • 148
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published May 7 • 235
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published Apr 20 • 46
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper • 2604.18486 • Published Apr 20 • 95
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 86
DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published Apr 16 • 36
Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization Paper • 2604.12290 • Published Apr 14 • 16