OpenComputer: Verifiable Software Worlds for Computer-Use Agents • Paper • 2605.19769 • Published 20 days ago • 81
CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition • Paper • 2605.19995 • Published 20 days ago • 34
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? • Paper • 2605.22109 • Published 18 days ago • 169
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining • Paper • 2605.14747 • Published 25 days ago • 145
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation • Paper • 2605.11739 • Published 26 days ago • 59
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence • Paper • 2605.12882 • Published 26 days ago • 270
Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs • Paper • 2605.07447 • Published May 8 • 4