Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 15 days ago • 121
LVSA: Training-Free Sparse Attention for Long Video Diffusion Paper • 2605.31057 • Published 20 days ago • 14
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 28 days ago • 170
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 246
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published Mar 25 • 183
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 249