Spatial-DISE: A Unified Benchmark for Evaluating Spatial Reasoning in Vision-Language Models Paper • 2510.13394 • Published Oct 15, 2025
PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors Paper • 2605.06455 • Published May 7
So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection Paper • 2505.18660 • Published May 24, 2025