Toward Generalist Autonomous Research via Hypothesis-Tree Refinement Paper • 2606.11926 • Published 10 days ago • 112
AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery Paper • 2604.25256 • Published Apr 28 • 30
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling Paper • 2509.23909 • Published Sep 28, 2025 • 34