AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints Paper • 2606.05622 • Published 15 days ago • 41
Advancing Creative Physical Intelligence in Large Multimodal Models Paper • 2605.26396 • Published 25 days ago • 19
Useful Memories Become Faulty When Continuously Updated by LLMs Paper • 2605.12978 • Published May 13 • 19
CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing Paper • 2605.02910 • Published May 6 • 22
PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning Paper • 2601.11957 • Published Jan 28 • 3
MedSAM3: Delving into Segment Anything with Medical Concepts Paper • 2511.19046 • Published Nov 24, 2025 • 55
Where LLM Agents Fail and How They can Learn From Failures Paper • 2509.25370 • Published Sep 29, 2025 • 12