Deng Benyong

Watcher12

4

AI & ML interests

None yet

Organizations

None yet

Recent Activity
upvoted a paper 9 days ago

PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems

Paper • 2606.22388 • Published 11 days ago • 96

upvoted a paper 27 days ago

AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

Paper • 2606.05622 • Published 28 days ago • 44

upvoted a paper 5 months ago

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published Jan 16 • 31

upvoted a paper 8 months ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4, 2025 • 23