arxiv:2605.28556
Yotam Perlitz
per
AI & ML interests
None yet
Recent Activity
authored a paper 2 days ago
DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
authored a paper 2 days ago
CLEAR: Error Analysis via LLM-as-a-Judge Made Easy
authored a paper 2 days ago
General Agent Evaluation