Aman Singh Thakur

singh96aman

AI & ML interests

Responsible AI and Classical Machine Learning

Recent Activity

upvoted an article 1 day ago

𝗝𝘂𝗱𝗴𝗶𝗻𝗴 𝘁𝗵𝗲 𝗝𝘂𝗱𝗴𝗲𝘀: 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗻𝗴 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁 𝗮𝗻𝗱 𝗩𝘂𝗹𝗻𝗲𝗿𝗮𝗯𝗶𝗹𝗶𝘁𝗶𝗲𝘀 𝗶𝗻 𝗟𝗟𝗠𝘀-𝗮𝘀-𝗝𝘂𝗱𝗴𝗲𝘀

commented on a paper almost 2 years ago

Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

reacted to their post with 🔥 almost 2 years ago

𝗝𝘂𝗱𝗴𝗶𝗻𝗴 𝘁𝗵𝗲 𝗝𝘂𝗱𝗴𝗲𝘀: 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗻𝗴 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁 𝗮𝗻𝗱 𝗩𝘂𝗹𝗻𝗲𝗿𝗮𝗯𝗶𝗹𝗶𝘁𝗶𝗲𝘀 𝗶𝗻 𝗟𝗟𝗠𝘀-𝗮𝘀-𝗝𝘂𝗱𝗴𝗲𝘀
https://huggingface.co/papers/2406.12624

𝐂𝐚𝐧 𝐋𝐋𝐌𝐬 𝐬𝐞𝐫𝐯𝐞 𝐚𝐬 𝐫𝐞𝐥𝐢𝐚𝐛𝐥𝐞 𝐣𝐮𝐝𝐠𝐞𝐬 ⚖️?

We aim to identify the right metrics for evaluating Judge LLMs and to understand their sensitivity to prompt guidelines, engineering, and specificity. With this paper, we want to caution ⚠️ against blindly using LLMs as a human proxy.

Blog - https://huggingface.co/blog/singh96aman/judgingthejudges
Arxiv - https://arxiv.org/abs/2406.12624
Tweet - https://x.com/iamsingh96aman/status/1804148173008703509

@singh96aman @kartik727 @Srinik-1 @sankaranv @dieuwkehupkes

Organizations

None yet

published an article almost 2 years ago

Article

𝗝𝘂𝗱𝗴𝗶𝗻𝗴 𝘁𝗵𝗲 𝗝𝘂𝗱𝗴𝗲𝘀: 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗻𝗴 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁 𝗮𝗻𝗱 𝗩𝘂𝗹𝗻𝗲𝗿𝗮𝗯𝗶𝗹𝗶𝘁𝗶𝗲𝘀 𝗶𝗻 𝗟𝗟𝗠𝘀-𝗮𝘀-𝗝𝘂𝗱𝗴𝗲𝘀

singh96aman • Jun 24, 2024 • 2