arxiv:2605.31170
Federico Torrielli
EvilScript
AI & ML interests
AI Safety & Mechanistic interpretability
Recent Activity
upvoted a paper about 3 hours ago
LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs
new activity 3 days ago
aisilab/moltbook-files-new-language-signals: Add paper link, GitHub repository, and task category
authored a paper 4 days ago
Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion