Love seeing the Esper line keep shipping in the open. Curious how Esper 4 holds its general capabilities after stacking Titanium + Mitakihara + Tachibana — specializing that hard is usually where sequential fine-tunes start drifting on everything else. That retention problem is what we've been heads-down on at ModelBrew (modelbrew.ai); the multi-set diversity you're using is a real part of what helps. Grabbing these, thanks for open-sourcing.
Kiran N PRO
Fourwheels2512
Recent Activity
commented on a paper 3 days ago
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
commented on a paper 3 days ago
GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models
commented on a paper 3 days ago
Mechanistic Analysis of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning