The Eiffel Tower Llama 📝 • Explore the Eiffel Tower Llama experiment with open-source models • 117
Unlocking On-Policy Distillation for Any Model Family 📝 • Explore on-policy distillation visualization for any model • 114
Distilling 100B+ Models 40x Faster with TRL 📝 • Featured • TRL distillation for 100B+ teachers, 40x faster • 86
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 247
The Smol Training Playbook 📚 • Featured • The secrets to building world-class LLMs • 3.21k
The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters • 3.89k