The Ultra-Scale Playbook 🌌 • Running • 3.87k • The ultimate guide to training LLM on large GPU Clusters
principled-intelligence/gemma-4-E2B-it-text-only Feature Extraction • 5B • Updated Apr 3 • 2.6k • 6
Qwen2.5 Collection • Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 725
microsoft/Phi-3-mini-128k-instruct Text Generation • 4B • Updated Dec 10, 2025 • 248k • 1.7k
meta-llama/Meta-Llama-3-8B-Instruct Text Generation • 8B • Updated Jun 18, 2025 • 1.47M • 4.58k
CohereLabs/c4ai-command-r-plus-4bit Text Generation • 105B • Updated Apr 16, 2025 • 3.89k • 261
mistralai/Mistral-7B-Instruct-v0.2 Text Generation • 7B • Updated Jul 24, 2025 • 1.88M • 3.16k
TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ Text Generation • 47B • Updated Dec 14, 2023 • 5.43k • 141