IMU-1: Sample-Efficient Pre-training of Small Language Models Paper • 2602.02522 • Published Jan 25 • 9
Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning Paper • 2605.14386 • Published 28 days ago • 61
Jackrong/Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled-GGUF Text Generation • 2B • Updated Mar 15 • 22.7k • 167
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 134k • 2.88k
Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better Paper • 2602.05393 • Published Feb 5 • 9
Nanbeige4-3B Technical Report: Exploring the Frontier of Small Language Models Paper • 2512.06266 • Published Dec 6, 2025 • 8
DavidAU/ERNIE-4.5-37B-A3B-Thinking-Brainstorm20x Text Generation • 37B • Updated Sep 17, 2025 • 7 • 5