Nemotron-Labs-Diffusion Collection A Tri-Mode Language Model Family Unifying Autoregressive, Diffusion, and Self-Speculation Decoding • 7 items • Updated 3 days ago • 48
Proven REAPs Collection Benchmarked REAP checkpoints with >=500 all-time downloads. GLM/Qwen/MiniMax/DeepSeek/Kimi/gemma. • 20 items • Updated 7 days ago • 10
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 23 items • Updated 1 day ago • 315
Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 3 days ago • 157
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 52 items • Updated 1 day ago • 151
Article The Transformers Library: standardizing model definitions +2 lysandre, ArthurZ, pcuenq, julien-c • May 15, 2025 • 123
Article Welcome Falcon Mamba: The first strong attention-free 7B model +4 JingweiZuo, yellowvm, DhiyaEddine, IChahed, ybelkada, Gkunsch • Aug 12, 2024 • 113