KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks Paper • 2606.03458 • Published 3 days ago • 46
mlx-community/NVIDIA-Nemotron-3-Nano-4B-OptiQ-4bit Text Generation • 0.8B • Updated about 17 hours ago • 101