KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks Paper • 2606.03458 • Published 5 days ago • 53 • 10
Negligible in Size, Significant in Effect: On Scale Vectors in Large Language Models Paper • 2605.26895 • Published 12 days ago • 20 • 3
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond Paper • 2605.19660 • Published 19 days ago • 40 • 3
$δ$-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 26 days ago • 125 • 5
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 26 days ago • 195 • 4
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published 25 days ago • 159 • 4
Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion Paper • 2605.12825 • Published 26 days ago • 12 • 2
Reliable Chain-of-Thought via Prefix Consistency Paper • 2605.07654 • Published about 1 month ago • 1 • 3
Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs Paper • 2605.12460 • Published 26 days ago • 17 • 2
PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks Paper • 2605.10977 • Published 29 days ago • 10 • 2
LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models Paper • 2605.11011 • Published 28 days ago • 9 • 2
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States Paper • 2605.07579 • Published about 1 month ago • 18 • 3
SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting Paper • 2605.07243 • Published about 1 month ago • 4 • 3