GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding Paper • 2605.15250 • Published May 14 • 13
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention Paper • 2603.28458 • Published Mar 30 • 44