Long Live The Balance: Information Bottleneck Driven Tree-based Policy Optimization Paper • 2605.28109 • Published 20 days ago • 23
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning Paper • 2603.10160 • Published Mar 10 • 26 • 4
MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models Paper • 2603.04800 • Published Mar 5 • 25
D-CORE: Incentivizing Task Decomposition in Large Reasoning Models for Complex Tool Use Paper • 2602.02160 • Published Feb 2 • 14 • 9
Gorilla: Large Language Model Connected with Massive APIs Paper • 2305.15334 • Published May 24, 2023 • 7