liyaxuan

lllyx

AI & ML interests

None yet

Organizations

None yet

updated a collection 8 days ago

Rethinking OPD

Collection • 5 items • Updated 8 days ago • 3

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip…"

updated a model 8 days ago

lllyx/Qwen3-1.7B-Base-OPD

Text Generation • 2B • Updated 8 days ago • 71

published a model 8 days ago

lllyx/Qwen3-1.7B-Base-OPD

Text Generation • 2B • Updated 8 days ago • 71

upvoted 2 papers 8 days ago

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Paper • 2605.29343 • Published 14 days ago • 33

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 62

upvoted a paper 14 days ago

Rubric-based On-policy Distillation

Paper • 2605.07396 • Published May 8 • 41

liked a model 16 days ago

openbmb/MiniCPM5-1B

Text Generation • 1B • Updated 16 days ago • 137k • 795

upvoted a paper 23 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 24 days ago • 30

upvoted a paper 24 days ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 28 days ago • 111

upvoted a paper 27 days ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published 29 days ago • 159

upvoted a paper 28 days ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Paper • 2605.13779 • Published 29 days ago • 219

updated a collection 30 days ago

Rethinking OPD

Collection

updated a dataset 30 days ago

lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 30 days ago • 305k • 195 • 2

updated a model 30 days ago

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated 30 days ago • 556 • 4

published a dataset 30 days ago

lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 30 days ago • 305k • 195 • 2

upvoted a collection about 1 month ago

Rethinking OPD

Collection

upvoted 4 papers about 1 month ago

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Paper • 2605.08083 • Published May 8 • 69

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

Paper • 2604.28123 • Published May 1 • 49

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

Paper • 2604.27393 • Published Apr 30 • 78

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Paper • 2605.06130 • Published May 7 • 112
