arxiv:2502.09100
Zhizhang Fu
HarryFu
models 11
HarryFu/Qwen2.5-1.5B-GRPO-LoG
Text Generation • 2B • Updated
HarryFu/Qwen2.5-3B-GRPO-0819
Updated
HarryFu/Qwen2.5-3B-GRPO-600-fzz
3B • Updated • 1
HarryFu/Qwen2.5-3B-GRPO-0725
3B • Updated
HarryFu/Qwen2.5-3B-SFT-GRPO-0725
3B • Updated
HarryFu/Qwen2.5-3B-SFT-GRPO-0724
Updated
HarryFu/Qwen2.5-3B-SFT-GRPO
3B • Updated • 1
HarryFu/Qwen2.5-3B-GRPO
3B • Updated • 2
HarryFu/Qwen2.5-3B-Distill
3B • Updated • 3
HarryFu/Qwen2.5-3B-Distill-GRPO
Updated
datasets 0
None public yet