X

Phoebe13

AI & ML interests

None yet

Recent Activity

updated a model 12 days ago

Phoebe13/Video-MTR

upvoted a paper 8 months ago

Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards

authored a paper 9 months ago

Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding

Organizations

None yet

updated a model 12 days ago

Phoebe13/Video-MTR

Visual Question Answering • 8B • Updated 12 days ago • 26 • 7

upvoted a paper 8 months ago

Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards

Paper • 2509.24981 • Published Sep 29, 2025 • 29

authored a paper 9 months ago

Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding

Paper • 2508.20478 • Published Aug 28, 2025 • 18

upvoted a paper 9 months ago

Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding

Paper • 2508.20478 • Published Aug 28, 2025 • 18

published a model 10 months ago

Phoebe13/Video-MTR

Visual Question Answering • 8B • Updated 12 days ago • 26 • 7

published 12 models about 1 year ago

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.5_30k_stage234_v1.2_ev_handcomp_simple_with_handtype

Updated Apr 20, 2025

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.5_30k_stage234_v1.2_ev_handcomp_simple

Updated Apr 8, 2025

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.5_30k_stage234_ev_handcomp_simple

Updated Apr 8, 2025

Phoebe13/Qwen-2.5-7B-Instruct_Explore0.25_12k_stage234_ev_handcomp_simple

Updated Apr 7, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-30k_stage234_ev-by-handcomp-simple

Updated Mar 31, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-30k_stage1234_ev-by-handcomp-simple

Updated Mar 31, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-16k_stage234_ev-by-handcomp-simple

Updated Mar 30, 2025

Phoebe13/Qwen-2.5-7B-Instruct-Poker-ev-by-handcomp-simple

Updated Mar 30, 2025

Phoebe13/Qwen-2.5-7B-Poker-RL-StrictFormat-ev-by-handcomp-simple

Updated Mar 28, 2025

Phoebe13/Qwen-2.5-7B-Poker-RL-StrictFormat-ev-by-handcomp

Updated Mar 28, 2025

Phoebe13/Qwen-2.5-7B-Poker-RL-StrictFormat

Updated Mar 24, 2025

Phoebe13/Qwen-2.5-7B-Poker-RL

Updated Mar 24, 2025

published 2 models over 1 year ago

Phoebe13/DeepSeek-R1-Distill-Qwen-1.5B-GRPO

Updated Feb 20, 2025

Phoebe13/Qwen-2.5-7B-Simple-RL

Updated Feb 20, 2025