SeanWang0027
AI & ML interests
LLM Post-Training
Recent Activity
published a dataset 1 day ago: CL-From-Nothing/rose_code
updated a dataset 1 day ago: CL-From-Nothing/rose_code
published a dataset 3 days ago: CL-From-Nothing/rlve_rose_initial
Continual-SFT-Olmo
This collection contains the SFT-ed Olmo models and some models built on top of them.
- SeanWang0027/olmo-7b-synlogic-survo-sft
  7B • Updated • 1
- SeanWang0027/olmo-7b-synlogic-survo-space_reasoning-sft
  7B • Updated • 1
- SeanWang0027/olmo-7b-synlogic-survo-space_reasoning-math_path-sft
  7B • Updated • 1
- SeanWang0027/sci-10k-olmo-7b-synlogic-survo-space_reasoning-math_path-sft
  7B • Updated • 1
Curriculum-RL
This collection contains curriculum-RLed Olmo models.
models 57
SeanWang0027/rl_warm_up_physics_1K_ROSE-parquet_qwen3-8b_epoch_3_mask
8B • Updated • 17
SeanWang0027/polaris_warmup_polaris
Updated
SeanWang0027/polaris_warmup_polaris_ROSE_warmup_Qwen3_1_7B_40K-parquet_qwen3-1.7b_epoch_1_mask
2B • Updated • 19
SeanWang0027/token_reward_direct_math_hard_509
Updated
SeanWang0027/rl_warm_up_0519
Updated
SeanWang0027/polaris_warmup_polaris_offline_40K-parquet_qwen3-1.7b_epoch_1_mask
Updated
SeanWang0027/polaris_warmup_polaris_warmup_40K-parquet_qwen3-4b_epoch_1_mask
4B • Updated • 13
SeanWang0027/token_reward_direct
Updated
SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask_k4096
Updated
SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask
Updated
datasets 40
SeanWang0027/physics_hard_questions
Updated • 27
SeanWang0027/polaris_hard_POPE
Updated • 19
SeanWang0027/polaris_hard_sampling
Viewer • Updated • 15.4k • 32
SeanWang0027/polaris_Qwen3-1_7B_prefix_4K
Viewer • Updated • 14.4k • 34
SeanWang0027/polaris_pope_prefix_40K
Viewer • Updated • 40k • 33
SeanWang0027/polaris_offline_40K
Viewer • Updated • 40k • 34
SeanWang0027/polaris_warmup_40K
Viewer • Updated • 40k • 38
SeanWang0027/polaris_hard
Viewer • Updated • 15.4k • 49
SeanWang0027/math_pope_mix_1018
Viewer • Updated • 1.02k • 25
SeanWang0027/sft_full_math_hard_9000
Viewer • Updated • 9k • 28