MoE reward model project

Recent Activity

zyhang1998 updated a dataset 18 days ago

MoeReward/combined_rlhf_dataset_grpo_imdb_main_2K

zyhang1998 published a dataset 18 days ago

MoeReward/combined_rlhf_dataset_grpo_imdb_main_2K

zyhang1998 updated a dataset 18 days ago

MoeReward/combined_rlhf_dataset_grpo_metamath_main_2K

Models (6)

MoeReward/rl_checkpoints

MoeReward/lora_checkpoint

MoeReward/reward_lora_qwen_1_5_base

Updated Mar 21 • 1

MoeReward/reward_qwen_1_5

Updated Mar 17 • 2

MoeReward/reward_lora_qwen_1_5

Updated Mar 17 • 1

MoeReward/sft_full_param_qwen_1_5

Updated Mar 16 • 1

Datasets (54)

MoeReward/combined_rlhf_dataset_grpo_imdb_main_2K

Viewer • Updated 18 days ago • 2k • 38

MoeReward/combined_rlhf_dataset_grpo_metamath_main_2K

Viewer • Updated 18 days ago • 2k • 40

MoeReward/combined_rlhf_dataset_grpo_arc_main_2K

Viewer • Updated 18 days ago • 2k • 35

MoeReward/combined_rlhf_dataset_grpo_nq_main_2K

Viewer • Updated 18 days ago • 2k • 37

MoeReward/combined_rlhf_dataset_grpo_equal_dist_2K

Viewer • Updated 18 days ago • 2k • 35

MoeReward/combined_rlhf_dataset_grpo_imdb_main

Viewer • Updated Apr 1 • 4k • 26

MoeReward/combined_rlhf_dataset_grpo_metamath_main

Viewer • Updated Apr 1 • 4k • 21

MoeReward/combined_rlhf_dataset_grpo_arc_main

Viewer • Updated Apr 1 • 4k • 22

MoeReward/combined_rlhf_dataset_grpo_nq_main

Viewer • Updated Apr 1 • 4k • 63

MoeReward/combined_rlhf_dataset_grpo_equal_dist

Viewer • Updated Apr 1 • 4k • 9