cj453/dense_reward_trainer_final_opt__NumTrainEpochs2_SaveStrategiesepoch_reward_modeling_anthropic_hh Updated Sep 14
cj453/dense_reward_trainer_final_opt__NumTrainEpochs2_SaveStrategiesno_reward_modeling_anthropic_hh Updated Sep 15
cj453/dense_reward_trainer_final_opt__NumTrainEpochs5_SaveStrategiesepoch_reward_modeling_anthropic_hh Updated Sep 16 • 2
cj453/dense_reward_trainer_final_opt__NumTrainEpochs5_SaveStrategiesno_reward_modeling_anthropic_hh Updated Sep 16