dshin/flan-t5-ppo-user-e-batch-size-64-use-violation • Reinforcement Learning • Updated Mar 14, 2023 • 28
vincentmin/opt-125m-eli5-rl-finetune-128-8-8-1.4e-5_ada • Reinforcement Learning • Updated Apr 10, 2023
dshin/flan-t5-ppo-user-a-allenai-prosocial-dialog-testing-upload • Reinforcement Learning • Updated Apr 12, 2023 • 13
mariosirt/EleutherAI-gpt-neo-125m-detoxified-perspective • Reinforcement Learning • Updated Jun 11, 2023 • 15