RLHFlow

university

Workflow of Reinforcement Learning from Human Feedback (RLHF). Blog: https://rlhflow.github.io/

baohao updated a collection about 2 months ago

baohao updated a model about 2 months ago

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

RLHFlow's Papers (1)

Submitted by Wei Xiong