LifelongAlignment/aifgen-long-piecewise
Synthetic Preference Datasets for Continual Reinforcement Learning from Human Feedback - https://github.com/ComplexData-MILA/AIF-Gen