argilla/ultrafeedback-binarized-avg-rating-for-dpo-filtered
57.7k • 8
LLMs, NLP, Alignment, DPO, RLHF, data labeling, text-classification, text-generation, token-classification