BLEUBERI - a yapeichang Collection

updated Jun 6

This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".