yapeichang's Collections
BLEUBERI

Updated Jun 6

This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".

  • BLEUBERI: BLEU is a surprisingly effective reward for instruction following

    Paper • 2505.11080 • Published May 16 • 5

  • yapeichang/BLEUBERI-Tulu3-50k

    Viewer • Updated Jun 9 • 50k • 31 • 1

  • yapeichang/Qwen2.5-7B-BLEUBERI

    Text Generation • Updated Jun 17 • 184 • 1

  • yapeichang/Qwen2.5-7B-RM8B

    Text Generation • Updated Jun 5 • 5

  • yapeichang/Qwen2.5-7B-SFT

    Text Generation • Updated Jun 5 • 3

  • yapeichang/Qwen2.5-3B-BLEUBERI

    Text Generation • Updated Jun 5 • 3

  • yapeichang/Qwen2.5-3B-RM8B

    Text Generation • Updated Jun 17 • 10

  • yapeichang/Qwen2.5-3B-SFT

    Text Generation • Updated Jun 5 • 3

  • yapeichang/Llama-3.1-8B

    Text Generation • Updated Jun 5 • 3

  • yapeichang/Llama-3.1-8B-BLEUBERI

    Text Generation • Updated Jun 5 • 4

  • yapeichang/Llama-3.1-8B-RM8B

    Text Generation • Updated Jun 5 • 4

  • yapeichang/Llama-3.1-8B-SFT

    Text Generation • Updated Jun 17 • 4