ypwang61's Collections
One-Shot RLVR

updated 30 days ago

A collection of models and papers for the work "Reinforcement Learning for Reasoning in Large Language Models with One Training Example".

  • Reinforcement Learning for Reasoning in Large Language Models with One Training Example

    Paper • 2504.20571 • Published Apr 29 • 96

  • ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1

    Text Generation • 2B • Updated May 19 • 1.09k

  • ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi13

    Text Generation • 2B • Updated May 19 • 1.09k

  • ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1_pi13

    Text Generation • 2B • Updated May 19 • 108

  • ypwang61/One-Shot-RLVR-Qwen2.5-Math-7B-pi1

    Text Generation • 8B • Updated May 19 • 31

  • ypwang61/One-Shot-RLVR-Qwen2.5-Math-7B-pi1_pi13

    Text Generation • 8B • Updated May 19 • 30

  • ypwang61/One-Shot-RLVR-Qwen2.5-7B-pi1

    8B • Updated Jun 8 • 63

  • ypwang61/One-Shot-RLVR-Qwen2.5-7B-1.2k-dsr-sub

    8B • Updated Jun 8 • 21

  • ypwang61/One-Shot-RLVR-R1-Distill-1.5B-pi1

    2B • Updated Jun 3 • 27

  • ypwang61/One-Shot-RLVR-R1-Distill-1.5B-4-shot

    2B • Updated Jun 3 • 21

  • ypwang61/One-Shot-RLVR-R1-Distill-1.5B-16-shot

    2B • Updated Jun 3 • 31

  • ypwang61/One-Shot-RLVR-R1-Distill-1.5B-1.2k-dsr-sub

    2B • Updated Jun 3 • 23

  • ypwang61/One-Shot-RLVR-Datasets

    Viewer • Updated May 19 • 1.98k • 111 • 3