LUFFY-RL - an Elliott Collection

Elliott's Collections

LUFFY-RL

updated 27 days ago

Elliott/LUFFY-Qwen-Math-7B-Zero
Text Generation • Updated Apr 23 • 90 • 1

Elliott/Qwen2.5-Math-7B-16k-think
Text Generation • Updated 29 days ago • 3.9k • 1

Elliott/Openr1-Math-46k-8192
Viewer • Updated Apr 23 • 45.8k • 406 • 1

Learning to Reason under Off-Policy Guidance
Paper • 2504.14945 • Published Apr 21 • 85

Elliott/LUFFY-Qwen-Math-1.5B-Zero
Text Generation • Updated Apr 23 • 1.69k

Elliott/LUFFY-Qwen-Instruct-7B
Text Generation • Updated Apr 23 • 8 • 1

Elliott/Qwen2.5-Math-7B-SFT
Text Generation • Updated May 2 • 14

Elliott/Qwen2.5-Math-7B-SFT-RL
Text Generation • Updated 28 days ago • 8

Elliott/Openr1-Math-48k-Complement
Viewer • Updated 27 days ago • 47.9k • 29
