Models: 3,352
Sort: Trending
Active filters: grpo
lmassaron/gemma-2-2b-it-grpo-gsm8k • Text Generation • Updated Feb 24 • 56 • 1
saishshinde15/TBH.AI_Base_Reasoning • Text Generation • Updated Feb 25 • 51 • 1
xingqiang/Llama3.1-8B-GRPO-Planing • Text Generation • Updated Feb 21 • 5 • 1
T1anyu/Qwen2.5-1.5B-Open-R1-GRPO-lora • Text Generation • Updated Feb 21 • 15 • 1
mradermacher/OLMoE-1B-7B-0125-Instruct-grpo-GGUF • Updated Feb 22 • 32 • 1
mradermacher/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-GGUF • Updated Feb 22 • 170 • 1
mradermacher/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-v2-GGUF • Updated Feb 22 • 190 • 1
Metin/LLaMA-3-8B-GRPO-Finance-Math-TR • Text Generation • Updated Feb 24 • 54 • 6
mlabonne/SmolGRPO-135M • Text Generation • Updated Feb 26 • 60 • 4
valoomba/Rombo-V3.1-32B-Reasoner • Text Generation • Updated Feb 24 • 13 • 1
Creekside/Lia-01 • Text Generation • Updated Feb 26 • 6 • 1
Locutusque/Thespis-Llama-3.1-8B • Text Generation • Updated Feb 28 • 79 • 13
mradermacher/Rombo-V3.1-32B-Reasoner-i1-GGUF • Updated Feb 26 • 91 • 1
saishshinde15/TBH.AI_Vortex • Text Generation • Updated Feb 26 • 10 • 1
mradermacher/Thespis-Llama-3.1-8B-GGUF • Updated Feb 26 • 109 • 1
mradermacher/Thespis-Llama-3.1-8B-i1-GGUF • Updated Feb 26 • 366 • 1
mradermacher/Lia-01-GGUF • Updated Feb 26 • 133 • 1
Aditya0619/Phi3.5_Reasoning_GRPO • Text Generation • Updated Feb 26 • 7 • 1
mradermacher/TBH.AI_Vortex-GGUF • Updated Feb 26 • 172 • 1
wnj13/Dfyd-R1-Distill-Qwen-1.5B-GRPO • Text Generation • Updated Feb 27 • 9 • 1
Aditya0619/Llama3.2_3B_Reasoning • Text Generation • Updated Feb 27 • 7 • 1
Lyte/QuadConnect2.5-1.5B-v0.1.0b • Text Generation • Updated Feb 28 • 49 • 1
mradermacher/QuadConnect2.5-1.5B-v0.1.0b-GGUF • Updated Mar 1 • 328 • 1
mradermacher/Thinking-cow-7B-GGUF • Updated Mar 3 • 124 • 1
linkyfan/Qwen2.5-3b-GPRO • Text Generation • Updated Mar 8 • 5 • 1
Azzedde/llama3.1-8b-reasoning-grpo • Text Generation • Updated Mar 3 • 8 • 1
QuantFactory/Thespis-Llama-3.1-8B-GGUF • Text Generation • Updated Mar 4 • 366 • 1
ksanjeeb/PipaT1-500M • Updated Mar 4 • 55 • 2
numinousmuses/Levlex-Math-One-14B • Text Generation • Updated Mar 6 • 8 • 1
mradermacher/BAI-Qwen-2.5-1.5B-reasoning-GGUF • Updated Mar 5 • 158 • 1