Models: 3,352

Active filters: grpo

lmassaron/gemma-2-2b-it-grpo-gsm8k

Text Generation • Updated Feb 24 • 56 • 1

saishshinde15/TBH.AI_Base_Reasoning

Text Generation • Updated Feb 25 • 51 • 1

xingqiang/Llama3.1-8B-GRPO-Planing

Text Generation • Updated Feb 21 • 5 • 1

T1anyu/Qwen2.5-1.5B-Open-R1-GRPO-lora

Text Generation • Updated Feb 21 • 15 • 1

mradermacher/OLMoE-1B-7B-0125-Instruct-grpo-GGUF

Updated Feb 22 • 32 • 1

mradermacher/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-GGUF

Updated Feb 22 • 170 • 1

mradermacher/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-v2-GGUF

Updated Feb 22 • 190 • 1

Metin/LLaMA-3-8B-GRPO-Finance-Math-TR

Text Generation • Updated Feb 24 • 54 • 6

mlabonne/SmolGRPO-135M

Text Generation • Updated Feb 26 • 60 • 4

valoomba/Rombo-V3.1-32B-Reasoner

Text Generation • Updated Feb 24 • 13 • 1

Creekside/Lia-01

Text Generation • Updated Feb 26 • 6 • 1

Locutusque/Thespis-Llama-3.1-8B

Text Generation • Updated Feb 28 • 79 • 13

mradermacher/Rombo-V3.1-32B-Reasoner-i1-GGUF

Updated Feb 26 • 91 • 1

saishshinde15/TBH.AI_Vortex

Text Generation • Updated Feb 26 • 10 • 1

mradermacher/Thespis-Llama-3.1-8B-GGUF

Updated Feb 26 • 109 • 1

mradermacher/Thespis-Llama-3.1-8B-i1-GGUF

Updated Feb 26 • 366 • 1

mradermacher/Lia-01-GGUF

Updated Feb 26 • 133 • 1

Aditya0619/Phi3.5_Reasoning_GRPO

Text Generation • Updated Feb 26 • 7 • 1

mradermacher/TBH.AI_Vortex-GGUF

Updated Feb 26 • 172 • 1

wnj13/Dfyd-R1-Distill-Qwen-1.5B-GRPO

Text Generation • Updated Feb 27 • 9 • 1

Aditya0619/Llama3.2_3B_Reasoning

Text Generation • Updated Feb 27 • 7 • 1

Lyte/QuadConnect2.5-1.5B-v0.1.0b

Text Generation • Updated Feb 28 • 49 • 1

mradermacher/QuadConnect2.5-1.5B-v0.1.0b-GGUF

Updated Mar 1 • 328 • 1

mradermacher/Thinking-cow-7B-GGUF

Updated Mar 3 • 124 • 1

linkyfan/Qwen2.5-3b-GPRO

Text Generation • Updated Mar 8 • 5 • 1

Azzedde/llama3.1-8b-reasoning-grpo

Text Generation • Updated Mar 3 • 8 • 1

QuantFactory/Thespis-Llama-3.1-8B-GGUF

Text Generation • Updated Mar 4 • 366 • 1

ksanjeeb/PipaT1-500M

Updated Mar 4 • 55 • 2

numinousmuses/Levlex-Math-One-14B

Text Generation • Updated Mar 6 • 8 • 1

mradermacher/BAI-Qwen-2.5-1.5B-reasoning-GGUF

Updated Mar 5 • 158 • 1