Models: 2,466

Active filters: quantized

duydq12/Qwen2.5-Coder-3B-Instruct-FP8-dynamic

Text Generation • 3B • Updated Jun 9 • 5

nikodoz/qwen2.5-7b-instruct-int4

Text Generation • 4B • Updated Jun 10 • 3 • 1

ReallyFloppyPenguin/Holo1-3B-GGUF

3B • Updated Jun 10 • 74 • 2

QuantStack/Phantom_Wan_14B_FusionX-GGUF

Image-to-Video • 14B • Updated Jun 12 • 2.09k • 31

LibraxisAI/Qwen3-235B-A22B-MLX-Q5

Text Generation • 235B • Updated Jun 12 • 54

sausheong/lexsg

Text Generation • 8B • Updated Jun 12

botirk/tiny-prompt-task-complexity-classifier

Text Classification • Updated Jun 12 • 7 • 1

Argonaut790/PA-stage2-Qwen7B-147-GGUF

8B • Updated Jun 12 • 3

komixenon/Llama-Sahabat-AI-v2-70B-IT-GGUF

Text Generation • 71B • Updated Jun 14 • 21

ReallyFloppyPenguin/sarvam-m-GGUF

24B • Updated Jun 14 • 24 • 1

ragunath-ravi/quantized-whisper-mini-ta

Automatic Speech Recognition • Updated Jun 15

cello78/sambanova-llama2-100-gguf-q8

Text Generation • 7B • Updated Jun 14 • 6

althayr/Gemma-3-Gaia-PT-BR-4b-it-GGUF

Text Generation • 4B • Updated Jun 14 • 54

althayr/Gemma-3-Gaia-PT-BR-4b-it-Q8_0-GGUF

Text Generation • 4B • Updated Jun 14 • 5

ReallyFloppyPenguin/DeepSeek-R1-0528-Qwen3-8B-GGUF

8B • Updated Jul 5 • 66

ReallyFloppyPenguin/MiniCPM4-8B-GGUF

8B • Updated Jun 14 • 13

ReallyFloppyPenguin/Nemotron-Research-Reasoning-Qwen-1.5B-GGUF

2B • Updated Jun 14 • 28 • 1

Renugadevi82/cisco-nx-ai-4bit

1B • Updated Jun 16 • 7

ReallyFloppyPenguin/OpenCodeReasoning-Nemotron-14B-GGUF

15B • Updated Jun 16 • 39 • 1

ReallyFloppyPenguin/Jan-nano-GGUF

4B • Updated Jun 16 • 42

ReallyFloppyPenguin/Qwen2.5-Math-7B-GGUF

ReallyFloppyPenguin/Qwen3-0.6B-GGUF

0.8B • Updated Jun 16 • 62

ReallyFloppyPenguin/Holo1-7B-GGUF

8B • Updated Jun 16 • 38

janni-t/qwen3-embedding-0.6b-int8-tei-onnx

Sentence Similarity • Updated Jun 17 • 14

steampunque/Qwen3-4B-Hybrid-GGUF

4B • Updated Jun 17 • 7

ReallyFloppyPenguin/DeepSeek-R1-Distill-Qwen-32B-GGUF

33B • Updated Jul 5 • 11

ReallyFloppyPenguin/Gemma-3-Gaia-PT-BR-4b-it-GGUF

4B • Updated Jun 17 • 57

steampunque/Qwen3-32B-Hybrid-GGUF

33B • Updated Jun 17 • 35

yukihamada/buzzquan-sensei-q8

Text Generation • Updated Jun 18 • 13

yukihamada/buzzquan-student-q8

Text Generation • Updated Jun 18 • 4