Models (302)

Active filters: 4bit

TheCluster/VL-Rethinker-72B-mlx-4bit

Visual Question Answering • Updated Apr 19 • 10

BoltMonkey/boltmonkey_shortreasoning-8b

Text Generation • 8B • Updated Apr 18 • 6

BoltMonkey/boltmonkey_shortreasoning-8b-Q5_K_M-GGUF

Text Generation • 8B • Updated Apr 18 • 3

TheCluster/Comet_12B_V.4-mlx-4bit

Image-Text-to-Text • Updated Apr 23 • 14

TechyCode/tinyllama-sciq-lora

Text Generation • Updated Apr 23

TheCluster/Amoral-Fallen-Omega-Gemma3-12B-mlx-4bit

Image-Text-to-Text • Updated Apr 23 • 130 • 2

Sumo10/Phi-4-mini-instruct-AWQ-4bit

1B • Updated Apr 25 • 58 • 1

Sumo10/Llama-3.2-3B-Instruct-AWQ-4bit

0.8B • Updated Apr 25 • 3

cyberandy/SEOcrate-4B_grpo_new_01

Text Generation • 4B • Updated May 8 • 618 • 6

Chun121/qwen3-4B-rpg-roleplay

Text Generation • 4B • Updated Jul 7 • 1.16k • 14

taetae030/fin-term-model

5B • Updated May 4 • 24 • 1

SujitShelar/llama3-medchat-8b-lora

Question Answering • Updated May 5

boods/mistral-location-extractor-4bit

Text Generation • 4B • Updated May 7 • 287

mradermacher/SEOcrate-4B_grpo_new_01-GGUF

Reinforcement Learning • 4B • Updated Jul 11 • 2.39k • 1

mradermacher/SEOcrate-4B_grpo_new_01-i1-GGUF

Reinforcement Learning • 4B • Updated Jul 11 • 4.75k

vannishh/llama3-2.1B-4bit-finetuned

Updated May 15 • 4

Programmer-RD-AI/ResearchQwen-2.5-3B-LoRA

Question Answering • 3B • Updated May 26 • 4

CodCodingCode/DeepSeek-V2-medical

Text Generation • Updated May 18

tripolskypetr/Plutus-Meta-Llama-3.1-8B-Instruct-bnb-4bit

Text Generation • 8B • Updated May 21 • 28

abdou-u/MNLP_M2_quantized_model

Text Generation • 0.4B • Updated May 19 • 4

HagalazAI/CyberDolphin-2.9.3-mistral-nemo-12b

Text Generation • 12B • Updated May 22 • 7

HagalazAI/CyberDolphin-2.9.3-mistral-nemo-12b-GGUF

Text Generation • 12B • Updated May 23 • 48 • 2

Jimmi42/sarvam-m-4bit-mlx

Text Generation • 4B • Updated May 26 • 13 • 1

geninhu/RakutenAI-7B-instruct-GPTQ

Updated May 30 • 4

umangshikarvar/sentiment-qlora-gptneo

Text Classification • Updated Jun 2 • 311

Fulstac/deepseek-r1-Distill-Qwen-32B-sqlgen-4bit-v1

Text Generation • 33B • Updated Jun 6 • 4

Fulstac/deepseek-r1-Distill-Qwen-32B-lora-4bit-v3

Text Generation • 33B • Updated Jun 6 • 4

acauanrr/qlora-ti-2025-adapter

Updated Jun 7 • 2

abdou-u/MNLP_M3_quantized_dpo_mcqa_model

Multiple Choice • 0.4B • Updated Jun 8 • 19

kevin510/friday-4bit

Text Generation • 2B • Updated Jun 19 • 5