Models

30,993

Active filters: 8-bit

nvidia/Qwen3.6-27B-NVFP4

Text Generation • 18B • Updated 5 days ago • 185k • 257

deepseek-ai/DeepSeek-V4-Pro-DSpark

Text Generation • 889B • Updated 1 day ago • 10.3k • 373

deepseek-ai/DeepSeek-V4-Flash-DSpark

Text Generation • 165B • Updated 1 day ago • 40.3k • 157

nvidia/GLM-5.2-NVFP4

Text Generation • 381B • Updated 8 days ago • 237k • 229

nvidia/Qwen3.6-35B-A3B-NVFP4

Text Generation • 19B • Updated 22 days ago • 6.35M • 425

deepseek-ai/DeepSeek-V4-Flash

Text Generation • 158B • Updated 13 days ago • 2.15M • 1.69k

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 13 days ago • 1.24M • 5.15k

AEON-7/Ornith-1.0-35B-AEON-Ultimate-Uncensored-NVFP4

Text Generation • 21B • Updated 6 days ago • 10.4k • 42

google/gemma-4-E2B-it-qat-mobile-transformers

Any-to-Any • 2B • Updated 29 days ago • 23.4k • 110

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 6.92M • 4.76k

nvidia/MiniMax-M3-NVFP4

Text Generation • 247B • Updated 9 days ago • 54.7k • 49

nvidia/Mistral-Medium-3.5-128B-NVFP4

Text Generation • 84B • Updated 4 days ago • 5.24k • 18

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 4.12M • 4.94k

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4

Text Generation • 335B • Updated 10 days ago • 487k • 236

0xSero/GLM-5.2-504B-Nvidia

Text Generation • 293B • Updated 8 days ago • 814 • 19

PhalaCloud/GLM-5.2-W4AFP8

Text Generation • 392B • Updated 13 days ago • 21.5k • 33

0xSero/GLM-5.2-504B

Text Generation • 290B • Updated 10 days ago • 18.2k • 29

OpenYourMind/GLM-5.2-abliterated

432B • Updated 6 days ago • 23

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

Text Generation • 67B • Updated May 1 • 1.04M • 379

nvidia/Gemma-4-26B-A4B-NVFP4

Text Generation • 14B • Updated May 11 • 2.2M • 108

meituan-longcat/LongCat-2.0-INT8

Text Generation • 1.8T • Updated about 21 hours ago • 9

nvidia/DeepSeek-V4-Flash-NVFP4

Text Generation • 167B • Updated 20 days ago • 394k • 57

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 8.3k • 1.47k

nvidia/DeepSeek-V4-Pro-NVFP4

Text Generation • 910B • Updated 21 days ago • 162k • 69

XiaomiMiMo/MiMo-V2.5-Pro-FP4-DFlash

Text Generation • 554B • Updated 27 days ago • 46.9k • 142

poolside/Laguna-XS-2.1-NVFP4

Text Generation • 20B • Updated 3 days ago • 567 • 6

fraserprice/DeepSeek-V4-Flash-Abliterated-DSpark

165B • Updated 4 days ago • 345 • 6

saricles/MiniMax-M2.7-REAP-172B-A10B-NVFP4-GB10

Text Generation • 87B • Updated Apr 19 • 4.69k • 32

unsloth/Qwen3.6-27B-NVFP4

Image-Text-to-Text • 19B • Updated May 31 • 1.24M • 100

mlx-community/gemma-4-12B-it-8bit

Image-Text-to-Text • 3B • Updated 27 days ago • 57.3k • 40