Models: 67

Active filters: neuralmagic

RedHatAI/Qwen3-8B-speculator.eagle3

Text Generation • 1B • Updated Jul 29 • 433 • 3

nm-testing/Llama-3.1-8B-Instruct-speculator.eagle3-converted

Text Generation • 1.0B • Updated Jul 30 • 3

RedHatAI/SmolLM3-3B-quantized.w4a16

0.9B • Updated Jul 31 • 14

LiyuanLucasLiu/Qwen2.5-0.5B-Instruct-quantized.w8a8-RedHatAI

Text Generation • 0.6B • Updated Aug 4 • 446

RedHatAI/Devstral-Small-2507-FP8-Dynamic

Text Generation • 24B • Updated 14 days ago • 119

RedHatAI/Devstral-Small-2507-quantized.w8a8

Text Generation • 24B • Updated 14 days ago • 529 • 1

RedHatAI/Devstral-Small-2507-quantized.w4a16

Text Generation • 4B • Updated 14 days ago • 2.24k