πŸ“Œ Overview

A 4-bit AWQ-quantized version of google/medgemma-4b-it, optimized for efficient inference with the MLX library on Apple silicon. It handles long-context tasks (up to 192k tokens) with reduced memory usage and retains the core capabilities of MedGemma 4B while enabling deployment on edge devices.
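A minimal inference sketch using the mlx-lm package (an assumption, since this card does not specify a loading recipe; `pip install mlx-lm` is required, and the prompt and `max_tokens` values are illustrative):

```python
# Minimal sketch: load the quantized model and run a single chat turn with mlx-lm.
# Requires Apple silicon and `pip install mlx-lm`; values below are illustrative.
from mlx_lm import load, generate

# Download (if needed) and load the quantized weights and tokenizer from the Hub.
model, tokenizer = load("Goraint/medgemma-4b-it-MLX-AWQ-4bit")

# Build a prompt using the model's chat template.
messages = [{"role": "user", "content": "Summarize the key contraindications of ibuprofen."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate a response; max_tokens caps the output length.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```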

Model size: 753M params (Safetensors; tensor types BF16 Β· U32)

Base model: google/medgemma-4b-it