A 4-bit AWQ-quantized version of google/medgemma-4b-it, optimized for efficient inference with the MLX library and designed to handle long-context tasks (192k tokens) with reduced resource usage. It retains the core capabilities of medgemma-4b-it while enabling deployment on resource-constrained hardware such as Apple silicon Macs.
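
A minimal inference sketch with `mlx-lm` is shown below. The repository id is a placeholder (substitute this model's actual Hugging Face path), and the prompt and generation settings are illustrative assumptions, not part of this card.

```python
# Minimal inference sketch using mlx-lm (pip install mlx-lm).
# The repo id below is a placeholder -- replace it with this model's
# actual Hugging Face path. Prompt and settings are illustrative.
from mlx_lm import load, generate

# Load the 4-bit quantized weights and tokenizer from the Hub.
model, tokenizer = load("<this-repo-id>")

# MedGemma is instruction-tuned, so format the input as a chat turn.
messages = [
    {"role": "user", "content": "Summarize the contraindications of metformin."}
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

# Generate a response; max_tokens bounds the output length.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```

The same model can also be served from the command line with `mlx_lm.generate --model <this-repo-id> --prompt "..."` if preferred.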