---
license: gemma
language:
- en
- zh
- es
base_model:
- google/gemma-3-4b-it
tags:
- Google
- Gemma3
- GGUF
- 4b-it
---

# Google Gemma 3 4B Instruction-Tuned GGUF Quantized Models

This repository contains GGUF quantized versions of [Google's Gemma 3 4B instruction-tuned model](https://huggingface.co/google/gemma-3-4b-it), optimized for efficient deployment across various hardware configurations.

## Quantization Results

| Model | Size | Size vs F16 | Reduction vs F16 |
|-------|------|-------------|------------------|
| Q8_0 | 4.1 GB | 53% | 47% |
| Q6_K | 3.2 GB | 41% | 59% |
| Q4_K | 2.5 GB | 32% | 68% |
| Q2_K | 1.7 GB | 22% | 78% |

## Quality vs Size Trade-offs

- **Q8_0**: Near-lossless quality; minimal degradation compared to F16
- **Q6_K**: Very good quality; slight degradation in rare cases
- **Q4_K**: Decent quality; noticeable degradation but still usable for most tasks
- **Q2_K**: Heavily reduced quality; substantial degradation, but the smallest file size

## Recommendations

- For **maximum quality**: use F16 or Q8_0
- For **balanced performance**: use Q6_K
- For **minimum size**: use Q2_K
- For **most use cases**: Q4_K provides a good balance of quality and size

## Usage with llama.cpp

These models can be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and its various interfaces. Example:

```bash
# Running with llama-gemma3-cli (adjust the binary name and paths for your platform)
./llama-gemma3-cli --model gemma-3-4b-it-q4k.gguf --ctx-size 4096 --temp 0.7 --prompt "Write a short story about a robot who discovers it has feelings."
```

## License

This model is released under the same [Gemma license](https://ai.google.dev/gemma/terms) as the original model.

## Original Model Information

This quantized set is derived from [Google's Gemma 3 4B instruction-tuned model](https://huggingface.co/google/gemma-3-4b-it).

### Model Specifications

- **Architecture**: Gemma 3
- **Size Label**: 4B
- **Type**: Instruction-tuned
- **Context Length**: 128K tokens (131,072)
- **Embedding Length**: 2560
- **Languages**: Multilingual; this card declares English, Chinese, and Spanish

## Citation & Attribution

```bibtex
@article{gemma_2025,
  title={Gemma 3},
  url={https://goo.gle/Gemma3Report},
  publisher={Kaggle},
  author={Gemma Team},
  year={2025}
}

@misc{gemma3_quantization_2025,
  title={Quantized Versions of Google's Gemma 3 4B Model},
  author={Lex-au},
  year={2025},
  month={March},
  note={Quantized models (Q8_0, Q6_K, Q4_K, Q2_K) derived from Google's Gemma 3 4B},
  url={https://huggingface.co/lex-au}
}
```
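
## Serving with llama-server

As an alternative to the CLI example above, the same GGUF files can be served over an OpenAI-compatible HTTP API with llama.cpp's `llama-server`. This is a minimal sketch: the flags and the `/v1/chat/completions` endpoint reflect recent llama.cpp builds and may differ by version, and the port and file path are placeholders to adjust for your setup.

```bash
# Start an OpenAI-compatible server on port 8080 using the Q4_K quant
./llama-server --model gemma-3-4b-it-q4k.gguf --ctx-size 4096 --port 8080

# From another shell, send a chat completion request
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Summarize what GGUF quantization does in two sentences."}
        ],
        "temperature": 0.7
      }'
```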