---
license: gemma
language:
- en
- zh
- es
base_model:
- google/gemma-3-4b-it
tags:
- Google
- Gemma3
- GGUF
- 4b-it
---

# Google Gemma 3 4B Instruction-Tuned GGUF Quantized Models

This repository contains GGUF quantized versions of [Google's Gemma 3 4B instruction-tuned model](https://huggingface.co/google/gemma-3-4b-it), optimized for efficient deployment across various hardware configurations.

## Quantization Results

| Model | Size | Size vs F16 | Reduction vs F16 |
|-------|------|-------------|------------------|
| Q8_0 | 4.1 GB | 53% | 47% |
| Q6_K | 3.2 GB | 41% | 59% |
| Q4_K | 2.5 GB | 32% | 68% |
| Q2_K | 1.7 GB | 22% | 78% |

## Quality vs Size Trade-offs

- **Q8_0**: Near-lossless quality; minimal degradation compared to F16
- **Q6_K**: Very good quality; slight degradation in rare cases
- **Q4_K**: Decent quality; noticeable degradation but still usable for most tasks
- **Q2_K**: Heavily reduced quality; substantial degradation, but the smallest file size

## Recommendations

- For **maximum quality**: use F16 or Q8_0
- For **balanced performance**: use Q6_K
- For **minimum size**: use Q2_K
- For **most use cases**: Q4_K provides a good balance of quality and size

## Usage with llama.cpp

These models can be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and its various interfaces. Example:

```bash
# Running with llama-gemma3-cli (adjust the binary name and paths for your platform)
./llama-gemma3-cli --model gemma-3-4b-it-q4k.gguf --ctx-size 4096 --temp 0.7 --prompt "Write a short story about a robot who discovers it has feelings."
```

## License

This model is released under the same [Gemma license](https://ai.google.dev/gemma/terms) as the original model.

## Original Model Information

This quantized set is derived from [Google's Gemma 3 4B instruction-tuned model](https://huggingface.co/google/gemma-3-4b-it).

### Model Specifications

- **Architecture**: Gemma 3
- **Size Label**: 4B
- **Type**: Instruction-tuned
- **Context Length**: 128K tokens (131,072)
- **Embedding Length**: 2560
- **Languages**: Multilingual; this card declares English, Chinese, and Spanish

## Citation & Attribution

```bibtex
@article{gemma_2025,
  title={Gemma 3},
  url={https://goo.gle/Gemma3Report},
  publisher={Kaggle},
  author={Gemma Team},
  year={2025}
}

@misc{gemma3_quantization_2025,
  title={Quantized Versions of Google's Gemma 3 4B Model},
  author={Lex-au},
  year={2025},
  month={March},
  note={Quantized models (Q8_0, Q6_K, Q4_K, Q2_K) derived from Google's Gemma 3 4B},
  url={https://huggingface.co/lex-au}
}
```
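
## Serving with llama-server

As an alternative to the CLI example above, the same GGUF files can be served over an OpenAI-compatible HTTP API with llama.cpp's `llama-server`. This is a minimal sketch: the flags and the `/v1/chat/completions` endpoint reflect recent llama.cpp builds and may differ by version, and the port and file path are placeholders to adjust for your setup.

```bash
# Start an OpenAI-compatible server on port 8080 using the Q4_K quant
./llama-server --model gemma-3-4b-it-q4k.gguf --ctx-size 4096 --port 8080

# From another shell, send a chat completion request
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Summarize what GGUF quantization does in two sentences."}
        ],
        "temperature": 0.7
      }'
```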