---
license: gemma
language:
- en
- zh
- es
base_model:
- google/gemma-3-1b-it
tags:
- Google
- Gemma3
- GGUF
- 1b-it
---
# Google Gemma 3 1B Instruction-Tuned GGUF Quantized Models
This repository contains GGUF quantized versions of Google's Gemma 3 1B instruction-tuned model, optimized for efficient deployment across various hardware configurations.
## Quantization Results

| Model | Size | Compression Ratio | Size Reduction |
|---|---|---|---|
| Q8_0 | 1.07 GB | 54% | 46% |
| Q6_K | 1.01 GB | 51% | 49% |
| Q4_K | 0.81 GB | 40% | 60% |
| Q2_K | 0.69 GB | 34% | 66% |
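The percentages above can be sanity-checked with a little arithmetic. A minimal sketch, assuming the original F16 model is roughly 2.0 GB (a value inferred from the table's own ratios, not measured); the computed figures agree with the table to within a percentage point:

```python
# Check the compression figures against an ASSUMED ~2.0 GB F16 baseline
# (inferred from the table's percentages, not taken from the repository).
F16_GB = 2.0

quants = {"Q8_0": 1.07, "Q6_K": 1.01, "Q4_K": 0.81, "Q2_K": 0.69}

for name, size_gb in quants.items():
    ratio = size_gb / F16_GB * 100   # quantized size relative to F16
    reduction = 100 - ratio          # space saved versus F16
    print(f"{name}: {ratio:.1f}% of F16, {reduction:.1f}% smaller")
```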
### Quality vs Size Trade-offs
- Q8_0: Near-lossless quality, minimal degradation compared to F16
- Q6_K: Very good quality, slight degradation in some rare cases
- Q4_K: Decent quality, noticeable degradation but still usable for most tasks
- Q2_K: Heavily reduced quality, substantial degradation but smallest file size
## Recommendations
- For maximum quality: Use Q8_0
- For balanced performance: Use Q6_K
- For minimum size: Use Q2_K
- For most use cases: Q4_K provides a good balance of quality and size
## Usage with llama.cpp

These models can be used with llama.cpp and its various interfaces. Example:

```bash
# Run with llama-gemma3-cli (adjust paths as needed; add the .exe suffix on Windows)
./llama-gemma3-cli --model Google.Gemma-3-1b-it-Q4_K.gguf --ctx-size 4096 --temp 0.7 --prompt "Write a short story about a robot who discovers it has feelings."
```
## License
This model is released under the same Gemma license as the original model.
## Original Model Information
This quantized set is derived from Google's Gemma 3 1B instruction-tuned model.
### Model Specifications
- Architecture: Gemma 3
- Size Label: 1B
- Type: Instruction-tuned
- Context Length: 32K tokens
- Embedding Length: 2048
- Languages: Multilingual, including English, Chinese, and Spanish
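The 32K context length carries a memory cost on top of the model weights: llama.cpp allocates a KV cache that grows linearly with the context size. A rough sketch of that arithmetic, where the layer and head counts below are hypothetical placeholders (read the actual values from the GGUF metadata, e.g. with `gguf-dump`):

```python
# Rough KV-cache size estimate for a given context window.
# n_layers, n_kv_heads and head_dim here are HYPOTHETICAL placeholders,
# not the real Gemma 3 1B values; check the GGUF metadata for those.
def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Two tensors (K and V) per layer, one head_dim vector per KV head
    # per token position, at bytes_per_elem each (2 bytes for an F16 cache).
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

# Full 32K window with placeholder architecture values and an F16 cache
size = kv_cache_bytes(n_ctx=32768, n_layers=26, n_kv_heads=1, head_dim=256)
print(f"{size / 1024**2:.0f} MiB")
```

Halving the context (`--ctx-size 16384`) halves this figure, which is often the easiest lever on memory-constrained hardware.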
## Citation & Attribution

```bibtex
@article{gemma_2025,
  title={Gemma 3},
  url={https://goo.gle/Gemma3Report},
  publisher={Kaggle},
  author={Gemma Team},
  year={2025}
}

@misc{gemma3_quantization_2025,
  title={Quantized Versions of Google's Gemma 3 1B Model},
  author={Lex-au},
  year={2025},
  month={March},
  note={Quantized models (Q8_0, Q6_K, Q4_K, Q2_K) derived from Google's Gemma 3 1B},
  url={https://huggingface.co/lex-au}
}
```