# Swahili Gemma 1B - GGUF
Quantized GGUF versions of Swahili Gemma 1B, a fine-tuned Gemma 3 1B instruction model specialized for English-to-Swahili translation and Swahili conversational AI. The model accepts input in both English and Swahili but outputs responses exclusively in Swahili.
## Translation Performance

### Model Comparison
| Model | Parameters | BLEU | chrF++ | Efficiency* |
|---|---|---|---|---|
| Gemma 3 4B | 4B | 10.9 | 44.1 | 2.7 |
| Swahili Gemma 1B | 1B | 27.6 | 56.8 | 27.6 |
| Gemma 3 27B | 27B | 29.4 | 60.0 | 1.1 |
| GPT-5 Mini | ~8B | 31.8 | 62.4 | 4.0 |
| Gemini 2.0 Flash | Large | 35.6 | 64.6 | N/A |
*Efficiency = BLEU Score / Parameters (in billions)
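As a quick sanity check, the efficiency column can be reproduced directly from the BLEU scores and parameter counts in the table above:

```python
# Reproduce the Efficiency column: BLEU score / parameters (in billions).
# Values come from the comparison table; GPT-5 Mini's size is an estimate,
# and Gemini 2.0 Flash is omitted because its parameter count is not public.
models = {
    "Gemma 3 4B": (10.9, 4.0),
    "Swahili Gemma 1B": (27.6, 1.0),
    "Gemma 3 27B": (29.4, 27.0),
    "GPT-5 Mini": (31.8, 8.0),
}

for name, (bleu, params_b) in models.items():
    print(f"{name}: {bleu / params_b:.1f} BLEU per billion parameters")
```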
### Key Performance Insights

- **Efficiency leader**: Achieves the highest BLEU-to-parameter ratio (27.6 BLEU per billion parameters)
- **Size advantage**: Outperforms Gemma 3 4B (4x larger) by 153% on BLEU score
- **Competitive quality**: Achieves 94% of Gemma 3 27B's BLEU performance with 27x fewer parameters
- **Practical deployment**: Runs efficiently on consumer hardware while maintaining quality
### Evaluation Details

- **Dataset**: FLORES-200 English→Swahili (1,012 translation pairs)
- **Metrics**: BLEU (bilingual evaluation understudy) and chrF++ (character n-gram F-score)
- **Evaluation**: Zero-shot translation performance
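Both metrics can be computed with the sacrebleu library. The snippet below is a minimal sketch with placeholder sentences, not the exact harness used to produce the scores above:

```python
# Sketch: scoring translations with BLEU and chrF++ via sacrebleu.
from sacrebleu.metrics import BLEU, CHRF

hypotheses = ["Habari, hali yako ikoje leo?"]   # model outputs (placeholder)
references = [["Habari yako leo?"]]             # one reference stream, aligned with hypotheses

bleu = BLEU()
chrf = CHRF(word_order=2)  # word_order=2 yields chrF++ rather than plain chrF

print(bleu.corpus_score(hypotheses, references))
print(chrf.corpus_score(hypotheses, references))
```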
## Quick Start

```bash
# Install the Hugging Face Hub client
pip install huggingface_hub
```

```python
# Download the recommended Q4_K_M quantization
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CraneAILabs/swahili-gemma-1b-GGUF",
    local_dir="swahili-gemma-1b-GGUF",
    allow_patterns=["Q4_K_M/*"],  # download only the Q4_K_M version
)
```
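If you only need the single recommended file rather than the whole folder, `hf_hub_download` also works; the file path below matches the one used in the usage examples later in this card:

```python
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="CraneAILabs/swahili-gemma-1b-GGUF",
    filename="Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
    local_dir="swahili-gemma-1b-GGUF",
)
print(path)  # local path to the downloaded GGUF file
```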
## Available Quantizations

| Quantization | Folder | File Size | Quality | Use Case |
|---|---|---|---|---|
| F32 | `F32/` | ~3.8GB | Highest | Research & benchmarking |
| F16 | `F16/` | ~1.9GB | Highest | Maximum-quality inference |
| Q8_0 | `Q8_0/` | ~1.0GB | Very High | Production with ample resources |
| Q5_K_M | `Q5_K_M/` | ~812MB | High | Balanced quality/size |
| Q4_K_M | `Q4_K_M/` | ~769MB | Good | Recommended for most users |
| Q4_K_S | `Q4_K_S/` | ~745MB | Good | Resource-constrained environments |
| Q3_K_M | `Q3_K_M/` | ~689MB | Fair | Mobile/edge deployment |
| Q2_K | `Q2_K/` | ~658MB | Lower | Minimal resource usage |
## Usage with llama.cpp

### Basic Translation

```bash
# English-to-Swahili translation
./llama-cli \
    --model swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf \
    --prompt "Translate to Swahili: Hello, how are you today?" \
    --temp 0.3 \
    --top-p 0.95 \
    --top-k 64 \
    --repeat-penalty 1.1 \
    -n 128
```
## Usage with Ollama

```bash
# Create the model from the GGUF file
ollama create swahili-gemma-1b -f Modelfile

# Use for translation
ollama run swahili-gemma-1b "Translate to Swahili: Good morning!"

# Use for conversation
ollama run swahili-gemma-1b "Hujambo! Je, unaweza kunisaidia?"
```
### Modelfile Example

Gemma 3 models use the `<start_of_turn>` / `<end_of_turn>` chat format:

```
FROM swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf

TEMPLATE """<start_of_turn>user
{{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }}<end_of_turn>
<start_of_turn>model
{{ .Response }}<end_of_turn>
"""

PARAMETER stop "<start_of_turn>"
PARAMETER stop "<end_of_turn>"
```
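Once the model has been created, the official `ollama` Python package can call it programmatically. A minimal sketch, assuming the model was registered as `swahili-gemma-1b` by the `ollama create` step above and that an Ollama server is running:

```python
# Requires: pip install ollama
import ollama

response = ollama.generate(
    model="swahili-gemma-1b",
    prompt="Translate to Swahili: Good morning!",
    options={  # mirrors the recommended sampling parameters
        "temperature": 0.3,
        "top_p": 0.95,
        "top_k": 64,
        "repeat_penalty": 1.1,
    },
)
print(response["response"])
```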
## Usage with Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Initialize the model
llm = Llama(
    model_path="swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
    n_ctx=2048,
    n_threads=8,
    verbose=False,
)

# Generate a translation
response = llm(
    "Translate to Swahili: Hello, how are you today?",
    max_tokens=128,
    temperature=0.3,
    top_p=0.95,
    top_k=64,
    repeat_penalty=1.1,
)
print(response["choices"][0]["text"])
```
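For conversational use, `create_chat_completion` lets llama-cpp-python apply the chat template embedded in the GGUF file. A minimal sketch, reusing the `llm` instance from above:

```python
# Chat-style generation with the same model instance
chat = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hujambo! Je, unaweza kunisaidia?"}],
    max_tokens=128,
    temperature=0.3,
    top_p=0.95,
    top_k=64,
    repeat_penalty=1.1,
)
print(chat["choices"][0]["message"]["content"])
```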
## Language Capabilities

- **Input languages**: English and Swahili
- **Output language**: Swahili only
- **Primary focus**: English-to-Swahili translation and Swahili conversation
## Performance Metrics

### Translation Quality (BLEU Scores)

| Model | BLEU Score | chrF++ |
|---|---|---|
| Swahili Gemma 1B | 23.64 | 52.26 |
| ChatGPT-4o-latest | [TBD] | [TBD] |
| Other Models | [TBD] | [TBD] |

Evaluated on 1,012 English-to-Swahili translation samples.
## Capabilities

- **Translation**: English-to-Swahili translation
- **Conversational AI**: Natural dialogue in Swahili
- **Summarization**: Text summarization in Swahili
- **Writing**: Creative and informational writing in Swahili
- **Question Answering**: General-knowledge responses in Swahili
## Recommended Parameters

```bash
# Optimal settings for translation tasks
--temp 0.3
--top-p 0.95
--top-k 64
--repeat-penalty 1.1
--ctx-size 2048
```
## Related Models

- **Original model**: CraneAILabs/swahili-gemma-1b - full-precision Hugging Face model
- **LiteRT mobile**: CraneAILabs/swahili-gemma-1b-litert - mobile deployment
- **Ollama**: crane-ai-labs/swahili-gemma-1b - ready-to-run models
## Technical Details

- **Base model**: google/gemma-3-1b-it
- **Architecture**: Gemma 3
- **Context length**: 4,096 tokens
- **Quantization**: GGUF format with multiple precision levels
- **Compatibility**: llama.cpp, Ollama, Jan, LM Studio, and other GGUF engines
## Use Cases

- **Offline translation**: Run Swahili translation without internet access
- **Local AI assistant**: Swahili conversational AI on your own machine
- **Educational tools**: Language-learning applications
- **Content creation**: Generate Swahili content locally
- **Research**: Swahili language model experiments
## Limitations

- **Language output**: Responds only in Swahili
- **Quantization trade-offs**: Lower-bit quantizations may reduce quality
- **Context limit**: 4K tokens for optimal performance
- **Specialized tasks**: May need fine-tuning for specific domains
## License

This model is released under the Gemma Terms of Use. Please review the terms before use.
## Acknowledgments

- **Google**: For the Gemma 3 base model, support, and guidance
- **Community**: For Swahili language resources and datasets
- **Gilbert Korir** (Msingi AI, Nairobi, Kenya)
- **Alfred Malengo Kondoro** (Hanyang University, Seoul, South Korea)
## Citation

If you use these GGUF quantizations in your research or applications, please cite:

```bibtex
@misc{crane_ai_labs_2025,
  author = {Bakunga Bronson and Kato Steven Mubiru and Lwanga Caleb and Gimei Alex and Kavuma Lameck and Roland Ganafa and Sibomana Glorry and Atuhaire Collins and JohnRoy Nangeso and Tukamushaba Catherine},
  title = {Swahili Gemma: A Fine-tuned Gemma 3 1B Model for Swahili conversational AI},
  year = {2025},
  url = {https://huggingface.co/CraneAILabs/swahili-gemma-1b},
  organization = {Crane AI Labs}
}
```
Built with ❤️ by Crane AI Labs

*Swahili Gemma - Your helpful Swahili AI companion, optimized for local deployment*