Swahili Gemma 1B - GGUF

Quantized GGUF versions of Swahili Gemma 1B, a fine-tuned Gemma 3 1B instruction model specialized for English-to-Swahili translation and Swahili conversational AI. The model accepts input in both English and Swahili but outputs responses exclusively in Swahili.

πŸ“Š Translation Performance

Model Comparison

Model              Parameters  BLEU   chrF++  Efficiency*
Gemma 3 4B         4B          10.9   44.1     2.7
Swahili Gemma 1B   1B          27.6   56.8    27.6
Gemma 3 27B        27B         29.4   60.0     1.1
GPT-5 Mini         ~8B         31.8   62.4     4.0
Gemini 2.0 Flash   Large       35.6   64.6     N/A

*Efficiency = BLEU Score / Parameters (in billions)
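The efficiency column follows directly from that formula; as a quick sanity check in Python (figures taken from the table above, with Gemini 2.0 Flash omitted because its parameter count is undisclosed):

# Efficiency = BLEU / parameters in billions, per the footnote above
models = {
    "Gemma 3 4B": (10.9, 4),
    "Swahili Gemma 1B": (27.6, 1),
    "Gemma 3 27B": (29.4, 27),
    "GPT-5 Mini": (31.8, 8),  # ~8B is the table's own estimate
}
for name, (bleu, params_b) in models.items():
    print(f"{name}: {bleu / params_b:.1f} BLEU per billion parameters")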

Key Performance Insights

🎯 Efficiency Leader: Highest BLEU-to-parameter ratio in the comparison (27.6 BLEU per billion parameters)
πŸš€ Size Advantage: Outperforms Gemma 3 4B (4x larger) by 153% on BLEU (27.6 vs 10.9)
πŸ’Ž Competitive Quality: Reaches 94% of Gemma 3 27B's BLEU score (27.6 vs 29.4) with 27x fewer parameters
⚑ Practical Deployment: Runs efficiently on consumer hardware while maintaining quality

Evaluation Details

  • Dataset: FLORES-200 Englishβ†’Swahili (1,012 translation pairs)
  • Metrics: BLEU (bilingual evaluation understudy) and chrF++ (character F-score)
  • Evaluation: Zero-shot translation performance (see the scoring sketch below)
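
One way to compute these metrics is with the sacrebleu library; this is a minimal sketch assuming you already have model outputs and FLORES-200 references as parallel lists of strings (the exact harness used for the numbers above is not specified):

import sacrebleu

# Illustrative placeholders for system outputs and references
hypotheses = ["Habari, hali yako ikoje leo?"]
references = [["Hujambo, habari yako leo?"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)  # word_order=2 gives chrF++
print(f"BLEU: {bleu.score:.1f}  chrF++: {chrf.score:.1f}")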

πŸš€ Quick Start

# Install the Hugging Face Hub client (shell)
pip install huggingface_hub

# Download the recommended Q4_K_M quantization (Python)
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CraneAILabs/swahili-gemma-1b-GGUF",
    local_dir="swahili-gemma-1b-GGUF",
    allow_patterns=["Q4_K_M/*"]  # download only the Q4_K_M folder
)
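
If you only need the single GGUF file rather than the whole folder, hf_hub_download works as well; the file path here is taken from the llama.cpp example below:

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="CraneAILabs/swahili-gemma-1b-GGUF",
    filename="Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
)
print(path)  # local path to the downloaded model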

πŸ“Š Available Quantizations

Quantization  Folder   File Size  Quality    Use Case
F32           F32/     ~3.8GB     Highest    Research & benchmarking
F16           F16/     ~1.9GB     Highest    Maximum-quality inference
Q8_0          Q8_0/    ~1.0GB     Very High  Production with ample resources
Q5_K_M        Q5_K_M/  ~812MB     High       Balanced quality/size
Q4_K_M        Q4_K_M/  ~769MB     Good       Recommended for most users
Q4_K_S        Q4_K_S/  ~745MB     Good       Resource-constrained environments
Q3_K_M        Q3_K_M/  ~689MB     Fair       Mobile/edge deployment
Q2_K          Q2_K/    ~658MB     Lower      Minimal resource usage

πŸ’» Usage with llama.cpp

Basic Translation

# English to Swahili translation
./llama-cli \
  --model swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf \
  --prompt "Translate to Swahili: Hello, how are you today?" \
  --temp 0.3 \
  --top-p 0.95 \
  --top-k 64 \
  --repeat-penalty 1.1 \
  -n 128
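
llama.cpp also ships llama-server, which exposes an OpenAI-compatible HTTP API. A minimal client sketch, assuming the server was started with this GGUF on its default port 8080:

import requests

# Assumes: ./llama-server --model swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf --ctx-size 2048
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Translate to Swahili: Hello, how are you today?"}],
        "temperature": 0.3,
        "top_p": 0.95,
        "max_tokens": 128,
    },
)
print(resp.json()["choices"][0]["message"]["content"])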

πŸ”§ Usage with Ollama

# Create model from GGUF
ollama create swahili-gemma-1b -f Modelfile

# Use for translation
ollama run swahili-gemma-1b "Translate to Swahili: Good morning!"

# Use for conversation  
ollama run swahili-gemma-1b "Hujambo! Je, unaweza kunisaidia?"

Modelfile Example

Note that Gemma 3 uses the <start_of_turn> / <end_of_turn> chat format (there is no dedicated system role, so a system prompt is passed as a user turn):

FROM swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf

TEMPLATE """{{ if .System }}<start_of_turn>user
{{ .System }}<end_of_turn>
{{ end }}{{ if .Prompt }}<start_of_turn>user
{{ .Prompt }}<end_of_turn>
{{ end }}<start_of_turn>model
{{ .Response }}<end_of_turn>
"""

PARAMETER stop "<start_of_turn>"
PARAMETER stop "<end_of_turn>"
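
Once created, the model can also be called from Python through the ollama client package (a sketch, assuming pip install ollama and a running Ollama daemon):

import ollama

# Responses come back in Swahili regardless of input language
response = ollama.chat(
    model="swahili-gemma-1b",
    messages=[{"role": "user", "content": "Translate to Swahili: Good morning!"}],
)
print(response["message"]["content"])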

🐍 Usage with Python (llama-cpp-python)

from llama_cpp import Llama

# Initialize model
llm = Llama(
    model_path="swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
    n_ctx=2048,
    n_threads=8,
    verbose=False
)

# Generate translation
response = llm(
    "Translate to Swahili: Hello, how are you today?",
    max_tokens=128,
    temperature=0.3,
    top_p=0.95,
    top_k=64,
    repeat_penalty=1.1
)

print(response['choices'][0]['text'])
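
Because the model is instruction-tuned, the chat API is also available; it applies the chat template embedded in the GGUF rather than sending raw text (reusing the llm instance from above):

# Chat-style call with the same sampling settings
chat = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Translate to Swahili: Hello, how are you today?"}],
    max_tokens=128,
    temperature=0.3,
    top_p=0.95,
    top_k=64,
    repeat_penalty=1.1,
)
print(chat["choices"][0]["message"]["content"])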

🌍 Language Capabilities

  • Input Languages: English + Swahili
  • Output Language: Swahili only
  • Primary Focus: English-to-Swahili translation and Swahili conversation

πŸ“Š Performance Metrics

Translation Quality (BLEU Scores)

Model                  BLEU Score  chrF++
πŸ₯‡ Swahili Gemma 1B    23.64       52.26
πŸ₯ˆ ChatGPT-4o-latest   [TBD]       [TBD]
πŸ₯‰ Other Models        [TBD]       [TBD]

Evaluated on 1,012 English-to-Swahili translation samples.

🎯 Capabilities

  • Translation: English-to-Swahili translation (see the prompting sketch after this list)
  • Conversational AI: Natural dialogue in Swahili
  • Summarization: Text summarization in Swahili
  • Writing: Creative and informational writing in Swahili
  • Question Answering: General knowledge responses in Swahili
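
All of these tasks are driven purely by prompting. A small helper sketch reusing the llama-cpp-python setup from above; the prompt prefixes other than "Translate to Swahili:" are hypothetical, since the card does not prescribe exact formats:

# Hypothetical task prefixes; only "Translate to Swahili:" comes from the card's own examples
TASK_PROMPTS = {
    "translate": "Translate to Swahili: {text}",
    "summarize": "Fupisha maandishi yafuatayo: {text}",  # "Summarize the following text"
    "qa": "{text}",  # plain questions yield Swahili answers
}

def run_task(llm, task, text):
    prompt = TASK_PROMPTS[task].format(text=text)
    out = llm(prompt, max_tokens=256, temperature=0.3, top_p=0.95)
    return out["choices"][0]["text"].strip()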

πŸ’‘ Recommended Parameters

# Optimal settings for translation tasks
--temp 0.3
--top-p 0.95
--top-k 64
--repeat-penalty 1.1
--ctx-size 2048

πŸ”— Related Models

  • Swahili Gemma 1B (original full-precision weights): https://huggingface.co/CraneAILabs/swahili-gemma-1b

πŸ› οΈ Technical Details

  • Base Model: google/gemma-3-1b-it
  • Architecture: Gemma 3
  • Context Length: 4,096 tokens
  • Quantization: GGUF format with multiple precision levels
  • Compatible: llama.cpp, Ollama, Jan, LM Studio, and other GGUF engines

🎨 Use Cases

  • Offline Translation: Run Swahili translation without internet
  • Local AI Assistant: Swahili conversational AI on your machine
  • Educational Tools: Language learning applications
  • Content Creation: Generate Swahili content locally
  • Research: Swahili language model experiments

⚠️ Limitations

  • Language Output: Responds only in Swahili
  • Quantization Trade-offs: Lower bit quantizations may reduce quality
  • Context Limit: 4K tokens for optimal performance
  • Specialized Tasks: May need fine-tuning for specific domains

πŸ“„ License

This model is released under the Gemma Terms of Use. Please review the terms before use.

πŸ™ Acknowledgments

  • Google: For the Gemma 3 base model, support, and guidance
  • Community: For Swahili language resources and datasets
  • Gilbert Korir (Msingi AI, Nairobi, Kenya)
  • Alfred Malengo Kondoro (Hanyang University, Seoul, South Korea)

Citation

If you use these GGUF quantizations in your research or applications, please cite:

@misc{crane_ai_labs_2025,
    author    = {Bakunga Bronson and Kato Steven Mubiru and Lwanga Caleb and Gimei Alex and Kavuma Lameck and Roland Ganafa and Sibomana Glorry and Atuhaire Collins and JohnRoy Nangeso and Tukamushaba Catherine},
    title     = {Swahili Gemma: A Fine-tuned Gemma 3 1B Model for Swahili conversational AI},
    year      = {2025},
    url       = {https://huggingface.co/CraneAILabs/swahili-gemma-1b},
    organization = {Crane AI Labs}
}

Built with ❀️ by Crane AI Labs

Swahili Gemma - Your helpful Swahili AI companion, optimized for local deployment
