🌸 BloomZ-1.1B LoRA Fine-tuned for English → Myanmar (Burmese) Translation

Model Name: LinoM/bloomz-1b1MM
Base Model: bigscience/bloomz-1b1
Fine-Tuning Method: QLoRA (4-bit LoRA adapters + 8-bit base model)
Frameworks: Hugging Face Transformers + PEFT + BitsAndBytes
Task: English to Myanmar Instruction-style Translation


🧠 Model Details

| Detail | Value |
|---|---|
| Model Architecture | BLOOMZ |
| Base Model Size | 1.1 Billion Parameters |
| Fine-tuning Method | LoRA with QLoRA (4-bit adapters) |
| Optimizer | paged_adamw_8bit |
| Precision | 4-bit LoRA + 8-bit Base |
| Epochs | 3–5 (variable per run) |
| Batch Size | 32 |
| Language Pair | English → Burmese (မြန်မာ) |
| Tokenizer | BLOOM tokenizer (bigscience/tokenizer) |
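
For readers who want to set up a comparable run, the sketch below wires the table's hyperparameters together with Transformers, PEFT, and bitsandbytes. The LoRA rank, alpha, dropout, target modules, and learning rate are illustrative assumptions, not values published for this model.

```python
# Minimal sketch of a comparable QLoRA-style setup (assumptions noted inline).
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 8-bit base model, as stated in the table above.
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-1b1",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=16,                                # assumed rank
    lora_alpha=32,                       # assumed scaling factor
    lora_dropout=0.05,                   # assumed dropout
    target_modules=["query_key_value"],  # attention projection used in BLOOM-family models
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

training_args = TrainingArguments(
    output_dir="bloomz-1b1-en-my-lora",
    per_device_train_batch_size=32,      # batch size from the table
    num_train_epochs=3,                  # 3-5 epochs per the table
    optim="paged_adamw_8bit",            # optimizer from the table
    learning_rate=2e-4,                  # assumed
    logging_steps=50,
)
```

Pairing these objects with a Trainer over the tokenized translation pairs completes the loop; the card does not state which training loop was used.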

📚 Training Data

The model was fine-tuned on a curated mix of open datasets including:

  • 🌍 FLORES200 (en–my)
  • 🎬 OpenSubtitles (Movie subtitles in Myanmar)
  • 📖 Custom instruction-style translation datasets (8 use cases, 200+ pairs per use case; a prompt-formatting sketch follows this list)
  • 🗣️ ai4bharat/indictrans2-en-my (additional Burmese corpora)

📈 Evaluation

| Metric | Score |
|---|---|
| BLEU | 35–40 |
| Translation Style | Instructional, formal |
| Human Evaluation | ✓ Grammar and tone judged correct in 85% of samples |

✅ The model excels at translating English prompts into formal Burmese suitable for education, scripts, and user guides.
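
To check the BLEU range above against your own test set, sacreBLEU is a common choice. The evaluation data and tokenization behind the 35–40 figure are not specified here, and Burmese scores depend heavily on how the text is segmented, so treat any comparison as approximate.

```python
# Sketch of corpus-level BLEU with sacreBLEU (pip install sacrebleu).
import sacrebleu

hypotheses = ["...model output 1...", "...model output 2..."]  # your translations
references = [["...reference 1...", "...reference 2..."]]      # one reference stream, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")
```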


🔧 How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
from peft import PeftModel

# Load the 8-bit quantized base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-1b1",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
lora = PeftModel.from_pretrained(base, "LinoM/bloomz-1b1MM")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-1b1")

# Wrap the adapted model in a text-generation pipeline.
translator = pipeline("text-generation", model=lora, tokenizer=tokenizer)

text = "Translate into Burmese: What is your favorite subject?"
output = translator(text, max_new_tokens=100)
print(output[0]["generated_text"])
```
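
The pipeline call above returns the prompt together with the continuation. If you prefer to keep only the Burmese output, you can call generate() directly and decode just the new tokens; the generation settings here are assumptions, not documented defaults for this adapter.

```python
# Decode only the newly generated tokens (continuation of the snippet above).
import torch

prompt = "Translate into Burmese: Good morning, teacher."
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)

with torch.no_grad():
    out = lora.generate(**inputs, max_new_tokens=100, do_sample=False)

new_tokens = out[0][inputs["input_ids"].shape[1]:]  # strip the prompt tokens
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```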