Gemma-3-270m Buddha-QA (LoRA 4-bit)
Model Details
- Developed by: sweatSmile
- Base model: google/gemma-3-270m
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Quantization: 4-bit (nf4) with double quantization
- Task type: Question Answering (QA)
- Language: English
- License: Apache-2.0
This model was fine-tuned on a QA dataset about Buddhist teachings and is intended for lightweight question-answering tasks.
Model Sources
- Repository: https://huggingface.co/sweatSmile/Gemma-3-270m-Buddha-QA
- Dataset: sweatSmile/buddha-taught-qa
Uses
Direct Use
- Educational QA about Buddhist texts.
- Lightweight inference on constrained hardware (4-bit quantization).
Downstream Use
- Can be adapted to other domain-specific QA tasks with further LoRA fine-tuning.
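A rough sketch of such adaptation, assuming the PEFT library and a new prompt/completion dataset for the target domain (the adapter settings below simply mirror this model's LoRA configuration, not a prescribed recipe):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Start from the published checkpoint and attach a fresh LoRA adapter
# for a new domain-specific QA dataset.
base = AutoModelForCausalLM.from_pretrained("sweatSmile/Gemma-3-270m-Buddha-QA")

new_adapter = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, new_adapter)
model.print_trainable_parameters()
# ...then train on your own {"prompt": ..., "completion": ...} pairs,
# e.g. with TRL's SFTTrainer as outlined under Training Procedure below.
```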
Out-of-Scope Use
- Not suitable for open-domain QA beyond its training dataset.
- Should not be used for sensitive or factual decision-making without verification.
Bias, Risks, and Limitations
- Dataset is small (699 QA pairs), so generalization is limited.
- Answers are narrow and domain-specific (Buddhist context).
- May generate incomplete or repetitive answers outside training distribution.
Training Details
Training Data
- Dataset: sweatSmile/buddha-taught-qa (699 QA pairs).
- Preprocessed into {"prompt": ..., "completion": ...} format.
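A minimal preprocessing sketch with the datasets library; the column names "question" and "answer" are an assumption and should be adjusted to the actual dataset schema:

```python
from datasets import load_dataset

dataset = load_dataset("sweatSmile/buddha-taught-qa", split="train")

def to_prompt_completion(example):
    # Map each QA pair into the prompt/completion format used for training.
    # "question"/"answer" are assumed column names.
    return {"prompt": example["question"], "completion": example["answer"]}

dataset = dataset.map(to_prompt_completion, remove_columns=dataset.column_names)
print(dataset[0])
```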
Training Procedure
- Frameworks: PEFT + TRL + Transformers
- Precision: 4-bit quantization (nf4, double quantization, compute dtype = bf16 if supported)
- LoRA Config:
  - r = 8
  - lora_alpha = 16
  - lora_dropout = 0.1
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training Hyperparameters:
  - effective_batch_size = 6
  - gradient_accumulation_steps = 3
  - num_train_epochs = 8
  - learning_rate = 2e-4
  - lr_scheduler_type = cosine
  - warmup_ratio = 0.05
  - weight_decay = 0.01
  - max_grad_norm = 1.0
  - fp16 = True
  - max_seq_length = 64
  - save_total_limit = 2
- Logging & checkpoint every 15 steps
Results
- Global steps: 256
- Final training loss: ~1.81
- Train runtime: ~373s
- Train samples/sec: ~12
- Train steps/sec: ~0.69
Evaluation
Qualitative Examples
| Prompt | Completion |
|---|---|
| Who is referred to as the Fully-Enlightened One in the text? | The Buddha is referred to as the Fully-Enlightened One. |
| Why did the speaker become a recluse? | The speaker became a recluse in the name of the Blessed One, his master. |
| Where does the Fully-Enlightened One live according to the text? | The Fully-Enlightened One lives in a city to the north, in India. |
How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sweatSmile/Gemma-3-270m-Buddha-QA"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

inputs = tokenizer("Who is referred to as the Fully-Enlightened One in the text?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
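For constrained hardware, the model can also be loaded in 4-bit, matching the quantization used during training. This is a sketch, assuming bitsandbytes is installed and a CUDA GPU is available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "sweatSmile/Gemma-3-270m-Buddha-QA"

# Same NF4 double-quantization settings as used for fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, quantization_config=bnb_config, device_map="auto")

inputs = tokenizer("Why did the speaker become a recluse?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```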
Environmental Impact
- Hardware: Single GPU (Colab T4)
- Precision: 4-bit + mixed precision
- Training duration: ~20 minutes
- Carbon footprint: negligible compared to large-scale LLMs.
Citation
If you use this model, please cite:
```bibtex
@misc{gemma-buddha-qa-2025,
  title        = {Gemma-3-270m Buddha-QA (LoRA 4-bit)},
  author       = {sweatSmile},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/sweatSmile/Gemma-3-270m-Buddha-QA}}
}
```
Contact
- Author: sweatSmile
- Hugging Face: https://huggingface.co/sweatSmile