Fine-Tuned Gemma-2B with QLoRA on English Quotes (Author & Tags Prediction)

This model is a fine-tuned version of google/gemma-2-2b, trained with QLoRA and PEFT (LoraConfig) on a conversational version of the Abirate/english_quotes dataset.

The model predicts the author and tags of a quote from ChatML-style prompts, making it suitable for lightweight conversational applications or metadata generation.


✨ Model Summary

  • Base model: google/gemma-2-2b
  • Parameter-efficient fine-tuning: LoRA (r=64, alpha=16, dropout=0.1)
  • Quantization: 4-bit QLoRA (via BitsAndBytes); a configuration sketch follows this list
  • Training Data: 2,000 English quotes with author + tags
  • Prompt format: ChatML (multi-turn)
  • Language: English
  • Model type: Decoder-only causal LM
  • License: Gemma Terms of Use
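
The adapter and quantization settings listed above roughly correspond to the PEFT and BitsAndBytes configuration below. This is a sketch reconstructed from the summary, not the exact training script; in particular, the quantization flags and target_modules shown here are assumptions.

import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit QLoRA quantization; the exact bnb flags used in training are an assumption
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter settings as listed in the summary above
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    # target_modules is an assumption (typical attention projections for Gemma)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)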

🧠 How It Works

Each training example was transformed into the following ChatML format:


<start_of_turn>user
"Be yourself; everyone else is already taken."
<end_of_turn>
<start_of_turn>model
Author: Oscar Wilde
Tags: inspirational, self, identity
<end_of_turn>

The model learns to generate structured metadata in a natural language response.
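
For illustration, the conversion from a raw dataset row to this prompt could be written as the function below. The actual preprocessing script is not included in this card, so treat this as a hedged sketch; only the field names ("quote", "author", "tags") are taken from the dataset.

def format_example(example):
    """Turn one Abirate/english_quotes row into a ChatML-style training string.
    A sketch of the presumed preprocessing, not the exact script used."""
    tags = ", ".join(example["tags"]) if example["tags"] else "none"
    return (
        "<start_of_turn>user\n"
        f"{example['quote']}\n"
        "<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"Author: {example['author']}\n"
        f"Tags: {tags}\n"
        "<end_of_turn>"
    )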


📦 How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "SCCSMARTCODE/finetuned-gemma2b-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

# Padding settings (only needed when batching multiple prompts)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

prompt = (
    "<start_of_turn>user\n"
    "“Be yourself; everyone else is already taken.”\n"
    "<end_of_turn>\n"
    "<start_of_turn>model\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
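
Since the reply mirrors the structured lines seen during training, the author and tags can be recovered with simple string handling. The helper below is illustrative only and not part of the model:

def parse_metadata(generated_text):
    """Extract the 'Author:' and 'Tags:' lines from a generated reply (illustrative helper)."""
    author, tags = None, []
    for line in generated_text.splitlines():
        line = line.strip()
        if line.startswith("Author:"):
            author = line[len("Author:"):].strip()
        elif line.startswith("Tags:"):
            tags = [t.strip() for t in line[len("Tags:"):].split(",") if t.strip()]
    return author, tags

author, tags = parse_metadata(tokenizer.decode(outputs[0], skip_special_tokens=True))
print(author, tags)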

🗂️ Dataset

  • Name: Abirate/english_quotes
  • Fields used: "quote", "author", "tags"
  • Size: 2,000 examples used for fine-tuning (a loading sketch follows below)
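
Loading the dataset and taking the 2,000-example subset might look like the snippet below; whether the original run shuffled before selecting rows is an assumption.

from datasets import load_dataset

# Load the quotes dataset and keep 2,000 rows for fine-tuning.
# Shuffling before selection is an assumption, not documented above.
dataset = load_dataset("Abirate/english_quotes", split="train")
dataset = dataset.shuffle(seed=42).select(range(2000))
print(dataset[0]["quote"], dataset[0]["author"], dataset[0]["tags"])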

🏋️ Training Details

  • Frameworks: Transformers, TRL, PEFT, BitsAndBytes
  • Compute: Colab T4 / A100 (mixed precision)
  • Epochs: 1
  • Batch size: 1 (with gradient accumulation = 16)
  • Optimizer: paged_adamw_8bit
  • LR scheduler: Cosine
  • Learning rate: 2e-4
  • Mixed precision: fp16
  • Quantization: 4-bit via QLoRA (bnb_4bit); a training setup sketch follows this list
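
A training-arguments sketch consistent with these hyperparameters is shown below. It uses transformers.TrainingArguments for concreteness; the original run used TRL's SFTTrainer, whose exact invocation may differ by version, and output_dir/logging_steps are placeholders.

from transformers import TrainingArguments

# Hyperparameters mirrored from the list above; these args would be passed to
# TRL's SFTTrainer together with the LoRA/quantization configs sketched earlier.
training_args = TrainingArguments(
    output_dir="gemma2b-quotes-sft",   # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    fp16=True,
    logging_steps=10,                  # placeholder
)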

📈 Intended Use

✅ Direct Use

  • Conversational agents generating metadata for quotes
  • Training demos for QLoRA + LoRA on limited compute
  • Style-aligned structured generation in lightweight applications

🚫 Out-of-Scope Use

  • Any high-stakes decision-making
  • Factual attribution in academic or legal domains
  • Non-English quote metadata extraction

⚠️ Bias, Risks & Limitations

  • Cultural bias: Author predictions are based on dataset exposure and may reflect selection bias.
  • Dataset limitations: Author/tag mappings are not always consistent or exhaustive.
  • Small scale: The model was trained on a small subset (2,000 samples), which limits generalization.

🧪 Evaluation

Informal evaluation shows the model correctly extracts authors/tags for known quotes, but performance may degrade for rare or noisy examples.
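
The evaluation above was informal. For readers who want to run a similar spot-check, one possible script (hypothetical, not the procedure actually used) reuses the parse_metadata helper from the usage section and computes exact-match on the author line:

from datasets import load_dataset

# Hypothetical spot-check over a handful of quotes (these rows may overlap the training subset).
sample = load_dataset("Abirate/english_quotes", split="train").select(range(10))

correct = 0
for row in sample:
    prompt = f"<start_of_turn>user\n{row['quote']}\n<end_of_turn>\n<start_of_turn>model\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    predicted_author, _ = parse_metadata(completion)
    correct += int(predicted_author == row["author"])

print(f"Author exact-match: {correct}/{len(sample)}")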


🧾 Citation

BibTeX:

@misc{gemma-quotes-sft,
  author = {Emmanuel Ayobami Adewumi},
  title = {Fine-Tuned Gemma-2B on English Quotes for Author and Tag Prediction},
  year = 2025,
  howpublished = {\url{https://huggingface.co/SCCSMARTCODE/finetuned-gemma2b-lora}},
  note = {Fine-tuned using QLoRA + PEFT}
}

🙋 Contact

Created by Emmanuel Ayobami Adewumi. For questions or feedback, reach out on Hugging Face or GitHub.


🏁 Future Work

  • Expand dataset to 10k+ quotes for better generalization
  • Add author style generation (not just metadata)
  • Serve on Gradio with editable quote inputs (a minimal sketch follows)
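
A minimal wiring for the planned Gradio demo could look like the sketch below (illustrative only; it reuses the tokenizer and model loaded in the usage section and is not shipped with this card):

import gradio as gr

def annotate(quote):
    """Generate Author/Tags metadata for a user-edited quote (illustrative demo function)."""
    prompt = f"<start_of_turn>user\n{quote}\n<end_of_turn>\n<start_of_turn>model\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

demo = gr.Interface(fn=annotate, inputs=gr.Textbox(label="Quote"), outputs=gr.Textbox(label="Metadata"))
demo.launch()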
