Fine-tuning Mistral-7B for Pashto with AutoTrain
This guide explains how to fine-tune the Mistral-7B language model on Pashto datasets using Hugging Face AutoTrain.
Overview
AutoTrain is a Hugging Face service that allows you to easily fine-tune language models with minimal coding. This project uses AutoTrain to fine-tune the Mistral-7B model on Pashto datasets to create a specialized model for the ZamAI project.
Prerequisites
- A Hugging Face account with a Pro subscription (for access to AutoTrain)
- Hugging Face API token with write access
- Your dataset uploaded to the Hugging Face Hub in the correct format
Dataset Format
Your dataset should be uploaded to the Hugging Face Hub with one of the following column layouts:
- `text`: For single-sequence training (the complete context for training)

OR

- `instruction`: For the prompt/instruction
- `response`: For the expected model response
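Before submitting a job, you can quickly verify that your dataset exposes the expected columns. A minimal sketch, assuming the datasets package is installed and using the dataset name from the example below:

from datasets import load_dataset

# Load the training split and confirm it contains either a "text" column
# or an "instruction"/"response" pair before launching AutoTrain.
dataset = load_dataset("tasal9/ZamAI_Pashto_Training", split="train")
print(dataset.column_names)
print(dataset[0])  # inspect one row to verify the formatting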
Using the Script
The script `autotrain_finetune.py` handles the configuration and submission of your fine-tuning job to AutoTrain.
Installation
Make sure you have the required packages installed:
pip install autotrain-advanced
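If you prefer not to pass your token on the command line every time, you can authenticate the machine once with the huggingface_hub login helper (this is optional; the --hf_token argument shown below also works):

# Optional: store your Hugging Face token locally so dataset access and
# model pushes work without passing --hf_token explicitly.
from huggingface_hub import login

login(token="YOUR_HF_TOKEN")  # or run `huggingface-cli login` in a terminal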
Running the Script
python autotrain_finetune.py \
--project_name "ZamAI-Mistral-7B-Pashto" \
--model "mistralai/Mistral-7B-v0.1" \
--dataset "tasal9/ZamAI_Pashto_Training" \
--text_column "text" \
--lr 2e-4 \
--epochs 3 \
--lora_r 16 \
--lora_alpha 32 \
--lora_dropout 0.05 \
--hf_token "YOUR_HF_TOKEN"
Arguments
- `--project_name`: Name for your AutoTrain project
- `--model`: Base model to fine-tune (e.g., "mistralai/Mistral-7B-v0.1")
- `--dataset`: Dataset on the HF Hub to use for training
- `--text_column`: Column name containing the formatted text
- `--lr`: Learning rate for training
- `--epochs`: Number of training epochs
- `--lora_r`: LoRA attention dimension
- `--lora_alpha`: LoRA alpha parameter
- `--lora_dropout`: LoRA attention dropout
- `--hf_token`: Hugging Face API token (required)
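The autotrain_finetune.py script itself is not reproduced here; conceptually it forwards these arguments to the autotrain llm command-line interface installed by autotrain-advanced. The sketch below is only an illustration of that mapping, and the flag names are assumptions that have varied between autotrain-advanced releases, so check autotrain llm --help for your installed version:

# Hypothetical sketch of how autotrain_finetune.py could forward its
# arguments to the `autotrain llm` CLI. Flag names are assumptions and may
# differ between autotrain-advanced versions (see `autotrain llm --help`).
import argparse
import subprocess

parser = argparse.ArgumentParser()
parser.add_argument("--project_name", required=True)
parser.add_argument("--model", required=True)
parser.add_argument("--dataset", required=True)
parser.add_argument("--text_column", default="text")
parser.add_argument("--lr", type=float, default=2e-4)
parser.add_argument("--epochs", type=int, default=3)
parser.add_argument("--lora_r", type=int, default=16)
parser.add_argument("--lora_alpha", type=int, default=32)
parser.add_argument("--lora_dropout", type=float, default=0.05)
parser.add_argument("--hf_token", required=True)
args = parser.parse_args()

subprocess.run(
    [
        "autotrain", "llm", "--train",
        "--project-name", args.project_name,
        "--model", args.model,
        "--data-path", args.dataset,
        "--text-column", args.text_column,
        "--lr", str(args.lr),
        "--epochs", str(args.epochs),
        "--peft",
        "--lora-r", str(args.lora_r),
        "--lora-alpha", str(args.lora_alpha),
        "--lora-dropout", str(args.lora_dropout),
        "--token", args.hf_token,
        "--push-to-hub",
    ],
    check=True,
)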
Monitoring Your Training Job
After submitting your job:
- Go to the AutoTrain dashboard
- Find your project by name
- Monitor training progress, logs, and metrics
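Once the job finishes, the trained model should appear as a repository under your account. As a quick programmatic check, you can ask the Hub whether the output repository exists; a minimal sketch, assuming the huggingface_hub package is installed and the project name from the example above:

# Check whether the repository created by AutoTrain has appeared on the Hub.
from huggingface_hub import HfApi

api = HfApi(token="YOUR_HF_TOKEN")
repo_id = "tasal9/ZamAI-Mistral-7B-Pashto"  # username + project name

if api.repo_exists(repo_id):
    print("Model repo found, files:", api.list_repo_files(repo_id))
else:
    print("Model repo not found yet; the job may still be running.")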
Using Your Fine-tuned Model
Once training is complete, your model will be automatically pushed to the Hugging Face Hub at:
https://huggingface.co/tasal9/[PROJECT_NAME]
You can then use the model like any other Hugging Face model:
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

model_name = "tasal9/ZamAI-Mistral-7B-Pashto"  # Replace with your model name

tokenizer = AutoTokenizer.from_pretrained(model_name)

# AutoTrain saves the fine-tuned weights as a LoRA adapter; peft attaches the
# adapter to the base model, and merge_and_unload() folds the adapter weights
# in so the model behaves like a plain causal LM.
model = AutoPeftModelForCausalLM.from_pretrained(model_name)
model = model.merge_and_unload()
# Generate text
prompt = "ستاسو نوم څه دی؟" # What is your name?
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
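In full precision, Mistral-7B needs more memory than many single GPUs provide, so for inference you may prefer to load the adapter on top of a 4-bit quantized base model instead of merging it. A minimal sketch, assuming a CUDA GPU and the bitsandbytes package are installed; the configuration values are illustrative:

# Load the LoRA adapter on a 4-bit quantized base model to reduce memory use.
import torch
from transformers import AutoTokenizer, BitsAndBytesConfig
from peft import AutoPeftModelForCausalLM

model_name = "tasal9/ZamAI-Mistral-7B-Pashto"  # Replace with your model name
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoPeftModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("ستاسو نوم څه دی؟", return_tensors="pt").to(model.device)
outputs = model.generate(inputs["input_ids"], max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))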