# Qwen2.5-1.5B - LoRA Fine-Tuned on Arabic Q&A
This model is a LoRA fine-tuned version of Qwen/Qwen2.5-1.5B designed for Arabic Question Answering tasks. It was trained using the LLaMA-Factory framework on a custom curated dataset of Arabic Q&A pairs.
## Training Configuration
- Base Model: Qwen/Qwen2.5-1.5B
- Method: Supervised Fine-Tuning (SFT) with LoRA
- Framework: LLaMA-Factory
- Batch Size: 1 (gradient accumulation = 16)
- Epochs: 3
- Cutoff Length: 2048 tokens
- Learning Rate: 1e-4
- Scheduler: Cosine with warmup ratio 0.1
- Precision: bf16
- LoRA Rank: 64
- LoRA Target: all layers (see the configuration sketch after this list)
- Eval Strategy: every 200 steps
- Eval Set Size: 3020 examples
- WandB Tracking: Enabled (Run Link)
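
For reference, here is a minimal sketch of what the LoRA settings above correspond to in PEFT terms. This is illustrative only: `lora_alpha`, `lora_dropout`, and the direct use of `peft` are assumptions, since the actual training was driven through LLaMA-Factory rather than hand-written PEFT code.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B", torch_dtype=torch.bfloat16  # bf16 precision, as in training
)

lora_cfg = LoraConfig(
    r=64,                         # LoRA rank from the configuration above
    lora_alpha=128,               # assumption: a common 2x-rank choice, not confirmed
    lora_dropout=0.05,            # assumption: not stated in the training config
    target_modules="all-linear",  # "all layers": adapt every linear projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # shows how few parameters LoRA actually trains
```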
## Evaluation (Epoch ~1.77)
- Eval Loss: 0.4321
- Samples/sec: 1.389
- Steps/sec: 0.695
## Usage

You can use the model with the `transformers` library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model = AutoModelForCausalLM.from_pretrained("Youssef/QWEN_Arabic_Q&A")
tokenizer = AutoTokenizer.from_pretrained("Youssef/QWEN_Arabic_Q&A")

prompt = "من هو مؤسس علم الجبر؟"  # "Who is the founder of algebra?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
**Example interaction:**

```
<|user|>
ما هي أركان الإسلام مع ذكر الحديث الذي يذكرها؟

<|assistant|>
أركان الإسلام خمسة، كما جاء في الحديث الصحيح:
عن عبد الله بن عمر رضي الله عنهما قال: قال رسول الله ﷺ: "بني الإسلام على خمس: شهادة أن لا إله إلا الله، وأن محمداً رسول الله، وإقام الصلاة، وإيتاء الزكاة، وصوم رمضان، وحج البيت لمن استطاع إليه سبيلاً" (رواه البخاري ومسلم).
```

Translation: **User:** "What are the pillars of Islam, and which hadith mentions them?" **Assistant:** "The pillars of Islam are five, as stated in the authentic hadith: Abdullah ibn Umar (may Allah be pleased with them both) reported that the Messenger of Allah ﷺ said: 'Islam is built upon five: the testimony that there is no god but Allah and that Muhammad is the Messenger of Allah, establishing prayer, giving zakat, fasting Ramadan, and pilgrimage to the House for whoever is able to make the journey.' (Narrated by al-Bukhari and Muslim)."
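
If the uploaded tokenizer includes a chat template matching the `<|user|>` / `<|assistant|>` format above, the same question can be asked through `apply_chat_template`. This is a sketch that reuses the `model` and `tokenizer` loaded in the usage snippet and assumes such a template is present:

```python
messages = [
    {"role": "user", "content": "ما هي أركان الإسلام مع ذكر الحديث الذي يذكرها؟"}
]

# Builds the prompt with the tokenizer's chat template (assumed to exist)
# and appends the assistant turn marker via add_generation_prompt=True.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```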
## Training Loss Over Epochs
| Epoch | Learning Rate | Loss |
|-------|------------------------|--------|
| 0.16 | 5.39e-05 | 0.6304 |
| 0.18 | 5.88e-05 | 0.6179 |
| 0.19 | 6.37e-05 | 0.6042 |
| 0.21 | 6.86e-05 | 0.6138 |
| 0.22 | 7.35e-05 | 0.5940 |
| 0.24 | 7.84e-05 | 0.5838 |
| 0.25 | 8.33e-05 | 0.5842 |
| 0.26 | 8.82e-05 | 0.5786 |
| 0.28 | 9.31e-05 | 0.5713 |
| 0.65 | 9.60e-05 | 0.6122 |
| 0.71 | 9.45e-05 | 0.5809 |
| 0.77 | 9.29e-05 | 0.5446 |
| 0.82 | 9.10e-05 | 0.5339 |
| 0.88 | 8.90e-05 | 0.5296 |
| 0.94 | 8.67e-05 | 0.5176 |
| 1.00 | 8.43e-05 | 0.5104 |
| 1.06 | 8.17e-05 | 0.4685 |
| 1.12 | 7.90e-05 | 0.4730 |
| 1.18 | 7.62e-05 | 0.4679 |
| 1.24 | 7.32e-05 | 0.4541 |
| 1.30 | 7.01e-05 | 0.4576 |
| 1.35 | 6.69e-05 | 0.4472 |
| 1.41 | 6.36e-05 | 0.4427 |
| 1.47 | 6.03e-05 | 0.4395 |
| 1.53 | 5.69e-05 | 0.4305 |
| 1.59 | 5.35e-05 | 0.4280 |
| 1.65 | 5.01e-05 | 0.4251 |
| 1.71 | 4.67e-05 | 0.4188 |
| 1.77 | 4.33e-05 | 0.4177 |
| 1.83 | 3.99e-05 | 0.4128 |
**Evaluation Losses:**

- Epoch 1.18 → `0.4845`
- Epoch 1.77 → `0.4321`
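
The Learning Rate column above tracks the cosine-with-warmup schedule from the training configuration (peak `1e-4`, warmup ratio 0.1): it rises during roughly the first 10% of training and then decays. Below is a minimal sketch of how such a schedule is built with `transformers`; the optimizer and total step count are placeholders, not the actual training values.

```python
import torch
from transformers import get_cosine_schedule_with_warmup

total_steps = 1_000                    # placeholder: real optimizer-step count not stated
warmup_steps = int(0.1 * total_steps)  # warmup ratio 0.1 from the config

dummy = [torch.nn.Parameter(torch.zeros(1))]   # stand-in parameters for illustration
optimizer = torch.optim.AdamW(dummy, lr=1e-4)  # peak learning rate from the config
scheduler = get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps)

# Linear warmup to 1e-4, then cosine decay toward 0, matching the
# rise-then-fall pattern of the Learning Rate column above.
for step in range(total_steps):
    optimizer.step()
    scheduler.step()
    if step % 100 == 0:
        print(step, scheduler.get_last_lr()[0])
```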