Update README.md
README.md CHANGED
@@ -1,12 +1,111 @@
 ---
-license: mit
 language:
-- ar
 pipeline_tag: text-generation
---
language:
- ar
license: apache-2.0
tags:
- qwen
- llama-factory
- lora
- arabic
- question-answering
- instruction-tuning
- kaggle
- transformers
- fine-tuned
model_name: QWEN_Arabic_Q&A
base_model: Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
library_name: transformers
datasets:
- custom
---

# 🧠 Qwen2.5-1.5B - LoRA Fine-Tuned on Arabic Q&A

This model is a LoRA fine-tuned version of **[Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B)** for Arabic question-answering tasks. It was trained with the **LLaMA-Factory** framework on a custom, curated dataset of Arabic Q&A pairs.

## Training Configuration

- **Base Model**: `Qwen/Qwen2.5-1.5B`
- **Method**: Supervised Fine-Tuning (SFT) with [LoRA](https://arxiv.org/abs/2106.09685)
- **Framework**: [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- **Batch Size**: 1 (gradient accumulation = 16)
- **Epochs**: 3
- **Cutoff Length**: 2048 tokens
- **Learning Rate**: 1e-4
- **Scheduler**: cosine with warmup ratio 0.1
- **Precision**: bf16
- **LoRA Rank**: 64
- **LoRA Target**: all layers
- **Eval Strategy**: every 200 steps
- **Eval Set Size**: 3020 examples
- **WandB Tracking**: enabled ([run link](https://wandb.ai/youssefhassan437972-kafr-el-sheikh-university/llamafactory/runs/rdrftts8))
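
The run itself was launched through LLaMA-Factory, but for readers more familiar with PEFT, a minimal sketch of a roughly equivalent adapter configuration follows; `lora_alpha` and `lora_dropout` are assumptions, since the card does not report them.

```python
# Illustrative only: an approximate PEFT equivalent of the LoRA settings listed above.
# The actual training was run through LLaMA-Factory, not this snippet.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                         # LoRA rank used for this run
    target_modules="all-linear",  # maps the card's "all layers" target (recent peft releases)
    lora_alpha=128,               # assumption: not reported in the card
    lora_dropout=0.05,            # assumption: not reported in the card
    task_type="CAUSAL_LM",
)
```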

## Evaluation (Epoch ~1.77)

- **Eval Loss**: 0.4321
- **Samples/sec**: 1.389
- **Steps/sec**: 0.695
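
If the reported eval loss is the mean token-level cross-entropy in nats, it corresponds to a perplexity of roughly exp(0.4321) ≈ 1.54 on the held-out set:

```python
# Perplexity implied by the eval loss, assuming it is mean cross-entropy in nats.
import math
print(math.exp(0.4321))  # ≈ 1.54
```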

## Usage

You can use the model via `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("Youssef/QWEN_Arabic_Q&A")
tokenizer = AutoTokenizer.from_pretrained("Youssef/QWEN_Arabic_Q&A")

# "Who is the founder of algebra?"
prompt = "من هو مؤسس علم الجبر؟"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
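
If this repository ships only the LoRA adapter weights rather than a merged checkpoint, the adapter can instead be attached to the base model with `peft`; a minimal sketch, assuming the same repo id:

```python
# Sketch: attach the LoRA adapter to the base model via PEFT
# (only needed if the repo contains adapter weights instead of merged weights).
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("Youssef/QWEN_Arabic_Q&A")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")  # base tokenizer as a fallback

# Optionally fold the adapter into the base weights for faster inference
model = model.merge_and_unload()
```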

Example interaction (the question asks for the pillars of Islam and the hadith that lists them; the answer quotes the hadith of Abdullah ibn Umar reported by al-Bukhari and Muslim):

```text
<|user|>
ما هي أركان الإسلام مع ذكر الحديث الذي يذكرها؟
<|assistant|>
أركان الإسلام خمسة، كما جاء في الحديث الصحيح:

عن عبد الله بن عمر رضي الله عنهما قال: قال رسول الله ﷺ: "بني الإسلام على خمس: شهادة أن لا إله إلا الله، وأن محمدًا رسول الله، وإقام الصلاة، وإيتاء الزكاة، وصوم رمضان، وحج البيت لمن استطاع إليه سبيلًا" (رواه البخاري ومسلم).
```
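
To reproduce the prompt layout shown above, the tags can be assembled by hand. The sketch below reuses `model` and `tokenizer` from the Usage snippet and assumes training used exactly this `<|user|>` / `<|assistant|>` layout; if the tokenizer ships a chat template, `tokenizer.apply_chat_template` is the safer route.

```python
# Sketch: build a prompt in the <|user|> / <|assistant|> format shown above.
# Assumes `model` and `tokenizer` are already loaded as in the Usage section.
question = "ما هي أركان الإسلام مع ذكر الحديث الذي يذكرها؟"  # "What are the pillars of Islam, with the hadith that lists them?"
prompt = f"<|user|>\n{question}\n<|assistant|>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```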

## Training Loss Over Epochs

| Epoch | Learning Rate | Loss   |
|-------|---------------|--------|
| 0.16  | 5.39e-05      | 0.6304 |
| 0.18  | 5.88e-05      | 0.6179 |
| 0.19  | 6.37e-05      | 0.6042 |
| 0.21  | 6.86e-05      | 0.6138 |
| 0.22  | 7.35e-05      | 0.5940 |
| 0.24  | 7.84e-05      | 0.5838 |
| 0.25  | 8.33e-05      | 0.5842 |
| 0.26  | 8.82e-05      | 0.5786 |
| 0.28  | 9.31e-05      | 0.5713 |
| 0.65  | 9.60e-05      | 0.6122 |
| 0.71  | 9.45e-05      | 0.5809 |
| 0.77  | 9.29e-05      | 0.5446 |
| 0.82  | 9.10e-05      | 0.5339 |
| 0.88  | 8.90e-05      | 0.5296 |
| 0.94  | 8.67e-05      | 0.5176 |
| 1.00  | 8.43e-05      | 0.5104 |
| 1.06  | 8.17e-05      | 0.4685 |
| 1.12  | 7.90e-05      | 0.4730 |
| 1.18  | 7.62e-05      | 0.4679 |
| 1.24  | 7.32e-05      | 0.4541 |
| 1.30  | 7.01e-05      | 0.4576 |
| 1.35  | 6.69e-05      | 0.4472 |
| 1.41  | 6.36e-05      | 0.4427 |
| 1.47  | 6.03e-05      | 0.4395 |
| 1.53  | 5.69e-05      | 0.4305 |
| 1.59  | 5.35e-05      | 0.4280 |
| 1.65  | 5.01e-05      | 0.4251 |
| 1.71  | 4.67e-05      | 0.4188 |
| 1.77  | 4.33e-05      | 0.4177 |
| 1.83  | 3.99e-05      | 0.4128 |

**Evaluation Losses:**

- Epoch 1.18 → `0.4845`
- Epoch 1.77 → `0.4321`