youssefedweqd committed 7b96ef1 (verified, parent: 503fafc)

Update README.md

Files changed (1): README.md (+108, -9)

This commit replaces the previous front matter, which declared `license: mit`, `base_model: Qwen/Qwen2.5-1.5B`, `metrics: bertscore`, `new_version: Qwen/Qwen2.5-1.5B`, and `library_name: adapter-transformers`, with the expanded metadata and model card below.

---
language:
- ar
license: apache-2.0
tags:
- qwen
- llama-factory
- lora
- arabic
- question-answering
- instruction-tuning
- kaggle
- transformers
- fine-tuned
model_name: QWEN_Arabic_Q&A
base_model: Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
library_name: transformers
datasets:
- custom
---

# 🧠 Qwen2.5-1.5B - LoRA Fine-Tuned on Arabic Q&A 🕌

This model is a LoRA fine-tuned version of **[Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B)** designed for Arabic question-answering tasks. It was trained with the **LLaMA-Factory** framework on a custom, curated dataset of Arabic Q&A pairs.

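The card does not publish the dataset schema. The sketch below shows one plausible alpaca-style record, the instruction/input/output format that LLaMA-Factory's SFT stage commonly consumes; the file name, field names, and the sample Q&A pair are illustrative assumptions, not the actual training data.

```python
import json

# One hypothetical training record in alpaca-style format (instruction /
# input / output), which LLaMA-Factory can load for supervised fine-tuning.
# The question asks "Who is the founder of algebra?" and the answer names
# al-Khwarizmi; both are illustrative, not taken from the real dataset.
record = {
    "instruction": "من هو مؤسس علم الجبر؟",
    "input": "",
    "output": "مؤسس علم الجبر هو محمد بن موسى الخوارزمي.",
}

# Write a tiny dataset file as a JSON list of such records.
with open("arabic_qa_sample.json", "w", encoding="utf-8") as f:
    json.dump([record], f, ensure_ascii=False, indent=2)
```
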
## 📚 Training Configuration

- **Base Model**: `Qwen/Qwen2.5-1.5B`
- **Method**: Supervised Fine-Tuning (SFT) with [LoRA](https://arxiv.org/abs/2106.09685)
- **Framework**: [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- **Batch Size**: 1 per device (gradient accumulation = 16)
- **Epochs**: 3
- **Cutoff Length**: 2048 tokens
- **Learning Rate**: 1e-4
- **Scheduler**: cosine with warmup ratio 0.1
- **Precision**: bf16
- **LoRA Rank**: 64
- **LoRA Target**: all layers
- **Eval Strategy**: every 200 steps
- **Eval Set Size**: 3020 examples
- **WandB Tracking**: enabled ([run link](https://wandb.ai/youssefhassan437972-kafr-el-sheikh-university/llamafactory/runs/rdrftts8))

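These hyperparameters map onto a LLaMA-Factory LoRA SFT configuration. The sketch below builds such a config in Python and writes it to YAML; the key names follow LLaMA-Factory's published LoRA SFT examples, but the `dataset`, `template`, `output_dir`, and file-name values are assumptions (the original run's config file was not published), so verify everything against your installed LLaMA-Factory version before training.

```python
import yaml  # pip install pyyaml

# Values taken from the list above; dataset/template/output_dir are assumed
# placeholders, not the settings used for the original run.
config = {
    "model_name_or_path": "Qwen/Qwen2.5-1.5B",
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "lora",
    "lora_rank": 64,
    "lora_target": "all",
    "dataset": "arabic_qa",   # assumed dataset name registered in dataset_info.json
    "template": "qwen",       # assumed chat template
    "cutoff_len": 2048,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 16,
    "num_train_epochs": 3.0,
    "learning_rate": 1.0e-4,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "bf16": True,
    "val_size": 3020,         # held-out evaluation examples
    "eval_strategy": "steps",
    "eval_steps": 200,
    "report_to": "wandb",
    "output_dir": "saves/qwen2.5-1.5b-arabic-qa-lora",  # assumed
}

with open("qwen_arabic_lora_sft.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# The resulting file would then be passed to the trainer, e.g.:
#   llamafactory-cli train qwen_arabic_lora_sft.yaml
```
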
## 📈 Evaluation (Epoch ~1.77)

- **Eval Loss**: 0.4321
- **Samples/sec**: 1.389
- **Steps/sec**: 0.695

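Since the eval loss is a token-level cross-entropy (in nats), it maps directly to perplexity via `exp(loss)`. The quick check below reproduces that arithmetic for the two reported evaluation points.

```python
import math

# Reported evaluation losses (token-level cross-entropy, natural log base)
eval_losses = {"epoch 1.18": 0.4845, "epoch 1.77": 0.4321}

for point, loss in eval_losses.items():
    # Perplexity is the exponential of the mean cross-entropy loss
    print(f"{point}: loss={loss:.4f} -> perplexity={math.exp(loss):.3f}")
# epoch 1.18 -> ~1.623, epoch 1.77 -> ~1.540
```
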
## 🚀 Usage

You can use the model via `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("Youssef/QWEN_Arabic_Q&A")
tokenizer = AutoTokenizer.from_pretrained("Youssef/QWEN_Arabic_Q&A")

# Arabic prompt: "Who is the founder of algebra?"
prompt = "من هو مؤسس علم الجبر؟"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

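The card does not state whether this repository hosts a fully merged checkpoint or only the LoRA adapter weights. If it is the latter, the adapter can be attached to the base model with `peft`; treat the snippet below as a sketch under that assumption (the repository id is reused from the example above).

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the original base model and its tokenizer
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

# Attach the LoRA adapter (assumes the repo ships adapter_config.json and
# adapter weights rather than a merged model)
model = PeftModel.from_pretrained(base, "Youssef/QWEN_Arabic_Q&A")

# Optionally fold the adapter into the base weights for plain inference
model = model.merge_and_unload()
```
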
**Example interaction:**

`<|user|>`
ما هي أركان الإسلام مع ذكر الحديث الذي يذكرها؟
*(What are the pillars of Islam, and which hadith lists them?)*

`<|assistant|>`
أركان الإسلام خمسة، كما جاء في الحديث الصحيح:

عن عبد الله بن عمر رضي الله عنهما قال: قال رسول الله ﷺ: "بني الإسلام على خمس: شهادة أن لا إله إلا الله، وأن محمدًا رسول الله، وإقام الصلاة، وإيتاء الزكاة، وصوم رمضان، وحج البيت لمن استطاع إليه سبيلًا" (رواه البخاري ومسلم).

*(Translation: The pillars of Islam are five, as stated in the authentic hadith. Abdullah ibn Umar narrated that the Messenger of Allah ﷺ said: "Islam is built upon five: the testimony that there is no god but Allah and that Muhammad is the Messenger of Allah, establishing prayer, giving zakat, fasting Ramadan, and pilgrimage to the House for whoever is able to make the journey." Reported by al-Bukhari and Muslim.)*

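If the tokenizer ships a chat template (Qwen2.5 tokenizers generally do, though the card does not confirm it for this fine-tuned repo), the same kind of interaction can be reproduced with `apply_chat_template`; this is a sketch under that assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Youssef/QWEN_Arabic_Q&A")
tokenizer = AutoTokenizer.from_pretrained("Youssef/QWEN_Arabic_Q&A")

# Build a chat-formatted prompt from a single user turn
messages = [
    {"role": "user", "content": "ما هي أركان الإسلام مع ذكر الحديث الذي يذكرها؟"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
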
## 📉 Training Loss Over Epochs

| Epoch | Learning Rate | Loss   |
|-------|---------------|--------|
| 0.16  | 5.39e-05      | 0.6304 |
| 0.18  | 5.88e-05      | 0.6179 |
| 0.19  | 6.37e-05      | 0.6042 |
| 0.21  | 6.86e-05      | 0.6138 |
| 0.22  | 7.35e-05      | 0.5940 |
| 0.24  | 7.84e-05      | 0.5838 |
| 0.25  | 8.33e-05      | 0.5842 |
| 0.26  | 8.82e-05      | 0.5786 |
| 0.28  | 9.31e-05      | 0.5713 |
| 0.65  | 9.60e-05      | 0.6122 |
| 0.71  | 9.45e-05      | 0.5809 |
| 0.77  | 9.29e-05      | 0.5446 |
| 0.82  | 9.10e-05      | 0.5339 |
| 0.88  | 8.90e-05      | 0.5296 |
| 0.94  | 8.67e-05      | 0.5176 |
| 1.00  | 8.43e-05      | 0.5104 |
| 1.06  | 8.17e-05      | 0.4685 |
| 1.12  | 7.90e-05      | 0.4730 |
| 1.18  | 7.62e-05      | 0.4679 |
| 1.24  | 7.32e-05      | 0.4541 |
| 1.30  | 7.01e-05      | 0.4576 |
| 1.35  | 6.69e-05      | 0.4472 |
| 1.41  | 6.36e-05      | 0.4427 |
| 1.47  | 6.03e-05      | 0.4395 |
| 1.53  | 5.69e-05      | 0.4305 |
| 1.59  | 5.35e-05      | 0.4280 |
| 1.65  | 5.01e-05      | 0.4251 |
| 1.71  | 4.67e-05      | 0.4188 |
| 1.77  | 4.33e-05      | 0.4177 |
| 1.83  | 3.99e-05      | 0.4128 |

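The Learning Rate column follows the cosine schedule with 10% warmup listed in the training configuration (peak 1e-4 over 3 epochs). The sketch below reconstructs that schedule as a function of the epoch fraction and approximately reproduces the tabulated values; it assumes the schedule spans the full 3-epoch run, which the card implies but does not state explicitly.

```python
import math

PEAK_LR = 1e-4        # learning_rate from the training configuration
TOTAL_EPOCHS = 3.0    # num_train_epochs
WARMUP_RATIO = 0.1    # warmup_ratio, so warmup ends around epoch 0.3

def cosine_with_warmup(epoch: float) -> float:
    """Approximate learning rate at a given (fractional) epoch."""
    warmup_end = WARMUP_RATIO * TOTAL_EPOCHS
    if epoch < warmup_end:
        # Linear warmup from 0 up to the peak learning rate
        return PEAK_LR * epoch / warmup_end
    # Cosine decay from the peak toward 0 over the remaining epochs
    progress = (epoch - warmup_end) / (TOTAL_EPOCHS - warmup_end)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))

for epoch in (0.16, 0.65, 1.18, 1.77):
    print(f"epoch {epoch:.2f}: lr ~ {cosine_with_warmup(epoch):.2e}")
# Prints values close to the table: ~5.3e-05, ~9.6e-05, ~7.6e-05, ~4.3e-05
```
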
**Evaluation Losses:**

- 📍 Epoch 1.18 → `0.4845`
- 📍 Epoch 1.77 → `0.4321`