---
library_name: transformers
license: apache-2.0
datasets:
- kurakurai/luth-sft
language:
- fr
- en
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
---

![Kurakura AI Logo](media/logo_kurakura.png)

---

# Luth-0.6B-Instruct

**Luth-0.6B-Instruct** is a French fine-tuned version of [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B), trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. Fine-tuning substantially improves the model's French instruction following, math, and general knowledge, while its English capabilities remain stable and even improve in some areas.

Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), along with the accompanying [blog post](https://huggingface.co/blog/MaxLSB/luth).
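
## Usage

A minimal inference sketch using the `transformers` chat-template API. The repository id is taken from the citation below; the prompt and generation settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kurakurai/Luth-0.6B"  # repo id from the citation below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Build a chat prompt; the model is tuned for French and English instructions.
messages = [{"role": "user", "content": "Explique brièvement la photosynthèse."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Greedy decoding mirrors the temperature=0 setting used for evaluation below.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```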

## Model Details

Luth was trained with full fine-tuning on the Luth-SFT dataset using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The resulting model was then merged with the base Qwen3-0.6B model, which retained the base model's English capabilities while improving performance on nearly all selected benchmarks in both French and English.
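
The card does not state which merge method was used. As a rough sketch, a simple linear interpolation between the fine-tuned and base weights (a common baseline merge, here with an assumed 50/50 ratio and a hypothetical checkpoint path) would look like this:

```python
import torch
from transformers import AutoModelForCausalLM

# Base model and the fine-tuned checkpoint (path is hypothetical).
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
tuned = AutoModelForCausalLM.from_pretrained("./outputs/luth-0.6b-sft")

alpha = 0.5  # assumed interpolation weight; the actual ratio is not documented
with torch.no_grad():
    base_state = base.state_dict()
    merged_state = {
        name: alpha * param + (1.0 - alpha) * base_state[name]
        for name, param in tuned.state_dict().items()
    }
tuned.load_state_dict(merged_state)
tuned.save_pretrained("./luth-0.6b-merged")
```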

## Benchmark Results

We used [LightEval](https://github.com/huggingface/lighteval) for evaluation, with custom tasks for the French benchmarks. All models were evaluated at `temperature=0` (greedy decoding); the best score per benchmark is underlined.
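
For reference, an invocation in the style of the LightEval CLI might look like the sketch below. The exact flags vary across lighteval releases, and the task name and custom-tasks path are placeholders for the authors' custom French tasks:

```bash
lighteval accelerate \
  --model_args "pretrained=kurakurai/Luth-0.6B" \
  --tasks "community|ifeval-fr|0|0" \
  --custom_tasks ./tasks/french_evals.py \
  --output_dir ./evals
```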

### Evaluation Visualizations

**French Evaluation:**

![French Evaluation](media/french_evaluation.png)

**English Evaluation:**

![English Evaluation](media/english_evaluation.png)

### French Benchmark Scores

| Benchmark       | Qwen3-0.6B   | Qwen2.5-0.5B-Instruct | Luth-0.6B-Instruct |
|-----------------|--------------|-----------------------|--------------------|
| ifeval-fr       | 44.45        | 22.18                 | <u>48.24</u>       |
| gpqa-diamond-fr | 28.93        | 23.86                 | <u>33.50</u>       |
| mmlu-fr         | 27.16        | 35.04                 | <u>40.23</u>       |
| math-500-fr     | 29.20        | 10.00                 | <u>43.00</u>       |
| arc-chall-fr    | 31.31        | 28.23                 | <u>33.88</u>       |
| hellaswag-fr    | 25.11        | <u>51.45</u>          | 45.70              |

### English Benchmark Scores

| Benchmark       | Qwen3-0.6B   | Qwen2.5-0.5B-Instruct | Luth-0.6B-Instruct |
|-----------------|--------------|-----------------------|--------------------|
| ifeval-en       | <u>57.86</u> | 29.21                 | 53.97              |
| gpqa-diamond-en | <u>29.80</u> | 26.77                 | 28.28              |
| mmlu-en         | 36.85        | 43.80                 | <u>48.10</u>       |
| math-500-en     | 45.00        | 31.80                 | <u>47.80</u>       |
| arc-chall-en    | 33.62        | 32.17                 | <u>35.92</u>       |
| hellaswag-en    | 42.91        | <u>49.56</u>          | 46.96              |

## Citation

```bibtex
@misc{luth2025kurakurai,
  title        = {Luth-0.6B-Instruct},
  author       = {Kurakura AI Team},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/kurakurai/Luth-0.6B}},
  note         = {Qwen3-0.6B fine-tuned on French datasets}
}
```