---
library_name: transformers
license: apache-2.0
datasets:
- kurakurai/luth-sft
language:
- fr
- en
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
---

![Kurakura AI Logo](media/logo_kurakura.png)

---

# Luth-0.6B-Instruct

**Luth-0.6B-Instruct** is a French fine-tuned version of [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B), trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. Fine-tuning substantially improves the model's French instruction following, math, and general knowledge, while its English capabilities remain stable and even improve in some areas.

Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), along with the accompanying [blog post](https://huggingface.co/blog/MaxLSB/luth).
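
## Usage

A minimal inference sketch using the `transformers` chat-template API. The repository id is taken from the citation below; the prompt and generation settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kurakurai/Luth-0.6B"  # repo id from the citation below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Build a chat prompt; the model is tuned for French and English instructions.
messages = [{"role": "user", "content": "Explique brièvement la photosynthèse."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Greedy decoding mirrors the temperature=0 setting used for evaluation below.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```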

## Model Details

Luth was trained with full fine-tuning on the Luth-SFT dataset using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The resulting model was then merged with the base Qwen3-0.6B model, which retained the base model's English capabilities while improving performance on nearly all selected benchmarks in both French and English.
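
The card does not state which merge method was used. As a rough sketch, a simple linear interpolation between the fine-tuned and base weights (a common baseline merge, here with an assumed 50/50 ratio and a hypothetical checkpoint path) would look like this:

```python
import torch
from transformers import AutoModelForCausalLM

# Base model and the fine-tuned checkpoint (path is hypothetical).
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
tuned = AutoModelForCausalLM.from_pretrained("./outputs/luth-0.6b-sft")

alpha = 0.5  # assumed interpolation weight; the actual ratio is not documented
with torch.no_grad():
    base_state = base.state_dict()
    merged_state = {
        name: alpha * param + (1.0 - alpha) * base_state[name]
        for name, param in tuned.state_dict().items()
    }
tuned.load_state_dict(merged_state)
tuned.save_pretrained("./luth-0.6b-merged")
```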

## Benchmark Results

We used [LightEval](https://github.com/huggingface/lighteval) for evaluation, with custom tasks for the French benchmarks. All models were evaluated at `temperature=0` (greedy decoding); the best score per benchmark is underlined.
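
For reference, an invocation in the style of the LightEval CLI might look like the sketch below. The exact flags vary across lighteval releases, and the task name and custom-tasks path are placeholders for the authors' custom French tasks:

```bash
lighteval accelerate \
  --model_args "pretrained=kurakurai/Luth-0.6B" \
  --tasks "community|ifeval-fr|0|0" \
  --custom_tasks ./tasks/french_evals.py \
  --output_dir ./evals
```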

### Evaluation Visualizations

**French Evaluation:**

![French Evaluation](media/french_evaluation.png)

**English Evaluation:**

![English Evaluation](media/english_evaluation.png)

### French Benchmark Scores

| Benchmark       | Qwen3-0.6B   | Qwen2.5-0.5B-Instruct | Luth-0.6B-Instruct |
|-----------------|--------------|-----------------------|--------------------|
| ifeval-fr       | 44.45        | 22.18                 | <u>48.24</u>       |
| gpqa-diamond-fr | 28.93        | 23.86                 | <u>33.50</u>       |
| mmlu-fr         | 27.16        | 35.04                 | <u>40.23</u>       |
| math-500-fr     | 29.20        | 10.00                 | <u>43.00</u>       |
| arc-chall-fr    | 31.31        | 28.23                 | <u>33.88</u>       |
| hellaswag-fr    | 25.11        | <u>51.45</u>          | 45.70              |

### English Benchmark Scores

| Benchmark       | Qwen3-0.6B   | Qwen2.5-0.5B-Instruct | Luth-0.6B-Instruct |
|-----------------|--------------|-----------------------|--------------------|
| ifeval-en       | <u>57.86</u> | 29.21                 | 53.97              |
| gpqa-diamond-en | <u>29.80</u> | 26.77                 | 28.28              |
| mmlu-en         | 36.85        | 43.80                 | <u>48.10</u>       |
| math-500-en     | 45.00        | 31.80                 | <u>47.80</u>       |
| arc-chall-en    | 33.62        | 32.17                 | <u>35.92</u>       |
| hellaswag-en    | 42.91        | <u>49.56</u>          | 46.96              |

## Citation

```bibtex
@misc{luth2025kurakurai,
  title        = {Luth-0.6B-Instruct},
  author       = {Kurakura AI Team},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/kurakurai/Luth-0.6B}},
  note         = {Qwen3-0.6B fine-tuned on French datasets}
}
```