Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag in the README metadata, to improve the model's discoverability. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.
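For reference, the 13 entries added to the `language` block can be sketched as a small mapping from language name to three-letter code. This is only an illustration: the dictionary `LANGUAGE_CODES` and the helper `language_front_matter` are hypothetical names, not part of this PR or of any Hub tooling, and the English language names are my reading of the codes in the diff.

```python
# Illustrative only: LANGUAGE_CODES and language_front_matter are
# hypothetical names, not part of the PR or of any library.
# The codes are the three-letter tags used in the diff, in the same order.
LANGUAGE_CODES = {
    "Chinese": "zho",
    "English": "eng",
    "French": "fra",
    "Spanish": "spa",
    "Portuguese": "por",
    "German": "deu",
    "Italian": "ita",
    "Russian": "rus",
    "Japanese": "jpn",
    "Korean": "kor",
    "Vietnamese": "vie",
    "Thai": "tha",
    "Arabic": "ara",
}


def language_front_matter(codes):
    """Render a list of language codes as a YAML `language:` block."""
    lines = ["language:"]
    lines += [f"- {code}" for code in codes]
    return "\n".join(lines)


codes = list(LANGUAGE_CODES.values())
print(language_front_matter(codes))
```

If the remaining languages are identified later, extending the mapping and regenerating the block should be enough to update the tag.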

Files changed (1)

README.md +47 -35

@@ -1,36 +1,48 @@
----
-license: mit
-datasets:
-- isaiahbjork/chain-of-thought
-base_model:
-- Qwen/Qwen2.5-3B-Instruct
-library_name: mlx
-language:
-- en
-pipeline_tag: text-generation
----
-## Model Overview
-This model is a fine-tuned version of the Qwen2.5-3B base model, enhanced using Low-Rank Adaptation (LoRA) techniques via the MLX framework. The fine-tuning process utilized the isaiahbjork/chain-of-thought dataset, comprising 7,143 examples, over 600 iterations. This enhancement aims to improve the model's performance in tasks requiring multi-step reasoning and problem-solving.
-## Model Architecture
- - Base Model: Qwen2.5-3B
- - Model Type: Causal Language Model
- - Architecture: Transformer with Rotary Position Embedding (RoPE),
-   SwiGLU activation, RMSNorm normalization, attention QKV bias, and tied word embeddings
- - Parameters: 3.09 billion
- - Layers: 36
- - Attention Heads: 16 for query, 2 for key and value (GQA)
-## Fine-Tuning Details
- - Technique: Low-Rank Adaptation (LoRA)
- - Framework: MLX
- - Dataset: isaiahbjork/chain-of-thought
- - Dataset Size: 7,143 examples
- - Iterations: 600
 LoRA was employed to efficiently fine-tune the model by adjusting a subset of parameters, reducing computational requirements while maintaining performance. The MLX framework facilitated this process, leveraging Apple silicon hardware for optimized training.

+---
+license: mit
+datasets:
+- isaiahbjork/chain-of-thought
+base_model:
+- Qwen/Qwen2.5-3B-Instruct
+library_name: mlx
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+pipeline_tag: text-generation
+---
+## Model Overview
+This model is a fine-tuned version of the Qwen2.5-3B base model, enhanced using Low-Rank Adaptation (LoRA) techniques via the MLX framework. The fine-tuning process utilized the isaiahbjork/chain-of-thought dataset, comprising 7,143 examples, over 600 iterations. This enhancement aims to improve the model's performance in tasks requiring multi-step reasoning and problem-solving.
+## Model Architecture
+ - Base Model: Qwen2.5-3B
+ - Model Type: Causal Language Model
+ - Architecture: Transformer with Rotary Position Embedding (RoPE),
+   SwiGLU activation, RMSNorm normalization, attention QKV bias, and tied word embeddings
+ - Parameters: 3.09 billion
+ - Layers: 36
+ - Attention Heads: 16 for query, 2 for key and value (GQA)
+## Fine-Tuning Details
+ - Technique: Low-Rank Adaptation (LoRA)
+ - Framework: MLX
+ - Dataset: isaiahbjork/chain-of-thought
+ - Dataset Size: 7,143 examples
+ - Iterations: 600
 LoRA was employed to efficiently fine-tune the model by adjusting a subset of parameters, reducing computational requirements while maintaining performance. The MLX framework facilitated this process, leveraging Apple silicon hardware for optimized training.