suayptalha committed (verified)
Commit 559ef38 · Parent(s): d396a45

Update README.md

Files changed (1): README.md (+31 -1)

README.md CHANGED
@@ -12,4 +12,34 @@ base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3-0.6B-Math-Expert

This project performs full fine-tuning of the **Qwen3-0.6B** language model to enhance its mathematical problem-solving and reasoning capabilities. Training was conducted exclusively on the `unsloth/OpenMathReasoning-mini` dataset, with the model trained in bfloat16 (bf16) precision.

## Training Procedure

1. **Dataset Preparation**

   * The `unsloth/OpenMathReasoning-mini` dataset was used.
   * Each example was formatted in Chain-of-Thought (CoT) style, pairing math problems with step-by-step intermediate reasoning.

2. **Model Loading and Configuration**

   * The Qwen3 base model weights were loaded via the `unsloth` library in bf16 precision.
   * All layers were updated (`full_finetuning=True`) to adapt the model for mathematical reasoning.

3. **Supervised Fine-Tuning**

   * Training used the Hugging Face TRL library's Supervised Fine-Tuning (SFT) approach.
   * The model was trained to generate both the correct answer and the corresponding reasoning chain.

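The CoT formatting described in step 1 can be sketched as below. The field layout (`Problem:` / `Reasoning:` / `Answer:`) and the function name are illustrative assumptions, not taken from the actual training script or the dataset schema.

```python
# Minimal sketch of CoT-style example formatting for supervised fine-tuning.
# The section labels below are assumptions, not the repository's exact template.

def format_cot_example(problem: str, reasoning: str, answer: str) -> str:
    """Pair a math problem with step-by-step reasoning and a final answer."""
    return (
        f"Problem: {problem}\n"
        f"Reasoning: {reasoning}\n"
        f"Answer: {answer}"
    )

example = format_cot_example(
    problem="What is 12 * 7?",
    reasoning="(10 * 7) + (2 * 7) = 70 + 14 = 84.",
    answer="84",
)
print(example)
```

Each formatted string becomes one training example, so the model learns to emit the reasoning before the answer rather than the answer alone.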
## Purpose and Outcome

* The model's reasoning capacity on math problems was substantially improved through single-dataset, full fine-tuning in bf16 precision.
* Outputs include both the intermediate reasoning steps and the final solution, providing transparent and interpretable results.

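Qwen3-family models wrap their reasoning in `<think>…</think>` tags via the chat template, so the intermediate steps can be separated from the final solution with a small helper. The helper below is a sketch, not part of this repository:

```python
# Sketch: split a Qwen3-style generation into its reasoning chain and final answer.
# Assumes reasoning is wrapped in <think>...</think>, as in the Qwen3 chat template.
import re

def split_reasoning(generation: str) -> tuple[str, str]:
    """Return (reasoning, final_answer) from a raw model generation."""
    match = re.search(r"<think>(.*?)</think>", generation, flags=re.DOTALL)
    if match is None:
        return "", generation.strip()          # no reasoning block emitted
    reasoning = match.group(1).strip()
    answer = generation[match.end():].strip()  # everything after the think block
    return reasoning, answer

raw = "<think>7 * 8 = 56, so the answer is 56.</think>\nThe answer is 56."
reasoning, answer = split_reasoning(raw)
print(reasoning)  # → 7 * 8 = 56, so the answer is 56.
print(answer)     # → The answer is 56.
```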
## License

This project is licensed under the Apache License 2.0. See the [LICENSE](./LICENSE) file for details.