lbourdois committed
Commit f435013 · verified · 1 Parent(s): 5b209bc

Improve language tag


Hi! Since the model is multilingual, this PR adds languages other than English to the language tag to improve discoverability. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.
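For reference, the same metadata change could also be applied programmatically with `huggingface_hub`. The snippet below is only an illustrative sketch (this PR makes the change by editing README.md directly, as shown in the diff) and assumes a token with write access to the repo.

```python
from huggingface_hub import metadata_update

# Illustrative sketch only: extend the `language` metadata of the model card
# to the 13 languages added in this PR (ISO 639-3 codes). `overwrite=True`
# is needed because the card already declares a language list ("en").
metadata_update(
    repo_id="prithivMLmods/Deepmath-Competitive-1.5B-Preview",
    metadata={
        "language": [
            "zho", "eng", "fra", "spa", "por", "deu", "ita",
            "rus", "jpn", "kor", "vie", "tha", "ara",
        ]
    },
    overwrite=True,
)
```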

Files changed (1)
1. README.md (+107, -95)
README.md CHANGED
@@ -1,96 +1,108 @@
- ---
- library_name: transformers
- tags:
- - math
- - cot
- - text-generation-inference
- - preview
- - experimental
- license: apache-2.0
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-1.5B-Instruct
- pipeline_tag: text-generation
- ---
-
- ![DMC.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/WYWprTh49LUnIw-HiTcU-.png)
-
- # **Deepmath-Competitive-1.5B-Preview**
-
- > **Deepmath-Competitive-1.5B-Preview** is a **chain-of-thought reasoning model** fine-tuned from **Qwen-1.5B**, purpose-built for solving **mathematical problems** in both **English** and **Chinese** with a focus on **long-context understanding**. It enables advanced reasoning and detailed step-by-step problem solving in a compact form — ideal for competitive exam preparation, tutoring systems, and math-focused AI assistants.
-
- ## **Key Features**
-
- 1. **Chain-of-Thought Math Reasoning**
- Specifically trained to output detailed intermediate steps for math problems, Deepmath-Competitive-1.5B-Preview ensures interpretability and logical clarity — vital for learning and validation.
-
- 2. **Bilingual Proficiency (English + Chinese)**
- Proficient in understanding and solving math problems in **both English and Simplified Chinese**, supporting diverse educational needs.
-
- 3. **Long-Context Reasoning**
- Optimized for **long-form math problems** and word problem comprehension, enabling reasoning over extended contexts and compound queries.
-
- 4. **Compact yet Powerful**
- With just 1.5B parameters, it delivers robust performance on arithmetic, algebra, geometry, logic, and competitive exam-style word problems with minimal computational cost.
-
- 5. **Structured Step-by-Step Computation**
- Produces clean, stepwise outputs that mimic expert human problem-solving, helping learners follow the process and logic intuitively.
-
- ## **Quickstart with Transformers**
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "prithivMLmods/Deepmath-Competitive-1.5B-Preview"
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
- messages = [
-     {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
-     {"role": "user", "content": prompt}
- ]
- text = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
- generated_ids = model.generate(
-     **model_inputs,
-     max_new_tokens=512
- )
- generated_ids = [
-     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
- ]
-
- response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- ```
-
- ## **Intended Use**
-
- - **Math Tutoring Bots**: Delivers in-depth, multi-step solutions for students preparing for competitive and school-level math.
- - **Bilingual Educational Apps**: Effective in English and Chinese teaching environments.
- - **STEM Reasoning Tools**: Supports structured reasoning across science and engineering questions.
- - **Compact LLM Deployments**: Suitable for low-latency environments like mobile apps, edge devices, or web integrations.
-
- ## **Limitations**
-
- 1. **Domain Focus**:
- Primarily tuned for mathematics; performance may drop outside STEM or logical domains.
-
- 2. **Model Scale**:
- While efficient, it may underperform on abstract or research-level problems compared to larger models.
-
- 3. **Inherited Biases**:
- As a fine-tune of Qwen-1.5B, some pretraining biases may persist. Review is advised in critical applications.
-
- 4. **Prompt Sensitivity**:
+ ---
+ library_name: transformers
+ tags:
+ - math
+ - cot
+ - text-generation-inference
+ - preview
+ - experimental
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-1.5B-Instruct
+ pipeline_tag: text-generation
+ ---
+
+ ![DMC.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/WYWprTh49LUnIw-HiTcU-.png)
+
+ # **Deepmath-Competitive-1.5B-Preview**
+
+ > **Deepmath-Competitive-1.5B-Preview** is a **chain-of-thought reasoning model** fine-tuned from **Qwen-1.5B**, purpose-built for solving **mathematical problems** in both **English** and **Chinese** with a focus on **long-context understanding**. It enables advanced reasoning and detailed step-by-step problem solving in a compact form — ideal for competitive exam preparation, tutoring systems, and math-focused AI assistants.
+
+ ## **Key Features**
+
+ 1. **Chain-of-Thought Math Reasoning**
+ Specifically trained to output detailed intermediate steps for math problems, Deepmath-Competitive-1.5B-Preview ensures interpretability and logical clarity — vital for learning and validation.
+
+ 2. **Bilingual Proficiency (English + Chinese)**
+ Proficient in understanding and solving math problems in **both English and Simplified Chinese**, supporting diverse educational needs.
+
+ 3. **Long-Context Reasoning**
+ Optimized for **long-form math problems** and word problem comprehension, enabling reasoning over extended contexts and compound queries.
+
+ 4. **Compact yet Powerful**
+ With just 1.5B parameters, it delivers robust performance on arithmetic, algebra, geometry, logic, and competitive exam-style word problems with minimal computational cost.
+
+ 5. **Structured Step-by-Step Computation**
+ Produces clean, stepwise outputs that mimic expert human problem-solving, helping learners follow the process and logic intuitively.
+
+ ## **Quickstart with Transformers**
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "prithivMLmods/Deepmath-Competitive-1.5B-Preview"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
+ messages = [
+     {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
+
+ ## **Intended Use**
+
+ - **Math Tutoring Bots**: Delivers in-depth, multi-step solutions for students preparing for competitive and school-level math.
+ - **Bilingual Educational Apps**: Effective in English and Chinese teaching environments.
+ - **STEM Reasoning Tools**: Supports structured reasoning across science and engineering questions.
+ - **Compact LLM Deployments**: Suitable for low-latency environments like mobile apps, edge devices, or web integrations.
+
+ ## **Limitations**
+
+ 1. **Domain Focus**:
+ Primarily tuned for mathematics; performance may drop outside STEM or logical domains.
+
+ 2. **Model Scale**:
+ While efficient, it may underperform on abstract or research-level problems compared to larger models.
+
+ 3. **Inherited Biases**:
+ As a fine-tune of Qwen-1.5B, some pretraining biases may persist. Review is advised in critical applications.
+
+ 4. **Prompt Sensitivity**:
  Performs best with clearly structured prompts and formal question phrasing.
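
Once this change is merged, the expanded language list can be verified from the Hub metadata. The snippet below is a minimal sketch using the `huggingface_hub` `ModelCard` API; it assumes the repo id stays `prithivMLmods/Deepmath-Competitive-1.5B-Preview`.

```python
from huggingface_hub import ModelCard

# Load the model card from the Hub and inspect the `language` metadata,
# which this PR expands from ["en"] to 13 ISO 639-3 codes.
card = ModelCard.load("prithivMLmods/Deepmath-Competitive-1.5B-Preview")
print(card.data.language)
```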