---
library_name: transformers
tags:
- text-generation-inference
- Math
- Code
- Thinker
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
---

![Thinker.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/fAOdz1WFMBNJdQM2UNEBe.png)

# **Gamma-Velorum-1.5B-Thinker**

> **Gamma-Velorum-1.5B-Thinker** is a **math and code reasoning model** fine-tuned from **Qwen2.5-1.5B**, crafted to tackle complex **mathematical** and **programming** problems using **chain-of-thought** methodology. It excels in **step-by-step explanations**, long-context understanding, and bilingual support — ideal for education, coding tutors, and logic-intensive applications.
## **Key Features**

1. **Math + Code Chain-of-Thought Reasoning**
Trained to provide detailed, structured steps for both **mathematical** and **coding** problems. Gamma-Velorum-1.5B-Thinker explains not just the *what* but the *why*, ensuring clarity in logic and computation.

2. **Backed by Qwen2.5-1.5B**
Built on the Qwen2.5 architecture, bringing improved accuracy, stronger reasoning, and a more efficient tokenizer.

3. **Long-Context Problem Solving**
Capable of handling **long multi-turn questions**, nested logic, and extended code/math scenarios — ideal for competitive exams or coding challenges.

4. **Bilingual (English + Chinese)**
Seamlessly understands and reasons through prompts in both **English** and **Simplified Chinese**, making it versatile for global education platforms.

5. **Efficient and Lightweight**
With only 1.5B parameters, it strikes a balance between **performance and deployability**, suitable for web, edge, and mobile environments (see the quantized-loading sketch after this list).
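
For constrained deployments such as the edge and mobile settings mentioned above, the model can optionally be loaded with 4-bit weight quantization. The following is a minimal sketch using the `BitsAndBytesConfig` integration in `transformers`; it is not part of the original card, it assumes the `bitsandbytes` package and a CUDA GPU are available, and quantized accuracy on this particular model has not been verified here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "prithivMLmods/Gamma-Velorum-1.5B-Thinker"

# 4-bit NF4 quantization config (assumption: bitsandbytes is installed and a GPU is available)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```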
## **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Gamma-Velorum-1.5B-Thinker"

# Load the model and tokenizer; device_map="auto" places weights on the available GPU/CPU
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to calculate factorial of a number."
messages = [
    {"role": "system", "content": "You are a helpful tutor skilled in math and programming. Explain solutions step-by-step."},
    {"role": "user", "content": prompt}
]

# Render the chat into the model's prompt format, then tokenize it
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens so only the new completion remains
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
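
Reasoning models often emit long chain-of-thought answers, so it can help to stream tokens as they are produced rather than waiting for the full completion. The sketch below reuses the `model`, `tokenizer`, and `model_inputs` from the Quickstart above; `TextStreamer` is a standard `transformers` utility, but streaming is not shown on the original card.

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip the echoed prompt and special tokens
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

_ = model.generate(
    **model_inputs,
    max_new_tokens=512,
    streamer=streamer,
)
```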
## **Intended Use**

- **Math & Coding Tutors**: Solves word problems, algebra, logic puzzles, and programming challenges with clarity and precision.
- **Bilingual EdTech Apps**: Explains both math and code in English and Chinese for a broader learning reach (see the sketch after this list).
- **STEM Reasoning Engines**: Powers scientific reasoning tools, code-assist bots, and step-by-step logic solvers.
- **Lightweight LLM Use Cases**: Browser-based, embedded systems, or mobile apps for learners and developers.
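
As a concrete illustration of the bilingual tutoring use case, the sketch below sends a Simplified Chinese math question through the same chat-template flow as the Quickstart, reusing the `model` and `tokenizer` loaded there. The prompt text is only an example and does not come from the original card.

```python
# User prompt: "Solve the equation 2x + 3 = 11 and explain each step." (Simplified Chinese)
zh_messages = [
    {"role": "system", "content": "You are a helpful tutor skilled in math and programming. Explain solutions step-by-step."},
    {"role": "user", "content": "解方程 2x + 3 = 11，并逐步解释每一步。"}
]

zh_text = tokenizer.apply_chat_template(zh_messages, tokenize=False, add_generation_prompt=True)
zh_inputs = tokenizer([zh_text], return_tensors="pt").to(model.device)

# Generate and keep only the newly produced tokens
zh_ids = model.generate(**zh_inputs, max_new_tokens=512)
zh_ids = [out[len(inp):] for inp, out in zip(zh_inputs.input_ids, zh_ids)]
print(tokenizer.batch_decode(zh_ids, skip_special_tokens=True)[0])
```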
## **Limitations**

1. **Domain Focused**:
Optimized for **STEM and code** tasks — general conversation or abstract creative writing may not be as strong.

2. **Scale Limitations**:
As a 1.5B-parameter model, it may not match larger models on highly complex logic or long-form generation.

3. **Bias Inheritance**:
Carries forward biases from its Qwen2.5 base model, which should be taken into account in sensitive contexts.

4. **Prompt Structuring Matters**:
Performs best with explicit, structured prompts for math/code. Ambiguous or casual phrasing may reduce accuracy; a brief illustration follows below.
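
For example, compare a vague request with a structured version of the same request (both strings are illustrative and not taken from the card):

```python
# Vague phrasing: leaves the expected output format and constraints implicit
vague_prompt = "do the prime thing in python"

# Structured phrasing: states the task, constraints, and expected explanation explicitly
structured_prompt = (
    "Write a Python function is_prime(n) that returns True if n is a prime number. "
    "Explain your reasoning step-by-step, then show two example calls with their outputs."
)
```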