---
license: mit
language:
- en
base_model:
- microsoft/phi-4
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- math
---

![2.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/QcOUgFsZBSnVHBcY6GJKU.png)

---

# **Phi-4 o1 [ Responsible Mathematical Problem Solving & Reasoning Capabilities ]**

`Phi-4 o1 [ Responsible Mathematical Problem Solving & Reasoning Capabilities ]` is a state-of-the-art open model fine-tuned for advanced reasoning tasks. It is based on **Microsoft’s Phi-4**, which was built upon a blend of synthetic datasets, data from filtered public-domain websites, and acquired academic books and Q&A datasets. The primary goal is a small, capable model that excels at **responsible reasoning** and **mathematical problem-solving**, trained on high-quality data.

The **Phi-4 o1** model has undergone robust safety post-training using a combination of **SFT (Supervised Fine-Tuning)** and iterative **DPO (Direct Preference Optimization)** techniques. The safety alignment process includes publicly available datasets and proprietary synthetic datasets to improve **helpfulness**, **harmlessness**, and **responsible AI usage**.
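For reference, the DPO stage optimizes the standard preference objective introduced by Rafailov et al. (2023); the exact datasets and hyperparameters used for this model are not published:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\,\pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)
  \right]
```

Here *y_w* and *y_l* are the preferred and rejected responses for prompt *x*, π_ref is the frozen SFT model, and β controls how far the policy may drift from the reference.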

---

## **Dataset Info**

Phi-4 o1 ft is fine-tuned on a synthetic dataset curated through a specially designed pipeline. The dataset follows the **Math IO (Input-Output)** methodology, pairing problems with step-by-step solutions. This trains the model to be effective in:

- **Responsible mathematical problem-solving**  
- **Logical reasoning**  
- **Stepwise breakdowns of complex tasks**  

The dataset design focuses on enabling the model to generate detailed, accurate, and logically coherent solutions for mathematical and reasoning-based tasks.
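The exact schema of the fine-tuning data is not published; as a rough illustration of the input-output, stepwise format described above, a single training record might look like the following (field names are hypothetical):

```python
# Hypothetical Math IO record; the real field names and schema are not published.
record = {
    "input": "Solve the equation: 2x + 3 = 11.",
    "output": (
        "Step 1: Subtract 3 from both sides: 2x = 8.\n"
        "Step 2: Divide both sides by 2: x = 4.\n"
        "Answer: x = 4."
    ),
}
```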

---

## **Run with Transformers**

To use Phi-4 o1 ft for text generation tasks, follow the example below:

### Example Usage

```python
# pip install transformers accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Phi-4-Math-IO")
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/Phi-4-Math-IO",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Tokenize the prompt and move it to the model's device
input_text = "Solve the equation: 2x + 3 = 11. Provide a stepwise solution."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate output; allow enough tokens for a full stepwise solution
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
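Because stepwise solutions can run long, it may help to stream tokens as they are generated. A minimal sketch using `transformers.TextStreamer`, reusing the `model` and `tokenizer` loaded above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they arrive, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
model.generate(**inputs, max_new_tokens=256, streamer=streamer)
```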

For structured dialogue generation, you can apply the chat template as follows:

```python
# Structured input for chat-style interaction
messages = [
    {"role": "user", "content": "Explain Pythagoras’ theorem with an example."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# Generate response
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
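Alternatively, the high-level `pipeline` API handles tokenization and chat templating in one call. A minimal sketch, assuming a recent `transformers` release that accepts chat-style message lists:

```python
from transformers import pipeline
import torch

pipe = pipeline(
    "text-generation",
    model="prithivMLmods/Phi-4-Math-IO",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain Pythagoras’ theorem with an example."},
]
print(pipe(messages, max_new_tokens=256)[0]["generated_text"])
```
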

---

## **Intended Use**

Phi-4 o1 ft is designed for a wide range of **reasoning-intensive** and **math-focused** applications. Below are some key use cases:

### 1. **Responsible Mathematical Problem Solving**  
   - Solving complex mathematical problems with detailed, step-by-step solutions (see the prompt sketch at the end of this section).  
   - Assisting students, educators, and researchers in understanding advanced mathematical concepts.  

### 2. **Reasoning and Logical Problem Solving**  
   - Breaking down intricate problems in logic, science, and other fields into manageable steps.  
   - Providing responsible and accurate reasoning capabilities for critical applications.  

### 3. **Educational Tools**  
   - Supporting educational platforms with explanations, tutoring, and Q&A support.  
   - Generating practice problems and solutions for students.  

### 4. **Content Creation**  
   - Assisting content creators in generating accurate and logical educational content.  
   - Helping with technical documentation by providing precise explanations.  

### 5. **Customer Support**  
   - Automating responses to technical queries with logical stepwise solutions.  
   - Providing accurate, responsible, and coherent information for complex questions.  
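
For the use cases above, a system message can steer the model toward the stepwise style it was fine-tuned for. A hypothetical prompt pattern (the model card does not prescribe a specific system prompt), reusing the `model` and `tokenizer` from the examples above:

```python
# Hypothetical system prompt; adjust the wording to your application.
messages = [
    {
        "role": "system",
        "content": "You are a math tutor. Solve every problem step by step "
                   "and state the final answer on its own line.",
    },
    {"role": "user", "content": "A train travels 180 km in 2.5 hours. What is its average speed?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```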

---

## **Limitations**

While Phi-4 o1 ft is highly capable in reasoning and mathematics, users should be aware of its limitations:

### 1. **Bias and Fairness**  
   - Despite rigorous training, the model may still exhibit biases from its training data. Users are encouraged to carefully review outputs, especially for sensitive topics.  

### 2. **Contextual Understanding**  
   - The model may sometimes misinterpret ambiguous or complex prompts, leading to incorrect or incomplete responses.  

### 3. **Real-Time Knowledge**  
   - The model’s knowledge is static, reflecting only the data it was trained on. It does not have real-time information about current events or post-training updates.  

### 4. **Safety and Harmlessness**  
   - Although safety-aligned, the model may occasionally generate responses that require human oversight. Regular monitoring is recommended when deploying it in sensitive domains.  

### 5. **Resource Requirements**  
   - Due to its size (Phi-4 has 14B parameters), running the model efficiently may require high-end computational resources, particularly for large-scale or real-time applications (see the quantized-loading sketch at the end of this section).  

### 6. **Ethical Considerations**  
   - The model must not be used for malicious purposes, such as generating harmful content, misinformation, or spam. Users are responsible for ensuring ethical use.  

### 7. **Domain-Specific Limitations**  
   - Although effective in general-purpose reasoning and math tasks, the model may require further fine-tuning for highly specialized domains such as medicine, law, or finance.
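
To reduce the memory footprint noted in point 5, the model can be loaded with 4-bit quantization through `bitsandbytes`. A minimal sketch, assuming `bitsandbytes` is installed and accepting some loss of numerical precision:

```python
# pip install transformers accelerate bitsandbytes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantize weights to 4-bit NF4, computing in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Phi-4-Math-IO")
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/Phi-4-Math-IO",
    quantization_config=bnb_config,
    device_map="auto",
)
```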