---
license: mit
language:
- en
base_model: prithivMLmods/Phi-4-Empathetic
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- phi
- phi3
- llama
- human_like_reasoning
- llama-cpp
- gguf-my-repo
---

# Triangle104/Phi-4-Empathetic-Q4_K_M-GGUF
This model was converted to GGUF format from [`prithivMLmods/Phi-4-Empathetic`](https://huggingface.co/prithivMLmods/Phi-4-Empathetic) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/prithivMLmods/Phi-4-Empathetic) for more details on the model.

---
## Model details

Phi-4 Empathetic, fine-tuned from Microsoft's Phi-4, is an advanced open model built on a blend of high-quality synthetic datasets, data from filtered public-domain websites, and carefully selected academic resources. It excels at responsible, human-like reasoning, empathetic dialogue, and emotional thought generation. The model is designed for nuanced, thoughtful conversations, and its outputs can include special characters and emojis for expressive communication. 🌟

Phi-4 Empathetic employs a sophisticated safety post-training approach, leveraging both open-source and proprietary datasets. Safety alignment is achieved using a combination of SFT (Supervised Fine-Tuning) and DPO (Direct Preference Optimization), targeting responsible interaction and emotional awareness in diverse contexts.
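
For context, DPO trains directly on pairs of preferred ("chosen") and dispreferred ("rejected") responses, pushing the policy's preference margin above that of a frozen reference model. A minimal sketch of the loss follows (illustrative only, not the actual training code used for this model; `beta` and the per-sequence log-probabilities are assumed inputs):

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss over per-sequence log-probs (sketch)."""
    # Implicit rewards: log-ratio of policy to the frozen reference model
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Logistic loss on the reward margin: prefer chosen over rejected
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```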

### Dataset Info

Phi-4 Empathetic is fine-tuned on a carefully curated dataset tailored for empathetic and responsible reasoning tasks. The dataset incorporates the Chain of Thought (CoT) methodology, emphasizing logical reasoning, emotional nuance, and step-by-step thought processes. Additionally, it includes data optimized for generating responses that resonate with human emotions, making it ideal for:

- Emotional Support Applications 🤗
- Responsible Conversations 💬
- Thoughtful Problem-Solving 🧠
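
A record in this style might pair a prompt with intermediate reasoning steps and a final empathetic reply. The example below is purely hypothetical; the actual dataset schema is not published:

```python
# Hypothetical CoT-style training record (illustrative only; the real schema is not public)
example_record = {
    "prompt": "I failed my driving test again and feel like giving up.",
    "reasoning_steps": [
        "Acknowledge the frustration before offering advice.",
        "Normalize the setback: many people pass only after several attempts.",
        "Suggest one small, concrete next step to restore a sense of control.",
    ],
    "response": "That sounds really discouraging 😞 One failed test doesn't define you...",
}
```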

### Run with Transformers

```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Phi-4-Empathetic")
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/Phi-4-Empathetic",
    device_map="auto",           # place layers on the available device(s)
    torch_dtype=torch.bfloat16,  # halves memory use vs. float32
)

input_text = "Can you share some words of encouragement for someone feeling down?"
# Use model.device rather than a hard-coded "cuda" so this also works on CPU/MPS
input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

You can ensure correct formatting for empathetic dialogue by using `tokenizer.apply_chat_template` as follows:

```python
messages = [
    {"role": "user", "content": "Can you share some words of encouragement for someone feeling down?"},
]
# add_generation_prompt appends the assistant turn header so the model replies
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

outputs = model.generate(**input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))
```
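
For softer, more varied phrasing you can enable sampling at generation time; the values below are illustrative starting points, not tuned recommendations:

```python
# Sampling settings for more conversational variety (illustrative values)
outputs = model.generate(
    **input_ids,
    max_new_tokens=256,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.7,  # soften the output distribution
    top_p=0.9,        # nucleus sampling
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```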

### Intended Use

The Phi-4 Empathetic model is optimized for applications that require thoughtful and emotionally aware interactions. Below are some suggested use cases:

- **Emotional Support & Counseling** 💖
  - Providing thoughtful responses to users seeking emotional encouragement or advice.
  - Generating empathetic messages for mental health and well-being applications.
- **Responsible Dialogue Generation** 🗣️
  - Engaging in nuanced conversations with a focus on fairness, safety, and ethical considerations.
  - Ensuring that interactions remain respectful and aligned with safety guidelines.
- **Creative Writing Assistance** ✍️
  - Helping users craft emotionally engaging content, including stories, poems, and personal messages.
  - Assisting in generating content enriched with special characters and emojis for expressive communication.
- **Educational Tools** 🎓
  - Offering step-by-step explanations with an empathetic tone for better understanding.
  - Generating thoughtful Q&A responses for various subjects.
- **Customer Support** 🤝 (see the sketch after this list)
  - Automating empathetic responses to customer queries.
  - Handling emotionally sensitive customer service interactions with care.
- **Social Media Engagement** 📱
  - Generating creative, engaging, and emotionally resonant posts for social media platforms.
  - Providing personalized message suggestions enriched with emojis and special characters.
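
One way to frame the customer-support case is a system prompt passed through the chat template. A minimal sketch, assuming the template accepts a `system` role (the prompt text is illustrative, not a tested recommendation):

```python
# Hypothetical customer-support framing via the chat template
messages = [
    {"role": "system", "content": "You are a patient, empathetic support agent."},  # illustrative prompt
    {"role": "user", "content": "My order arrived broken and I'm really upset."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
outputs = model.generate(**input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```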

### Limitations

While Phi-4 Empathetic is highly capable, it has certain limitations users should be aware of:

- **Bias and Fairness:** Despite extensive safety alignment, biases may still emerge in the model's responses. Users should exercise discretion, particularly in sensitive contexts.
- **Emotional Nuance:** The model may occasionally misinterpret the emotional tone of a prompt, leading to less relevant or inappropriate responses.
- **Real-Time Knowledge:** The model's knowledge is limited to its training data and does not include real-time or post-training updates; it may not reflect recent events or changes in knowledge.
- **Safety and Harmlessness:** Although the model is aligned with safety standards, some outputs may still require human oversight to ensure appropriateness.
- **Resource Requirements:** Running the model efficiently may require significant computational resources, especially in large-scale or real-time applications.
- **Ethical Considerations:** The model must be used responsibly, avoiding malicious applications such as generating harmful content or spreading misinformation.
- **Domain-Specific Limitations:** While it performs well on general-purpose tasks, it may need further fine-tuning for highly specialized domains such as legal, medical, or financial applications.

### Special Features

- **Emojis & Special Characters** 🎉💡: The model can generate responses with emojis and special characters for expressive communication, making it well suited to social media and personal messaging applications.
- **Human-Like Reasoning** 🧠: Fine-tuned for responsible reasoning and empathetic dialogue, it excels at generating thoughtful, human-like responses.
- **Advanced Safety Alignment** 🔒: The model employs iterative SFT and DPO to keep its outputs helpful, harmless, and aligned with ethical standards.

---
## Use with llama.cpp
Install llama.cpp via Homebrew (works on macOS and Linux):

```bash
brew install llama.cpp
```
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo Triangle104/Phi-4-Empathetic-Q4_K_M-GGUF --hf-file phi-4-empathetic-q4_k_m.gguf -p "The meaning of life and the universe is"
```

### Server:
```bash
llama-server --hf-repo Triangle104/Phi-4-Empathetic-Q4_K_M-GGUF --hf-file phi-4-empathetic-q4_k_m.gguf -c 2048
```
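
Once the server is running (default port 8080), you can query its OpenAI-compatible chat endpoint. A minimal sketch, assuming default host and port:

```python
# Query the running llama-server via its OpenAI-compatible endpoint
# (assumes the server is on localhost:8080; adjust if you started it differently)
import json
import urllib.request

payload = {
    "messages": [
        {"role": "user", "content": "Can you share some words of encouragement?"}
    ],
    "max_tokens": 256,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
print(body["choices"][0]["message"]["content"])
```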

Note: You can also use this checkpoint directly via the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo Triangle104/Phi-4-Empathetic-Q4_K_M-GGUF --hf-file phi-4-empathetic-q4_k_m.gguf -p "The meaning of life and the universe is"
```
or
```bash
./llama-server --hf-repo Triangle104/Phi-4-Empathetic-Q4_K_M-GGUF --hf-file phi-4-empathetic-q4_k_m.gguf -c 2048
```