yizheapple commited on
Commit
7970e07
·
verified ·
1 Parent(s): e0c9241

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +111 -40
README.md CHANGED
@@ -17,80 +17,151 @@ metrics:
17
 
18
  # SAGE Dialogue Gen 🌱
19
 
20
- **Authors**: Yizhe Zhang, Navdeep Jaitly (Apple)
21
- **Citation**:
22
-
23
 
 
24
 
 
25
 
26
- 📄 [Read the paper on arXiv] [oai_citation:0‡huggingface.co](https://huggingface.co/docs/huggingface_hub/guides/model-cards?utm_source=chatgpt.com) [oai_citation:1‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com)
 
 
 
 
 
 
27
 
28
  ---
29
 
30
- ## Model Description
31
 
32
- SAGE (Steering And Refining GEneration) is a dialogue generation model that introduces **latent state-action variables** between turns, enabling:
 
 
 
 
 
 
 
 
33
 
34
- - **Structured control** over emotional tone and conversational strategy
35
- - **Enhanced emotional intelligence** through coarse-grained state planning
36
- - A **self-improving pipeline** that includes data augmentation, dialogue-tree search, LLM-derived reward modeling, and targeted fine-tuning [oai_citation:2‡github.com](https://github.com/apple/ml-sage-dialog-gen?utm_source=chatgpt.com) [oai_citation:3‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com)
37
 
38
  ---
39
 
40
- ## Intended Uses & Limitations
 
 
 
 
 
 
 
 
 
 
41
 
42
- ### Suitable For
 
 
43
 
44
- - Emotional / empathetic chatbots
45
- - Long-horizon, strategy-based dialogue systems
46
- - Research into structured latent-variable control in LLMs
47
 
48
- ### ⚠️ Not Recommended For
 
 
 
 
 
49
 
50
- - Tasks outside open-domain or emotionally-aware dialogues
51
- - Use in high-stakes or sensitive environments without further bias/security evaluation
 
 
52
 
53
  ---
54
 
55
  ## Training Details
56
 
57
- - **Base model**: fine-tuned from Mixtral‑8x7BInstruct [oai_citation:4‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com) [oai_citation:5‡github.com](https://github.com/apple/ml-sage-dialog-gen?utm_source=chatgpt.com)
58
- - **Training pipeline**:
59
- 1. Data prep (ShareGPT-style JSON)
60
- 2. Supervised fine-tuning (SFT)
61
- 3. Dialogue tree search
62
- 4. Preference learning
63
- 5. Model comparison & inference [oai_citation:6‡github.com](https://github.com/apple/ml-sage-dialog-gen?utm_source=chatgpt.com)
 
64
 
65
  ---
66
 
67
- ## Evaluation Results
 
 
68
 
69
- Improved performance on emotional-intelligence metrics compared to baselines, while maintaining generative quality on standard LLM benchmarks [oai_citation:7‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com).
 
 
 
70
 
71
  ---
72
 
73
- ## How to Use
74
 
75
- ```bash
76
- # Clone the model
77
- git lfs install
78
- git clone https://huggingface.co/your-username/sage-dialog-gen
79
 
80
- # Install requirements & dependencies
 
 
81
  bash setup.sh
 
82
 
83
- # Load with 🤗 Transformers
84
- from transformers import AutoModelForCausalLM, AutoTokenizer
85
 
86
- tokenizer = AutoTokenizer.from_pretrained("your-username/sage-dialog-gen")
87
- model = AutoModelForCausalLM.from_pretrained("your-username/sage-dialog-gen")
88
 
89
- # Inference example
90
- inputs = tokenizer("Hi! How are you?", return_tensors="pt")
91
- out = model.generate(**inputs)
92
- print(tokenizer.decode(out[0]))
 
 
 
 
 
 
 
 
93
 
 
94
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
95
 
 
96
 
 
 
17
 
18
  # SAGE Dialogue Gen 🌱
19
 
20
+ **Authors**: Yizhe Zhang, Navdeep Jaitly (Apple)
 
 
21
 
22
+ ---
23
 
24
+ ## Model Information
25
 
26
+ - **Language**: English
27
+ - **License**: Apache 2.0
28
+ - **Base Model**: mistralai/Mixtral-8x7B-Instruct-v0.1
29
+ - **Library**: transformers
30
+ - **Tags**: dialog-generation, conversational-ai, state-action-model
31
+ - **Dataset**: ShareGPT
32
+ - **Metrics**: Custom emotional-intelligence evaluation
33
 
34
  ---
35
 
36
+ ## Citation
37
 
38
+ ```bibtex
39
+ @misc{zhang2025sage,
40
+ title = {SAGE: Steering and Refining Dialogue Generation with State‑Action Augmentation},
41
+ author = {Zhang, Yizhe and Jaitly, Navdeep},
42
+ year = {2025},
43
+ howpublished = {arXiv preprint},
44
+ note = {arXiv:2503.03040}
45
+ }
46
+ ```
47
 
48
+ 📄 **Paper**: Available on arXiv and Papers with Code
 
 
49
 
50
  ---
51
 
52
+ ## Model Description
53
+
54
+ SAGE introduces **latent state-action variables** between dialogue turns, enabling:
55
+
56
+ - **Structured Control**: Precise management of emotional tone and conversational strategy
57
+ - **Enhanced Emotional Intelligence**: Explicit state planning for more empathetic responses
58
+ - **Self-Improving Pipeline**: Comprehensive training approach including:
59
+ - Data augmentation
60
+ - Dialogue-tree search
61
+ - Reward modeling
62
+ - Fine-tuning optimization
63
 
64
+ This approach allows for more nuanced and contextually appropriate dialogue generation compared to traditional methods.
65
+
66
+ ---
67
 
68
+ ## Intended Uses
 
 
69
 
70
+ ### **Recommended Applications**
71
+ - Emotional or empathetic chatbots
72
+ - Long-horizon, strategy-aware conversation systems
73
+ - Research on structured latent-variable dialogue control
74
+ - Educational conversational AI systems
75
+ - Customer service applications requiring emotional intelligence
76
 
77
+ ### ⚠️ **Important Limitations**
78
+ - **Not suitable** for high-stakes, safety-critical deployment without further evaluation
79
+ - Requires additional testing for production environments
80
+ - May need domain-specific fine-tuning for specialized applications
81
 
82
  ---
83
 
84
  ## Training Details
85
 
86
+ **Base Model**: Mixtral-8x7B-Instruct
87
+
88
+ **Training Pipeline**:
89
+ 1. **Data Preparation**: ShareGPT-style JSON formatting
90
+ 2. **Supervised Fine-Tuning (SFT)**: Initial model adaptation
91
+ 3. **Dialogue-Tree Search**: Exploration of conversation paths
92
+ 4. **Preference Learning**: Reward model training
93
+ 5. **Comparative Evaluation**: Performance assessment and inference optimization
94
 
95
  ---
96
 
97
+ ## Performance
98
+
99
+ SAGE demonstrates significant improvements on emotional-intelligence metrics compared to baseline models while maintaining generative flexibility and coherence. The model shows particular strength in:
100
 
101
+ - Emotional tone consistency
102
+ - Contextual appropriateness
103
+ - Long-term conversation planning
104
+ - Empathetic response generation
105
 
106
  ---
107
 
108
+ ## Usage
109
 
110
+ ### Quick Start
 
 
 
111
 
112
+ ```bash
113
+ git clone https://github.com/apple/ml-sage-dialog-gen
114
+ cd ml-sage-dialog-gen
115
  bash setup.sh
116
+ ```
117
 
118
+ ### Basic Implementation
 
119
 
120
+ ```python
121
+ from transformers import AutoTokenizer, AutoModelForCausalLM
122
 
123
+ # Load the model
124
+ tokenizer = AutoTokenizer.from_pretrained("apple/sage-dialogue-gen")
125
+ model = AutoModelForCausalLM.from_pretrained("apple/sage-dialogue-gen")
126
+
127
+ # Generate dialogue
128
+ input_text = "I'm feeling overwhelmed with work lately."
129
+ inputs = tokenizer(input_text, return_tensors="pt")
130
+ outputs = model.generate(**inputs, max_length=150, do_sample=True, temperature=0.7)
131
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
132
+ ```
133
+
134
+ ---
135
 
136
+ ## Requirements
137
 
138
+ - Python 3.8+
139
+ - PyTorch 1.12+
140
+ - Transformers 4.21+
141
+ - Additional dependencies listed in `requirements.txt`
142
+
143
+ ---
144
+
145
+ ## Contributing
146
+
147
+ Contributions are welcome! Please see our contributing guidelines and code of conduct before submitting pull requests.
148
+
149
+ ---
150
+
151
+ ## License
152
+
153
+ This project is licensed under the Apache License 2.0. See the LICENSE file for details.
154
+
155
+ ---
156
+
157
+ ## Acknowledgments
158
+
159
+ - Built upon the Mixtral-8x7B-Instruct foundation model
160
+ - Trained using ShareGPT dataset
161
+ - Developed by the Apple Machine Learning Research team
162
+
163
+ ---
164
 
165
+ ## Contact
166
 
167
+ For questions or issues, please open a GitHub issue or contact the development team through the official Apple ML research channels.