apple
/

sage-ft-mixtral-8x7b

@@ -17,80 +17,151 @@ metrics:
 # SAGE Dialogue Gen 🌱
-**Authors**: Yizhe Zhang, Navdeep Jaitly (Apple)
-**Citation**:
-📄 [Read the paper on arXiv]  [oai_citation:0‡huggingface.co](https://huggingface.co/docs/huggingface_hub/guides/model-cards?utm_source=chatgpt.com) [oai_citation:1‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com)
 ---
-## Model Description
-SAGE (Steering And Refining GEneration) is a dialogue generation model that introduces **latent state-action variables** between turns, enabling:
-- **Structured control** over emotional tone and conversational strategy
-- **Enhanced emotional intelligence** through coarse-grained state planning
-- A **self-improving pipeline** that includes data augmentation, dialogue-tree search, LLM-derived reward modeling, and targeted fine-tuning  [oai_citation:2‡github.com](https://github.com/apple/ml-sage-dialog-gen?utm_source=chatgpt.com) [oai_citation:3‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com)
 ---
-## Intended Uses & Limitations
-### ✅ Suitable For
-- Emotional / empathetic chatbots
-- Long-horizon, strategy-based dialogue systems
-- Research into structured latent-variable control in LLMs
-### ⚠️ Not Recommended For
-- Tasks outside open-domain or emotionally-aware dialogues
-- Use in high-stakes or sensitive environments without further bias/security evaluation
 ---
 ## Training Details
-- **Base model**: fine-tuned from Mixtral‑8x7B‑Instruct  [oai_citation:4‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com) [oai_citation:5‡github.com](https://github.com/apple/ml-sage-dialog-gen?utm_source=chatgpt.com)
-- **Training pipeline**:
-  1. Data prep (ShareGPT-style JSON)
-  2. Supervised fine-tuning (SFT)
-  3. Dialogue tree search
-  4. Preference learning
-  5. Model comparison & inference  [oai_citation:6‡github.com](https://github.com/apple/ml-sage-dialog-gen?utm_source=chatgpt.com)
 ---
-## Evaluation Results
-Improved performance on emotional-intelligence metrics compared to baselines, while maintaining generative quality on standard LLM benchmarks  [oai_citation:7‡arxiv.org](https://arxiv.org/pdf/2503.03040?utm_source=chatgpt.com).
 ---
-## How to Use
-```bash
-# Clone the model
-git lfs install
-git clone https://huggingface.co/your-username/sage-dialog-gen
-# Install requirements & dependencies
 bash setup.sh
-# Load with 🤗 Transformers
-from transformers import AutoModelForCausalLM, AutoTokenizer
-tokenizer = AutoTokenizer.from_pretrained("your-username/sage-dialog-gen")
-model = AutoModelForCausalLM.from_pretrained("your-username/sage-dialog-gen")
-# Inference example
-inputs = tokenizer("Hi! How are you?", return_tensors="pt")
-out = model.generate(**inputs)
-print(tokenizer.decode(out[0]))

 # SAGE Dialogue Gen 🌱
+**Authors**: Yizhe Zhang, Navdeep Jaitly (Apple)
+---
+## Model Information
+- **Language**: English
+- **License**: Apache 2.0
+- **Base Model**: mistralai/Mixtral-8x7B-Instruct-v0.1
+- **Library**: transformers
+- **Tags**: dialog-generation, conversational-ai, state-action-model
+- **Dataset**: ShareGPT
+- **Metrics**: Custom emotional-intelligence evaluation
 ---
+## Citation
+```bibtex
+@misc{zhang2025sage,
+  title = {SAGE: Steering and Refining Dialogue Generation with State‑Action Augmentation},
+  author = {Zhang, Yizhe and Jaitly, Navdeep},
+  year = {2025},
+  howpublished = {arXiv preprint},
+  note = {arXiv:2503.03040}
+}
+```
+📄 **Paper**: Available on arXiv and Papers with Code
 ---
+## Model Description
+SAGE introduces **latent state-action variables** between dialogue turns, enabling:
+- **Structured Control**: Precise management of emotional tone and conversational strategy
+- **Enhanced Emotional Intelligence**: Explicit state planning for more empathetic responses
+- **Self-Improving Pipeline**: Comprehensive training approach including:
+  - Data augmentation
+  - Dialogue-tree search
+  - Reward modeling
+  - Fine-tuning optimization
+This approach allows for more nuanced and contextually appropriate dialogue generation compared to traditional methods.
+---
+## Intended Uses
+### ✅ **Recommended Applications**
+- Emotional or empathetic chatbots
+- Long-horizon, strategy-aware conversation systems
+- Research on structured latent-variable dialogue control
+- Educational conversational AI systems
+- Customer service applications requiring emotional intelligence
+### ⚠️ **Important Limitations**
+- **Not suitable** for high-stakes, safety-critical deployment without further evaluation
+- Requires additional testing for production environments
+- May need domain-specific fine-tuning for specialized applications
 ---
 ## Training Details
+**Base Model**: Mixtral-8x7B-Instruct
+**Training Pipeline**:
+1. **Data Preparation**: ShareGPT-style JSON formatting
+2. **Supervised Fine-Tuning (SFT)**: Initial model adaptation
+3. **Dialogue-Tree Search**: Exploration of conversation paths
+4. **Preference Learning**: Reward model training
+5. **Comparative Evaluation**: Performance assessment and inference optimization
 ---
+## Performance
+SAGE demonstrates significant improvements on emotional-intelligence metrics compared to baseline models while maintaining generative flexibility and coherence. The model shows particular strength in:
+- Emotional tone consistency
+- Contextual appropriateness
+- Long-term conversation planning
+- Empathetic response generation
 ---
+## Usage
+### Quick Start
+```bash
+git clone https://github.com/apple/ml-sage-dialog-gen
+cd ml-sage-dialog-gen
 bash setup.sh
+```
+### Basic Implementation
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load the model
+tokenizer = AutoTokenizer.from_pretrained("apple/sage-dialogue-gen")
+model = AutoModelForCausalLM.from_pretrained("apple/sage-dialogue-gen")
+# Generate dialogue
+input_text = "I'm feeling overwhelmed with work lately."
+inputs = tokenizer(input_text, return_tensors="pt")
+outputs = model.generate(**inputs, max_length=150, do_sample=True, temperature=0.7)
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+```
+---
+## Requirements
+- Python 3.8+
+- PyTorch 1.12+
+- Transformers 4.21+
+- Additional dependencies listed in `requirements.txt`
+---
+## Contributing
+Contributions are welcome! Please see our contributing guidelines and code of conduct before submitting pull requests.
+---
+## License
+This project is licensed under the Apache License 2.0. See the LICENSE file for details.
+---
+## Acknowledgments
+- Built upon the Mixtral-8x7B-Instruct foundation model
+- Trained using ShareGPT dataset
+- Developed by the Apple Machine Learning Research team
+---
+## Contact
+For questions or issues, please open a GitHub issue or contact the development team through the official Apple ML research channels.