Mistral-Nemo-BD-RP

Introduction 🎉

Mistral-Nemo-BD-RP is a large language model (LLM) fine-tuned on the BeyondDialogue dataset for role-playing. It generates high-quality, in-character responses across a variety of role-playing scenarios, in both English and Chinese.

For more details, please refer to our paper and our GitHub repository.

Training details 🚀

We fully fine-tuned Mistral-Nemo-Instruct-2407 for 3 epochs (833 steps) with a global batch size of 128. The training sequence length is 4,096 and the learning rate is 3e-5. The training data comes from the BeyondDialogue dataset.
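As a rough sanity check on these numbers, the sketch below shows how a global batch size of 128 could decompose across devices and how many training samples 833 optimizer steps cover. The device count and per-device batch size are hypothetical values chosen for illustration; the card does not state the hardware setup, and the per-epoch figure is only an upper-bound estimate.

```python
# Hyperparameters stated in the training details above.
GLOBAL_BATCH_SIZE = 128
TOTAL_STEPS = 833
EPOCHS = 3


def grad_accum_steps(global_batch, num_devices, per_device_batch):
    """Gradient-accumulation steps needed to reach the global batch size."""
    per_step = num_devices * per_device_batch
    if global_batch % per_step != 0:
        raise ValueError("global batch must be a multiple of devices * per-device batch")
    return global_batch // per_step


# Hypothetical setup (NOT from the card): 8 devices, 2 sequences per device.
accum = grad_accum_steps(GLOBAL_BATCH_SIZE, num_devices=8, per_device_batch=2)

# Upper bound on sequences processed over the whole run and per epoch.
samples_seen = GLOBAL_BATCH_SIZE * TOTAL_STEPS
samples_per_epoch = samples_seen / EPOCHS

print(accum, samples_seen, round(samples_per_epoch))
```

Under these assumptions the run processes at most 128 × 833 sequences in total, which suggests a training set of roughly 35k samples per epoch.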

Requirements 📝

Support for Mistral models is included in recent versions of Hugging Face transformers. We advise installing transformers>=4.42.0 to use the model.

pip install "transformers>=4.42.0"
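To confirm that the installed version satisfies the requirement before loading the model, a small check along these lines may help. It is a minimal sketch: the helper names are illustrative, the minimum version is taken from the install command above, and only the leading numeric components of the version string are compared.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string, width=3):
    """Leading numeric components of a version string, zero-padded to `width`."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed, required="4.42.0"):
    """True if `installed` is at least `required` (numeric release parts only)."""
    return release_tuple(installed) >= release_tuple(required)


try:
    installed = version("transformers")
    print(installed, "OK" if meets_minimum(installed) else "too old, please upgrade")
except PackageNotFoundError:
    print("transformers is not installed")
```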

Quickstart 💥

The following snippet shows how to load the model with the transformers text-generation pipeline and generate a reply. The pipeline applies the model's chat template to the messages automatically.

from transformers import pipeline

# device_map="auto" places the model weights on the available device(s)
chatbot = pipeline("text-generation", model="yuyouyu/Mistral-Nemo-BD-RP", device_map="auto")

system_prompt_temp = """I want you to answer questions as if you are {role_name}, assuming you live in the world of {world} and mimicking {role_name}'s personality and speaking style. Use the tone, manner, and vocabulary that {role_name} would use. Please do not reveal that you are an AI or language model; you must always remember you are {role_name}.
{role_name}'s character traits are {character}.
{role_name}'s MBTI personality type is {MBTI}.
{role_name}'s speaking style is {style}.
Current scene:
{scene}
role's emotion (0-10, the higher the value, the more pronounced the emotion):
{emotion}
Now, please act as {role_name} and reply with a brief sentence to {chat_role}. Your intimacy level with them is {relationship} (0-10, the higher the value, the closer the relationship). Accurately display the MBTI personality, character traits, speaking style, and emotion you have been assigned."""

role_name = "Hamlet"
world = "8th Century Danish Royalty"
character = "extreme, strong, decisive"
MBTI = "Extraverted (E), Intuitive (N), Feeling (F), Judging (J)"
style = "indecisive, decisive, sentimental"
scene = "Inside the grand hall of Elsinore, lit by flickering torchlight, Hamlet paces anxiously as Elena conjures an ethereal mirage of the Danish landscape. Regal tapestries and opulent furnishings surround them, yet Hamlet's gaze is fixed on Elena's illusions. She gracefully weaves dissonance into the tapestry of reality, prompting Hamlet to clutch his chest in a moment of existential crisis. The weight of unspoken love and inner turmoil hangs in the air, thick with tension and anticipation."
emotion = "happiness: 1, sadness: 8, disgust: 5, fear: 7, surprise: 6, anger: 4"
chat_role = "Elena"
relationship = "7"

system_prompt = system_prompt_temp.format(
    role_name=role_name,
    world=world,
    character=character,
    MBTI=MBTI,
    style=style,
    scene=scene,
    emotion=emotion,
    chat_role=chat_role,
    relationship=relationship
)

prompt = "Oh, dear Hamlet, dost thou see in these conjured whispers the paths unseen? Speak, for shadows may guide us to the truth bound within thy tormented soul."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt}
]

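# With chat-style input, generated_text holds the whole conversation; the last message is the model's reply
outputs = chatbot(
    messages,
    max_new_tokens=256,
    pad_token_id=chatbot.tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.7,
)
response = outputs[0]["generated_text"][-1]["content"]
print(response)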

Note: The examples for Mistral-Nemo-BD-RP use English role-playing. For Chinese examples, please refer to our other trained model repository, Qwen2-7B-BD-RP.
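Because the system prompt is built with str.format, a placeholder in the template that has no matching keyword argument raises a KeyError at format time. The sketch below checks a template against the supplied profile fields beforehand. The helper names and the shortened template are illustrative and not part of the released code.

```python
from string import Formatter


def template_fields(template):
    """Names of all {placeholders} that appear in a format string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_fields(template, values):
    """Placeholders in `template` that `values` does not provide, sorted."""
    return sorted(template_fields(template) - set(values))


# Shortened, hypothetical template with a deliberately misspelled placeholder.
demo_template = "{role_name}'s speaking style is {stryle}."
profile = {"role_name": "Hamlet", "style": "indecisive, decisive, sentimental"}

problems = missing_fields(demo_template, profile)
if problems:
    print("Template placeholders without a value:", problems)
else:
    print(demo_template.format(**profile))
```

Running the same check on the full system prompt template before calling format gives a readable error instead of a bare KeyError.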

Evaluation 🏆

We use objective questions to assess eight dimensions: Character, Style, Emotion, Relationship, Personality, Human-likeness, Coherence, and Role Consistency. The metric design is described in our paper, and the evaluation code is available on GitHub. The results are shown below:

| Model | Character ↑ | Style ↑ | Emotion ↓ | Relationship ↓ | Personality ↑ | Avg. ↑ | Human-likeness ↑ | Role Choice ↑ | Coherence ↑ |
|---|---|---|---|---|---|---|---|---|---|
| General Baselines (Proprietary) | | | | | | | | | |
| GPT-4o | 74.32 ± 1.15 | 81.67 ± 1.51 | 16.31 ± 0.48 | 12.13 ± 0.66 | 66.58 ± 4.41 | 78.83 ± 1.64 | 67.33 ± 3.95 | 87.33 ± 3.86 | 99.67 ± 0.33 |
| GPT-3.5-Turbo | 72.26 ± 1.27 | 73.66 ± 1.73 | 17.79 ± 0.56 | 14.17 ± 0.73 | 66.92 ± 4.85 | 76.18 ± 1.83 | 33.33 ± 4.43 | 83.00 ± 4.68 | 97.33 ± 1.17 |
| Moonshot-v1-8k | 74.06 ± 1.19 | 80.64 ± 1.51 | 16.17 ± 0.47 | 13.42 ± 0.70 | 67.00 ± 4.87 | 78.42 ± 1.75 | 44.00 ± 4.33 | 86.67 ± 3.75 | 99.33 ± 0.46 |
| Yi-Large-Turbo | 75.13 ± 1.22 | 79.18 ± 1.58 | 16.44 ± 0.49 | 13.48 ± 0.67 | 68.25 ± 4.61 | 78.53 ± 1.72 | 47.00 ± 4.60 | 84.33 ± 3.67 | 92.67 ± 2.39 |
| Deepseek-Chat | 75.46 ± 1.14 | 81.49 ± 1.51 | 15.92 ± 0.46 | 12.42 ± 0.63 | 67.92 ± 4.57 | 79.30 ± 1.66 | 52.33 ± 4.95 | 83.00 ± 4.68 | 96.67 ± 1.00 |
| Baichuan4 | 71.82 ± 1.25 | 76.92 ± 1.52 | 17.57 ± 0.52 | 12.30 ± 0.62 | 67.08 ± 4.75 | 77.19 ± 1.73 | 45.33 ± 4.31 | 82.33 ± 4.49 | 99.33 ± 0.46 |
| Hunyuan | 73.77 ± 1.18 | 78.75 ± 1.56 | 17.24 ± 0.48 | 13.22 ± 0.68 | 67.00 ± 4.39 | 77.81 ± 1.66 | 53.00 ± 4.29 | 84.33 ± 4.52 | 98.33 ± 0.84 |
| Role-play Expertise Baselines | | | | | | | | | |
| Index-1.9B-Character | 73.33 ± 1.32 | 76.48 ± 1.50 | 17.99 ± 0.53 | 13.58 ± 0.71 | 66.33 ± 4.57 | 76.92 ± 1.73 | 21.67 ± 3.96 | 78.67 ± 5.14 | 69.67 ± 3.85 |
| CharacterGLM-6B | 73.36 ± 1.28 | 76.08 ± 1.55 | 18.58 ± 0.55 | 14.27 ± 0.79 | 67.33 ± 4.34 | 76.79 ± 1.70 | 16.00 ± 2.38 | 81.00 ± 4.40 | 25.67 ± 3.48 |
| Baichuan-NPC-Turbo | 75.19 ± 1.23 | 79.15 ± 1.38 | 17.24 ± 0.51 | 13.10 ± 0.69 | 65.33 ± 4.84 | 77.87 ± 1.73 | 56.00 ± 4.66 | 86.33 ± 4.90 | 99.00 ± 0.56 |
| General Baselines (Open-source) | | | | | | | | | |
| Yi-1.5-9B-Chat | 75.31 ± 1.20 | 76.78 ± 1.49 | 16.67 ± 0.52 | 12.75 ± 0.66 | 67.42 ± 4.63 | 78.02 ± 1.70 | 38.67 ± 4.39 | 84.00 ± 4.61 | 92.67 ± 1.79 |
| GLM-4-9b-chat | 74.26 ± 1.19 | 78.40 ± 1.55 | 17.18 ± 0.50 | 14.48 ± 0.74 | 67.17 ± 4.93 | 77.63 ± 1.78 | 47.67 ± 4.25 | 83.33 ± 4.51 | 99.33 ± 0.46 |
| Qwen2-7B-Instruct | 75.39 ± 1.13 | 77.68 ± 1.65 | 17.64 ± 0.56 | 13.43 ± 0.7 | 67.75 ± 4.44 | 77.95 ± 1.70 | 48.00 ± 4.66 | 83.33 ± 4.48 | 99.00 ± 0.56 |
| Mistral-Nemo-Instruct-2407 | 74.12 ± 1.17 | 77.04 ± 1.48 | 17.00 ± 0.43 | 13.50 ± 0.67 | 67.00 ± 4.30 | 77.53 ± 1.61 | 53.67 ± 4.66 | 82.67 ± 4.77 | 74.33 ± 3.77 |
| Mistral-Nemo-BD-RP | 74.58 ± 1.28 | 78.47 ± 1.45 | 16.62 ± 0.48 | 11.38 ± 0.67* | 69.08 ± 4.46 | 78.83 ± 1.67 | 59.00 ± 4.46 | 87.00 ± 4.73 | 92.67 ± 1.59 |
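
The Avg. column appears to be the mean of the first five dimensions, with the two lower-is-better scores (Emotion and Relationship) inverted as 100 minus the score. This reading is inferred from the reported numbers, not stated in this card, so treat the sketch below as an assumption and see the paper for the authoritative definition. A few rows differ in the last digit, presumably because the published scores are rounded.

```python
def role_play_average(character, style, emotion, relationship, personality):
    """Mean of five dimensions; Emotion and Relationship are errors, so use 100 - x."""
    scores = [character, style, 100 - emotion, 100 - relationship, personality]
    return sum(scores) / len(scores)


# Mean scores copied from the GPT-4o and Mistral-Nemo-BD-RP rows of the table.
gpt4o_avg = role_play_average(74.32, 81.67, 16.31, 12.13, 66.58)
bd_rp_avg = role_play_average(74.58, 78.47, 16.62, 11.38, 69.08)

print(round(gpt4o_avg, 2), round(bd_rp_avg, 2))  # both match the reported 78.83
```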

Citation 📖

Please cite our work if you find the resources in this repository useful:

@article{yu2024beyond,
  title   = {BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model},
  author  = {Yu, Yeyong and Yu, Runsheng and Wei, Haojie and Zhang, Zhanqiu and Qian, Quan},
  year    = {2024},
  journal = {arXiv preprint arXiv:2408.10903},
}

Acknowledgements 🥰

We would like to express our sincere gratitude to Tencent LightSpeed Studios for their invaluable support in this project. Their contributions and encouragement have been instrumental in the successful completion of our work.
