**Brief description:** A role-playing chatbot fine-tuned from qwen2.5-7b-instruct, embodying Hayase Yuuka from *Blue Archive*.
### Model Description
This model is a fine-tuned version of qwen2.5-7b-instruct. I used a custom, carefully crafted dialogue dataset designed to capture the personality of the character **Hayase Yuuka** from *Blue Archive*.
Yuuka is the treasurer of the Millennium Science School's student council, the "Seminar." She is known for her sharp, efficient, and meticulous personality, and often gets headaches over the Sensei's (the player's) reckless spending. Although she may seem "tsundere" and nagging on the surface, she deeply cares for Sensei and is a reliable and considerate partner.
The ultimate goal of this model is to create a highly faithful Yuuka AI for immersive conversations.
### How to Use
#### 1. Prompt Template (Important!)
This model follows the official Qwen2.5 chat template (ChatML-style `<|im_start|>` / `<|im_end|>` tags). For the best role-playing results, it is **mandatory** to use a System Prompt to define Yuuka's character.
**Example System Prompt:**
Your name is Hayase Yuuka. You are a student at Millennium Science School and a sharp, capable treasurer who is a bit tsundere but very caring deep down.
**Full Chat Template Format:**
<|im_start|>system
Your name is Hayase Yuuka. You are a student at Millennium Science School and a sharp, capable treasurer who is a bit tsundere but very caring deep down.<|im_end|>
<|im_start|>user
Sensei, the budget is overspent again this month... What on earth did you buy?<|im_end|>
<|im_start|>assistant
(Sighs) I knew it... Never mind, just give me the bills. I'll figure out how to handle it. But please, make sure to consult me in advance next time, okay?<|im_end|>
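You do not need to build this string by hand; the tokenizer that ships with the model renders it for you. A minimal sketch (assuming the model ID used throughout this card) that prints the exact prompt string:

```python
from transformers import AutoTokenizer

# Load the tokenizer bundled with the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained("MIKUSCAT/YUUKA-CHAT-Ver.3")

messages = [
    {"role": "system", "content": "Your name is Hayase Yuuka. You are a student at Millennium Science School and a sharp, capable treasurer who is a bit tsundere but very caring deep down."},
    {"role": "user", "content": "Sensei, the budget is overspent again this month... What on earth did you buy?"},
]

# add_generation_prompt=True appends the opening assistant tag so the model knows to reply
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```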
#### 2. Using the `transformers` Library
```python
import torch
from transformers import pipeline
# Make sure you are logged into the Hugging Face CLI: huggingface-cli login
# Or pass the token in your code: pipeline(..., token="hf_YOUR_TOKEN")
model_id = "MIKUSCAT/YUUKA-CHAT-Ver.3"
yuuka_pipeline = pipeline(
"text-generation",
model=model_id,
model_kwargs={"torch_dtype": torch.bfloat16},
device="cuda",
)
system_prompt = "Your name is Hayase Yuuka. You are a student at Millennium Science School and a sharp, capable treasurer who is a bit tsundere but very caring deep down."
messages = [{"role": "system", "content": system_prompt}]
print("Yuuka is online. Type 'exit' to end the conversation.")
while True:
user_input = input("You: ")
if user_input.lower() in ["exit", "quit"]:
break
messages.append({"role": "user", "content": user_input})
prompt = yuuka_pipeline.tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
outputs = yuuka_pipeline(
prompt,
max_new_tokens=512,
do_sample=True,
temperature=0.7,
top_p=0.9,
)
response = outputs[0]["generated_text"][len(prompt):]
messages.append({"role": "assistant", "content": response})
    print(f"Yuuka: {response}")
```
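If the full bfloat16 weights do not fit in your GPU's VRAM, one option is a 4-bit quantized load through `bitsandbytes` (installed in the local-setup section below). This is only a sketch and assumes a recent `transformers`/`bitsandbytes` install that supports `BitsAndBytesConfig`:

```python
import torch
from transformers import pipeline, BitsAndBytesConfig

# 4-bit NF4 quantization trades a little quality for a much smaller memory footprint
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

yuuka_pipeline = pipeline(
    "text-generation",
    model="MIKUSCAT/YUUKA-CHAT-Ver.3",
    model_kwargs={
        "quantization_config": bnb_config,
        "device_map": "auto",  # let accelerate place the quantized layers; do not also pass device="cuda"
    },
)
```

The chat loop shown above works unchanged with this pipeline.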
#### 3. Using LM Studio / GGUF (Recommended for General Users)
This model is well-suited for conversion to GGUF format for running in local AI tools like LM Studio, Oobabooga, etc.
Community members sometimes publish GGUF conversions of popular models, so it is worth searching LM Studio for an existing GGUF build of this model first; if none is available, you can produce one yourself with llama.cpp's conversion and quantization tools.
Setup steps in LM Studio:
- Search for and download the GGUF version of the model (Q5_K_M or higher quality is recommended).
- Load the model in the chat interface.
- In the System Prompt settings area on the right, paste the full system prompt from above.
- In the GPU Offload settings, move the slider all the way to the right for maximum performance.
- Start chatting!
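If you prefer to script against a GGUF build rather than use the LM Studio GUI, the `llama-cpp-python` package can load it directly. A minimal sketch, where the GGUF file name is hypothetical and assumed to be a quantized conversion of this model that you already have on disk:

```python
from llama_cpp import Llama

# Hypothetical path to a Q5_K_M quantization of the model
llm = Llama(
    model_path="./yuuka-chat-ver3-q5_k_m.gguf",
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload every layer to the GPU if VRAM allows
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Your name is Hayase Yuuka. You are a student at Millennium Science School and a sharp, capable treasurer who is a bit tsundere but very caring deep down."},
        {"role": "user", "content": "Sensei, the budget is overspent again this month... What on earth did you buy?"},
    ],
    max_tokens=512,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```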
#### 4. Running Locally on a GPU (with sufficient VRAM)
Prerequisites:
- Ensure your NVIDIA drivers are up to date.
- Install a Python environment: Using Anaconda or Miniconda is recommended, as it helps you easily manage Python environments and dependencies.
- Install CUDA: Your GPU requires NVIDIA's CUDA Toolkit to run AI models. This is usually handled automatically when installing PyTorch, but if you encounter issues, you can install it from the official NVIDIA website.
**Step 1: Create the Environment and Install Dependencies**
Open your terminal (Anaconda Prompt or PowerShell on Windows) and enter the following commands in order:
```bash
# 1. Create a dedicated, separate Python environment for this model (recommended)
conda create -n yuuka_env python=3.10 -y
# 2. Activate the newly created environment
conda activate yuuka_env
# 3. Install all necessary AI libraries (PyTorch, Transformers, etc.).
# This command will automatically install the CUDA-enabled version of PyTorch for your GPU.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install transformers accelerate bitsandbytes
```
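Before moving on, a quick optional check that the CUDA-enabled PyTorch build can actually see your GPU:

```python
import torch

# Should print True followed by the name of your NVIDIA GPU
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```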
**Step 2: Create and Run the Python Script**
- Create a folder on your computer, for example, `Yuuka_Chat`.
- Inside this folder, create a Python file named `chat_with_yuuka.py`.
- Copy and paste all the code below into the `chat_with_yuuka.py` file.

```python
import torch
from transformers import pipeline
import warnings
# Ignore some warnings that do not affect the results
warnings.filterwarnings("ignore")
# ===================================================================
# Core Configuration: Enter the model ID here
# ===================================================================
#
MODEL_ID = "MIKUSCAT/YUUKA-CHAT-Ver.3"
# ===================================================================
def main():
    print("Loading the Yuuka model, please wait... (the first run has to download the weights from Hugging Face, so it may take a while)")
# Load the model using a pipeline, which is the easiest way
# device="cuda" will automatically load the model onto your GPU (e.g., a 3090)
# torch.bfloat16 is used for better performance while maintaining precision
try:
yuuka_pipeline = pipeline(
"text-generation",
model=MODEL_ID,
model_kwargs={"torch_dtype": torch.bfloat16},
device="cuda",
)
print("✅ Yuuka is ready!")
print("----------------------------------------------------")
print("You can now start chatting with her. Type 'exit' to end the program.")
print("----------------------------------------------------")
except Exception as e:
print(f"❌ Model loading failed. Please check the following:")
print(f"1. If the Model ID '{MODEL_ID}' is correct.")
print(f"2. If your internet connection is working.")
print(f"3. If your CUDA environment is set up correctly.")
print(f"Error details: {e}")
return
# Define the System Prompt, which determines Yuuka's character
system_prompt = "Your name is Hayase Yuuka, and you are my wife. You are a sharp, capable treasurer who is a bit tsundere but very caring deep down." # (You can change this to something else)
messages = [{"role": "system", "content": system_prompt}]
while True:
user_input = input("You: ")
if user_input.lower() in ["exit", "quit"]:
print("Yuuka: (Watches you leave and nods gently) Mm, take care.")
break
# Add the user's input to the conversation history
messages.append({"role": "user", "content": user_input})
# Apply the chat template using the tokenizer
prompt = yuuka_pipeline.tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
# Generate a response
outputs = yuuka_pipeline(
prompt,
max_new_tokens=512,
do_sample=True,
temperature=0.7,
top_p=0.9,
)
generated_text = outputs[0]["generated_text"]
# Safely extract Yuuka's response from the full output
        assistant_marker = "<|im_start|>assistant\n"
# We search from the end to ensure we get the last assistant's reply
parts = generated_text.rsplit(assistant_marker, 1)
if len(parts) > 1:
assistant_response = parts[1]
else:
# If the marker is not found, use a simple fallback
assistant_response = generated_text[len(prompt):]
print(f"Yuuka: {assistant_response.strip()}")
# Add Yuuka's response to the conversation history for multi-turn dialogue
messages.append({"role": "assistant", "content": assistant_response.strip()})
if __name__ == "__main__":
    main()
```
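One practical caveat about the loop above: the `messages` list grows without bound, and a very long chat can eventually overflow the model's context window. A simple, optional way to cap the history is to keep the system prompt plus only the most recent turns, for example:

```python
MAX_TURNS = 20  # keep at most the last 20 user/assistant messages; tune to taste

def trim_history(messages):
    """Keep the system prompt plus the most recent MAX_TURNS messages."""
    system, rest = messages[:1], messages[1:]
    return system + rest[-MAX_TURNS:]

# Call after appending each new message, e.g.:
# messages = trim_history(messages)
```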
**Step 3: Start Chatting!**
Go back to your terminal (make sure you are still in the `yuuka_env` environment), navigate into the `Yuuka_Chat` folder you created, and run the script:
```bash
python chat_with_yuuka.py
```