🎭 DeepSeek-R1 Roleplay LoRA

A specialized roleplay model fine-tuned for creative storytelling and character interactions

This model is a LoRA (Low-Rank Adaptation) fine-tune of unsloth/DeepSeek-R1-0528-Qwen3-8B, optimized for roleplay scenarios and creative character interactions. It was trained with GRPO (Group Relative Policy Optimization) to improve response quality and maintain character consistency.

🌟 Model Details

  • Base Model: unsloth/DeepSeek-R1-0528-Qwen3-8B
  • Adapter Type: LoRA (rank 32, alpha 64)
  • Training Method: GRPO via the TRL library
  • Training Data: Gryphe-Aesir RPG dataset
  • License: Apache 2.0

🎯 Intended Use

This model is designed for:

  • Creative Writing: Generate engaging narratives and character interactions
  • Roleplay Scenarios: Create immersive character-driven conversations
  • Storytelling: Develop rich, detailed story content
  • Character Development: Maintain consistent character personalities and behaviors

🚀 Quick Start

Loading the Model

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Chun121/RP_DeepSeek-R1-0528-Qwen3-8B",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Enable fast inference
FastLanguageModel.for_inference(model)

Basic Usage

# Prepare your roleplay prompt
messages = [
    {
        "role": "system", 
        "content": "You are a creative roleplay assistant. Engage in character-driven storytelling with rich descriptions and authentic dialogue."
    },
    {
        "role": "user", 
        "content": "You are a wise wizard in a mystical forest. A young adventurer seeks your guidance."
    }
]

# Apply chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)  # move input ids to the model's device

# Generate response
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
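
Note that DeepSeek-R1 models typically emit their chain of thought inside <think>...</think> tags before the final reply. If you only want the in-character response, a minimal post-processing step (assuming the tag pair appears in the decoded text) is:

# Keep only the text after the model's reasoning block
if "</think>" in response:
    response = response.split("</think>", 1)[-1].strip()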

Advanced Usage with vLLM

from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# vLLM expects a local adapter path, so download the repo first
adapter_path = snapshot_download("Chun121/RP_DeepSeek-R1-0528-Qwen3-8B")

# Load the base model with LoRA support enabled
llm = LLM(
    model="unsloth/DeepSeek-R1-0528-Qwen3-8B",
    enable_lora=True
)

# Configure sampling
# Configure sampling (vLLM stops at the model's EOS token by default)
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=512
)

# Generate with the LoRA adapter attached
outputs = llm.generate(
    prompts=["Your roleplay prompt here"],
    sampling_params=sampling_params,
    lora_request=LoRARequest("roleplay_lora", 1, adapter_path)
)

🔧 Training Details

Training Configuration

  • Learning Rate: 5e-6
  • Batch Size: 1 (per device)
  • Gradient Accumulation: 1 step
  • Max Steps: 100
  • Optimizer: AdamW 8-bit
  • Weight Decay: 0.01
  • Warmup Ratio: 0.1
  • Scheduler: Linear
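
The exact training script is not published; as a rough sketch, these hyperparameters map onto TRL's GRPOConfig (whose field names come from transformers.TrainingArguments) as follows:

from trl import GRPOConfig

# Illustrative mapping of the settings above; not the original script
training_args = GRPOConfig(
    output_dir="outputs",
    learning_rate=5e-6,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    max_steps=100,
    optim="adamw_8bit",
    weight_decay=0.01,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)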

LoRA Configuration

  • Rank (r): 32
  • Alpha: 64 (rank × 2)
  • Target Modules:
    • q_proj, k_proj, v_proj, o_proj
    • gate_proj, up_proj, down_proj
  • Dropout: 0.0
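
For orientation, this configuration corresponds to attaching adapters with Unsloth's PEFT helper roughly like so (a sketch, not the original training code):

# Attach LoRA adapters matching the configuration above
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=64,  # rank × 2
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
)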

Reward Functions

The model was trained with multiple reward functions to ensure high-quality outputs:

  1. Format Matching: Ensures proper response structure
  2. Language Consistency: Maintains English language output
  3. Answer Quality: Rewards contextually appropriate responses
  4. Character Consistency: Maintains character behavior throughout interactions
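
The reward implementations themselves are not published. For illustration only, TRL's GRPOTrainer accepts plain Python callables that return one score per sampled completion; a toy language-consistency reward might look like:

# Hypothetical reward function in the shape GRPOTrainer expects;
# the actual trained rewards are not published
def language_consistency_reward(completions, **kwargs):
    scores = []
    for completion in completions:
        # Conversational datasets pass completions as lists of message dicts
        text = completion[0]["content"] if isinstance(completion, list) else completion
        ascii_ratio = sum(ch.isascii() for ch in text) / max(len(text), 1)
        scores.append(1.0 if ascii_ratio > 0.95 else 0.0)
    return scores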

📊 Performance

The model demonstrates significant improvements in:

  • English Language Consistency: Stays in English more reliably than the base model
  • Character Authenticity: Better adherence to character descriptions and personalities
  • Response Quality: More engaging and contextually appropriate roleplay responses
  • Format Compliance: Consistent use of proper formatting and structure

🎪 Example Outputs

Input: "You are a mysterious vampire lord in your ancient castle. A brave knight has come to challenge you."

Output:

*I emerge from the shadows of my throne room, my crimson eyes gleaming with ancient wisdom and predatory grace* 

"Ah, another mortal seeks to test their mettle against the lord of this domain..." *My voice echoes through the stone halls as I descend the steps slowly, each footfall deliberate and commanding* "Tell me, brave knight, what drives you to seek audience with one who has witnessed the rise and fall of kingdoms?"

*I pause before you, close enough that you can feel the supernatural chill that emanates from my presence* "Do you come seeking glory? Vengeance? Or perhaps... something far more dangerous?"

⚠️ Important Notes

  • This model is designed for creative and fictional content generation
  • The model has been fine-tuned for roleplay scenarios and may generate mature themes
  • Always use appropriate content warnings when sharing outputs
  • The model works best with clear character descriptions and scenario setup
  • For optimal results, provide context about the setting and character relationships
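
For example, a scenario setup along these lines (the character and setting here are purely illustrative) gives the model the context it needs:

# Hypothetical system prompt; richer setup tends to yield more consistent characters
system_prompt = (
    "You are Kaelen, a weary mercenary captain. Setting: the besieged city of "
    "Vharos at dusk. The user plays a newly hired recruit. Stay in character, "
    "narrate actions between asterisks, and keep dialogue in quotes."
)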

🛠️ Technical Specifications

  • Architecture: Qwen3-based transformer with LoRA adapters
  • Precision: 4-bit quantization support
  • Memory Requirements: ~6GB VRAM (4-bit) / ~12GB VRAM (16-bit)
  • Inference Speed: Optimized for fast generation with vLLM support
  • Compatibility: Works with the Transformers, Unsloth, and vLLM frameworks
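
Because the adapter is a standard PEFT checkpoint, it should also load without Unsloth; a minimal sketch using the peft library (assuming the repo ships a standard adapter_config.json):

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the base model and attach the LoRA adapter in one call
model = AutoPeftModelForCausalLM.from_pretrained(
    "Chun121/RP_DeepSeek-R1-0528-Qwen3-8B",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Chun121/RP_DeepSeek-R1-0528-Qwen3-8B")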

🤝 Contributing

This model is part of ongoing research into improving roleplay AI capabilities. Feedback and suggestions for improvements are welcome!

📄 License

This model is released under the Apache 2.0 license, following the base model's licensing terms.

🙏 Acknowledgments

  • Base Model: Thanks to the DeepSeek team for the excellent base model
  • Training Framework: Built with Unsloth for efficient training
  • Dataset: Trained on the high-quality Gryphe-Aesir RPG dataset
  • Optimization: GRPO training implemented via TRL library

For more details about the training process and additional examples, please refer to the model card or reach out through the community discussions.
