ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents
Abstract
ChARM, a character-based act-adaptive reward model, improves preference learning for role-playing language agents through an act-adaptive margin and self-evolution on unlabeled data, achieving superior results on dedicated benchmarks.
Role-Playing Language Agents (RPLAs) aim to simulate characters for realistic and engaging human-computer interactions. However, traditional reward models often struggle with scalability and with adapting to subjective conversational preferences. We propose ChARM, a Character-based Act-adaptive Reward Model, which addresses these challenges through two innovations: (1) an act-adaptive margin that significantly enhances learning efficiency and generalizability, and (2) a self-evolution mechanism that leverages large-scale unlabeled data to improve training coverage. Additionally, we introduce RoleplayPref, the first large-scale preference dataset specifically for RPLAs, featuring 1,108 characters, 13 subcategories, and 16,888 bilingual dialogues, alongside RoleplayEval, a dedicated evaluation benchmark. Experimental results show a 13% improvement over the conventional Bradley-Terry model in preference ranking. Furthermore, applying ChARM-generated rewards to preference learning techniques (e.g., direct preference optimization) achieves state-of-the-art results on CharacterEval and RoleplayEval. Code and dataset are available at https://github.com/calubkk/ChARM.
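The abstract does not spell out how the act-adaptive margin enters the training objective, but one common way to realize such a margin is to widen the required score gap in a Bradley-Terry pairwise loss on a per-act basis. The sketch below illustrates that idea in PyTorch; the function name `act_adaptive_bt_loss`, the `act_weights` lookup, and the toy numbers are illustrative assumptions, not ChARM's actual formulation.

```python
# Minimal sketch: Bradley-Terry pairwise loss with a per-act margin.
# The act_weights table mapping act categories to margin scales is a
# hypothetical stand-in, used purely for illustration.
import torch
import torch.nn.functional as F

def act_adaptive_bt_loss(
    chosen_rewards: torch.Tensor,    # (batch,) scores for preferred responses
    rejected_rewards: torch.Tensor,  # (batch,) scores for dispreferred responses
    act_ids: torch.Tensor,           # (batch,) integer act-category id per pair
    act_weights: torch.Tensor,       # (num_acts,) per-act margin scale (assumed)
    base_margin: float = 1.0,
) -> torch.Tensor:
    # Scale the base margin by the act category: more subjective or harder
    # acts can demand a wider gap between chosen and rejected scores.
    margins = base_margin * act_weights[act_ids]
    # Standard Bradley-Terry negative log-likelihood, shifted by the margin.
    logits = chosen_rewards - rejected_rewards - margins
    return -F.logsigmoid(logits).mean()

# Toy usage: 3 preference pairs over 2 act categories.
chosen = torch.tensor([2.1, 0.5, 1.3])
rejected = torch.tensor([1.0, 0.7, 0.2])
acts = torch.tensor([0, 1, 0])
weights = torch.tensor([0.5, 1.5])  # act 1 gets a larger margin
print(act_adaptive_bt_loss(chosen, rejected, acts, weights))
```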
Community
- We propose ChARM, a novel reward-modeling framework that provides accurate rewards for enhancing the role-playing abilities of RPLAs: it dynamically adjusts optimization strength through an act-adaptive margin and leverages self-evolution to expand the training data.
- We train a ChARM-based reward model on Qwen2.5-7B, which outperforms the traditional Bradley-Terry model by 13% in preference ranking. When combined with DPO (see the sketch after this list), it achieves state-of-the-art performance on both CharacterEval and our newly developed role-playing benchmark, RoleplayEval.
- We create RoleplayPref, the first role-playing preference dataset, with 1,108 characters across 13 subcategories and 16,888 bilingual dialogues. Additionally, we design a new evaluation benchmark, RoleplayEval, to advance research in this area.
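As a rough illustration of how reward-model scores can drive DPO-style preference learning (the paper's actual pipeline is in the linked repository), the sketch below keeps the best- and worst-scored candidate replies as a preference pair and applies the standard DPO loss. The helper names `build_preference_pair` and `dpo_loss`, and all toy scores and log-probabilities, are assumptions for illustration only.

```python
# Minimal sketch of how ChARM-style reward scores could feed DPO training pairs.
import torch
import torch.nn.functional as F

def build_preference_pair(candidates: list[str], scores: torch.Tensor) -> tuple[str, str]:
    """Pick the highest- and lowest-scored candidate as (chosen, rejected)."""
    chosen = candidates[int(scores.argmax())]
    rejected = candidates[int(scores.argmin())]
    return chosen, rejected

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(chosen | prompt)
    policy_rejected_logps: torch.Tensor,  # log p_theta(rejected | prompt)
    ref_chosen_logps: torch.Tensor,       # log p_ref(chosen | prompt)
    ref_rejected_logps: torch.Tensor,     # log p_ref(rejected | prompt)
    beta: float = 0.1,
) -> torch.Tensor:
    # Standard DPO objective: prefer the response the reward model ranked higher,
    # regularized implicitly by the frozen reference policy.
    logits = beta * (
        (policy_chosen_logps - ref_chosen_logps)
        - (policy_rejected_logps - ref_rejected_logps)
    )
    return -F.logsigmoid(logits).mean()

# Toy usage with made-up reward scores and log-probabilities.
candidates = ["In-character reply A", "Generic reply B", "Off-topic reply C"]
scores = torch.tensor([1.8, 0.4, -0.9])  # pretend reward-model scores
chosen, rejected = build_preference_pair(candidates, scores)
print(chosen, "|", rejected)
print(dpo_loss(
    torch.tensor([-12.0]), torch.tensor([-15.0]),
    torch.tensor([-13.0]), torch.tensor([-14.5]),
))
```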
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment (2025)
- PsyMem: Fine-grained psychological alignment and Explicit Memory Control for Advanced Role-Playing LLMs (2025)
- OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction (2025)
- COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values (2025)
- Generative RLHF-V: Learning Principles from Multi-modal Human Preference (2025)
- Energy-Based Reward Models for Robust Language Model Alignment (2025)
- DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors (2025)