Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
The advancement of Large Language Models (LLMs) has spurred significant interest in Role-Playing Agents (RPAs) for applications such as emotional companionship and virtual interaction. However, recent RPAs are often built on explicit dialogue data and lack deep, human-like internal thought processes, so their responses express knowledge and style only superficially. While Large Reasoning Models (LRMs) can be employed to simulate character thought, their direct application is hindered by attention diversion (i.e., RPAs forget their role) and style drift (i.e., overly formal and rigid reasoning rather than character-consistent reasoning). To address these challenges, this paper introduces a novel Role-Aware Reasoning (RAR) method, which consists of two stages: Role Identity Activation (RIA) and Reasoning Style Optimization (RSO). RIA explicitly guides the model with character profiles during reasoning to counteract attention diversion, and RSO then aligns the reasoning style with the character and scene via LRM distillation to mitigate style drift. Extensive experiments demonstrate that the proposed RAR significantly enhances the performance of RPAs by effectively addressing attention diversion and style drift.
With the advancement of Large Language Models (LLMs), Role-Playing Agents (RPAs) [1] have garnered significant attention for applications such as emotional companionship [2] and virtual interaction [3]. Many RPAs attempt to explicitly integrate rich role-playing dialogue datasets [4], diverse interaction processes [5], and the inherent generalization capabilities of LLMs [6], yielding promising results.
Despite their success, existing methods often focus merely on superficial knowledge and style [7] expression in responses, with models lacking deep, human-like internal thought processes. Large Reasoning Models (LRMs), such as the GPT-o series [8] or Deepseek-R1 [9], can be utilized to generate structured reasoning traces that simulate a character’s thought process, thereby addressing this gap. However, Feng et al. [10] observed that reasoning methods do not effectively improve the performance of RPAs under certain circumstances. As illustrated in Figure 1, the primary reasons for this degradation are attention diversion and style drift. Firstly, existing LRMs tend to forget their designated role, concentrating instead on the task or problem-solving. This diminishes their focus on the role-playing task, leading to attention diversion. Secondly, they prioritize the generation of structured, logical, and formal reasoning processes, rather than the vivid and consistent self-perceptive style required for role-playing, resulting in style drift. These two challenges lead to rigid thinking, as illustrated in the upper right corner of Figure 1, and the model consequently generates responses inconsistent with the role.
To address these challenges, this paper introduces a novel Role-Aware Reasoning (RAR) method, designed to imbue LLMs with the capacity for deep thinking that aligns with character settings. Firstly, Role Identity Activation (RIA) converts the character’s core features (such as personality, background, and manner of speech) into explicit, rule-like prompts that guide the model to think in a manner consistent with the character. Secondly, Reasoning Style Optimization (RSO) utilizes specific system prompts to guide the LRM to generate reasoning traces that either align with (i.e., positive examples) or deviate from (i.e., negative examples) the requirements of specific scenarios. Subsequently, through contrastive learning, RSO enables the model to adjust the expression style of its internal thoughts based on the current dialogue context. Ultimately, the model can adhere to the various settings stipulated in the role-playing requirements and dynamically switch between rigorous logic and vivid portrayal, thereby alleviating attention diversion and style drift.
This section elaborates on our proposed Role-Aware Reasoning (RAR). An overview of RAR is presented in Figure 1. LRMs tend to be overly rational and formal in their reasoning processes, lacking thought processes akin to specific characters. In contrast, RAR establishes role-aware requirements for LLMs through the distillation of reasoning traces. By enhancing role awareness through Role Identity Activation (RIA) and then introducing Reasoning Style Optimization (RSO), the model learns to master reasoning styles appropriate for different scenarios. Ultimately, it can produce internal thought processes that are both profound and consistent with the character settings in role-playing tasks.
RIA compels the model to always adopt the "character’s" perspective during its thought process. It internalizes character settings as reasoning constraints, ensuring high consistency between the reasoning process and the character’s identity, thereby effectively preventing situations where the model neglects role-playing requirements due to focusing solely on generating the current response.
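The paper does not give an implementation of RIA, but the idea of internalizing character settings as reasoning constraints can be sketched as follows. This is a minimal, hypothetical illustration: the profile fields (`name`, `personality`, `background`, `speech_style`) and the template wording are assumptions, not taken from the paper.

```python
# Hypothetical sketch of Role Identity Activation (RIA): character profile
# fields are rewritten as explicit, rule-like constraints and prepended to
# the system prompt, so the model's internal reasoning stays anchored to
# the character's perspective. All field names and phrasing are illustrative.

def build_ria_prompt(profile: dict) -> str:
    """Turn a character profile into rule-like reasoning constraints."""
    rules = [
        f"You are {profile['name']}. Never step out of this role while thinking.",
        f"Personality: {profile['personality']}. Let it shape every thought.",
        f"Background: {profile['background']}. Recall only what this character could know.",
        f"Speech style: {profile['speech_style']}. Think in this voice, not a neutral one.",
    ]
    header = "During your internal reasoning, obey these rules:\n"
    return header + "\n".join(f"{i + 1}. {rule}" for i, rule in enumerate(rules))

profile = {
    "name": "Sherlock Holmes",
    "personality": "analytical, aloof, restlessly curious",
    "background": "a consulting detective in Victorian London",
    "speech_style": "precise, clipped, quietly condescending",
}
print(build_ria_prompt(profile))
```

The resulting string would be supplied as (part of) the system prompt before reasoning begins, which is what keeps the thought process from drifting toward generic problem-solving.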
Merely maintaining role identity is insufficient to fully simulate the contextual adaptability of a character’s thought process. The reasoning processes of existing LRMs tend to be structured, logical, and formal. This presents a significant "style drift" from the vivid, emotional, or uniquely styled thinking that is required for role-playing. A character might require rigorous logical thinking when conducting serious analysis, but a more emotional and imaginative internal monologue when expressing inner feelings or reminiscing about past events. To address this issue and enable the reasoning style to dynamically match character settings and the current dialogue scene, we introduce Reasoning Style Optimization (RSO). RSO endows the model with the ability to adjust the expressive form of its internal thoughts according to the context.
Existing research indicates that system prompts play a crucial role in the process of model response generation and can significantly influence the style of model replies [45]. We apply this principle to the generation of thought traces, controlling system prompts to alter the thinking style. First, we define two types of typical role-playing scenarios: logical analysis scenarios X_Logic and vivid interaction scenarios X_Story. Concurrently, we establish two reasoning styles, each represented by a distinct system prompt: one focusing on facts (C_Fact) and the other on character knowledge (C_Know). These two reasoning prompts and two scenario types are combined pairwise, and each system prompt is applied to each scenario, prompting the LRM to generate positive and negative examples.
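The pairwise construction above can be sketched as a small data-building routine. This is a hedged illustration, not the paper's code: the style-to-scenario matching rule, the dictionary keys, and the `generate()` stand-in (which in practice would call an LRM) are all assumptions.

```python
# Hypothetical sketch of RSO data construction: two scenario types
# (X_Logic, X_Story) are crossed with two reasoning-style system prompts
# (C_Fact, C_Know). The style that matches the scenario yields a positive
# trace, the mismatched style a negative trace, producing
# (prompt, chosen, rejected) triples for contrastive training.

SYSTEM_PROMPTS = {
    "fact": "Reason with rigorous, fact-focused logical analysis.",    # C_Fact
    "know": "Reason vividly through the character's own knowledge.",   # C_Know
}
# Assumed matching rule: X_Logic pairs with C_Fact, X_Story with C_Know.
MATCH = {"logic": "fact", "story": "know"}

def generate(system_prompt: str, query: str) -> str:
    """Stand-in for an LRM call that returns a reasoning trace."""
    return f"[trace under '{system_prompt}'] {query}"

def build_preference_pairs(scenarios):
    """scenarios: list of (scene_type, query); returns contrastive triples."""
    pairs = []
    for scene_type, query in scenarios:
        pos_style = MATCH[scene_type]
        neg_style = "know" if pos_style == "fact" else "fact"
        pairs.append({
            "prompt": query,
            "chosen": generate(SYSTEM_PROMPTS[pos_style], query),
            "rejected": generate(SYSTEM_PROMPTS[neg_style], query),
        })
    return pairs

data = build_preference_pairs([
    ("logic", "Deduce who took the letter."),
    ("story", "Tell me about your childhood."),
])
```

Triples in this chosen/rejected form are the standard input format for preference-based contrastive optimization, which is how RSO teaches the model to pick the reasoning style appropriate to the current scene.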
Dataset The training dataset used in our experiments, RoleBench-Train [46], is derived from RoleBench. RoleBench was built by carefully selecting and processing scripts from 940 films and TV shows to create detailed profiles for 95 English-speaking characters, capturing their diverse personality traits. Based on these profiles, a total of 168,093 role-playing samples were generated, with 137,920 used for training. The quality of the data was evaluated by expert annotators along three dimensions, and results showed that the majority of the samples were of high quality.
Benchmarks
To thoroughly evaluate the method proposed in this study, we used two publicly available benchmarks for role-playing abilities, each targeting distinct aspects of agent performance:
• SocialBench [7] evaluates an agent’s social intelligence through multiple-choice tasks across both individual and group interactions. It includes 500 character profiles, over 6,000 questions, and more than 30,800 multi-turn dialogues sourced from books, movies, and online platforms. The evaluation covers key social dimensions, such as role knowledge (Know.), role style (Sty.), emotion detection (ED), situational understanding (SU), humor and sarcasm detection (HSD), long-term memory (MEM), and social preferences in group dynamics (Neu., Pos., Neg.).
• CharacterBench [47] contains 22,859 human-annotated samples and is designed to assess a model’s ability to construct and maintain consistent, expressive character personas. It spans 3,956 characters across four categories and 25 subcategories, and measures 11 core dimensions: memory consistency (MC), fact accuracy (FA), boundary consistency (BCK), attribute consistency (AC_b for bot and AC_h for human), behavior consistency (BC_b for bot and BC_h for human), emotion self-regulation (ES), empathetic responsiveness (ER), morality stability (MS), morality robustness (MR), human likeness (HL), and engagement (EG). CharacterBench also incorporates the CharacterJudge model for scalable and automated scoring. The bot-side evaluations (denoted with b) are conducted by automatically generating queries and responses via large language models, while the human-side evaluations (denoted with h) are based on manual annotation and interactions within real user scenarios.
Dialogue Generation As shown in Table 1, our proposed RAR outperforms all baseline methods across a majority of the evaluated dimensions on the CharacterBench benchmark.
Specifically, RAR demonstrates significant improvements in Persona-related metrics, including Memory Consistency, Attribute Consistency, and Behavior Consistency. These gains can be attributed to the Role Identity Activation (RIA) module, which continuously reinforces the character’s core traits, experiences, and motivations throughout the reasoning process, preventing the model from deviating from its assigned role. In terms of Knowledge (FA and BCK), RAR also shows strong performance, indicating that the role-aware reasoning helps in accurately recalling and applying character-specific knowledge while respecting established boundaries.
Furthermore, RAR excels in Believability. This can be attributed to the RSO’s ability to adapt the reasoning style, allowing the model to generate more human-like responses, thereby enhancing user engagement. Compared to the Distill baseline, which also incorporates reasoning traces, RAR’s superior performance highlights the benefits of targeted role awareness and style optimization over generic reasoning. Notably, while MoreThink attempts to enforce extended reasoning, its performance significantly degrades on several metrics, particularly persona consistency and memory, suggesting that unguided, lengthy reasoning can be detrimental. At the same time, specialized role-playing models like Neeko and CharacterGLM also fall short of RAR, indicating that RAR’s explicit modeling of internal thought processes leads to more robust and consistent character portrayal.
Social Interaction Table 2 presents the results on the SocialBench benchmark, which evaluates agents’ social intelligence. RAR again demonstrates superior performance, achieving the highest average score. Notably, RAR achieves top scores in Role Knowledge and Role Style, directly reflecting the strengths of the RIA and RSO modules, respectively. RIA ensures the model deeply understands and internalizes the character’s knowledge and background, while RSO enables it to adapt its communication style to be appropriate for the character and social context.
Secondly, RAR also shows strong performance in understanding social preferences. This suggests that the character’s standpoints and motivations, instilled by RIA, guide the model’s reasoning in complex social scenarios, leading to more appropriate and character-consistent social judgments. Furthermore, while performance on metrics like Situational Understanding and Humor and Sarcasm Detection is competitive, there is still room for advancement, indicating the inherent difficulty of these social reasoning tasks.