Proxona: Leveraging LLM-Driven Personas to Enhance Creators' Understanding of Their Audience
We present Proxona, a system for defining and extracting representative audience personas from audience comments. Creators converse with personas to gain insights into their preferences and engagement, solicit feedback, and implement evidence-based improvements to their content. Powered by large language models, Proxona analyzes audience comments, distilling the latent characteristics of audiences into tangible dimensions (classification categories) and values (category attributes). Proxona then clusters these into synthetic personas. Our technical evaluations demonstrated that our pipelines effectively generated relevant and distinct dimensions and values, enabling the deduction of audience-reflecting personas, while minimizing the likelihood of hallucinations in persona responses. Our user evaluation with 11 creators showed that Proxona helped creators gain new insights about their audience, make informed decisions, and successfully complete content creation with high confidence.
Comments or online communities, where direct communication between the creator and the audience occurs, offer creators chances to learn viewers’ sentiments and reactions [29]. However, creators often struggle to analyze large volumes of comments and extract actionable insights for their creative process. While highly upvoted comments may reflect popular opinions, viewer comments often lack the depth and diversity needed to understand the full range of audience preferences.
Proxona generates persona representations that are fictional yet embody the diverse traits and characteristics of audience segments, represented with dimensions (e.g., interests, expertise level) and values (specific values within these dimensions) derived from the comments written by the channel’s audience. Through Proxona, creators can explore their audience personas and understand them by reviewing the associated dimensions and values and by browsing profiles (e.g., experiences and motivations behind watching the videos). To better meet their needs, creators can engage in natural language conversations with the personas, asking for their opinions on the channel and its video content. Creators can also solicit actionable suggestions from the personas on specific content as guidance for early-stage content development.
The LLM-powered pipeline infers and predicts the audience’s complex characteristics from comments using the dimension-value framework, and clusters similar audiences into personas.
Based on this relevant dimension and value set, the audience groups generated with our pipeline were perceived as more homogeneous.
By interacting freely with audience-driven personas, creators collected possible audience opinions, strengthened and enriched their content, and made informed decisions throughout their creative practices.
Strategic content planning becomes essential for gaining audience exposure.
Creators presume an ‘imagined audience’ (‘the mental image about people to communicate with’ [15]), based on online or offline contextual clues [28]. This influences their selection of platforms for presentation [30] and the type of content to create. The way creators use an ‘imagined audience’ is similar to how designers develop personas of target users from behavioral data to understand users’ contexts and align with their experiences [43, 45]. In identifying problems and developing alternative solutions to improve the user experience [36], users are often involved by sharing their challenges in person [26], providing feedback on prototypes [10], or co-designing [56] to ensure the products are effectively designed.
All participating creators (I1 - I13) agreed on the necessity and importance of understanding their audience as a pivotal aspect of their creative process. Creators predominantly used YouTube Studio, supplemented by video comments, as their main tools for gauging audience demographics and engagement patterns. Although these tools offer basic demographic data and aggregated interaction metrics such as retention rate and clickstreams, creators felt they fell short of providing the depth of insight needed for a nuanced audience understanding. This made it difficult to apply these insights to content creation.
3.2.1 Difficult to Gain In-depth Insights about Their Audience.
Our interviews indicated that only a few creators go beyond surface-level analysis to deeply understand their audience. For instance, I4, specializing in car reviews, precisely defined his target demographic as ‘white-collar males in the United States, aged between 40-60, nearing retirement, predominantly white.’
insights about viewer preferences from reading comments (such as requests for more detailed techniques or specific video editing styles), accessing this level of useful feedback was not a universal experience. Often, creators felt that comments on videos tend to focus on surface-level aspects, such as emotional reactions, video topics, or editing quality, rather than offering useful feedback or expressing the viewers’ deeper motivations and needs.
collaborative process where a creator iteratively creates and refines their video storyline based on conversations with (DG 2) and feedback from audience personas (DG 3). Specifically, the system employs an LLM to generate audience personas, which are composed of dimensions and values personalized to each channel’s audience (DG 1). The system simulates potential messages from the audience personas by utilizing the creator’s channel and video data. To help creators create and refine content that better targets their audience, the audience personas provide specific feedback by evaluating the content from their perspective and suggesting actionable items, which helps creators make decisions about their content production strategies.
Similarly, in Proxona, we adapt this persona method by collecting user data from existing video comments, deriving useful insights about the audience through the dimension-value framework, and developing concrete audience personas that combine these insights.
By leveraging real viewers’ comments, it is possible to make the personas more grounded and personalized to each creator. To achieve this, we employ large language models (LLMs) to analyze comments, identify key dimensions and values, and synthesize them into comprehensive audience personas. LLMs are particularly effective in this context because they can discern nuanced patterns and extract latent characteristics from large volumes of text, which traditional methods might overlook.
The persona construction framework we offer is tailored to each channel’s unique audience in a comprehensible manner, providing a different set of audience dimensions and values for each channel. ‘Dimensions’ are broad categories of personal characteristics (e.g., hobbies, expertise levels, learning styles) that the viewers of the channel possess, and ‘values’ are specific attributes associated with each dimension (e.g., basketball, novice, experiential). These dimensions and values, initially identified from the creators’ data with the help of LLMs, are used to analyze characteristics in audience comments and construct personas (see Figure 3).
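The dimension-value framework described above can be pictured as a small data structure. The following is a minimal sketch under stated assumptions, not Proxona's actual schema: the class names `Dimension` and `Persona`, the helper `invalid_traits`, and the rule that a persona holds one value per dimension are all hypothetical, and the values "baking", "expert", and "visual" are added only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    """A broad characteristic category, e.g. 'expertise level' (hypothetical class)."""
    name: str
    values: list[str] = field(default_factory=list)


@dataclass
class Persona:
    """A persona described by one value per dimension (an assumption of this sketch)."""
    name: str
    traits: dict[str, str]  # dimension name -> chosen value


def invalid_traits(persona: Persona, dimensions: list[Dimension]) -> list[str]:
    """Return the dimension names whose value is not defined for the channel."""
    allowed = {d.name: set(d.values) for d in dimensions}
    return [dim for dim, val in persona.traits.items()
            if val not in allowed.get(dim, set())]


# Illustrative channel-specific dimensions and values.
dimensions = [
    Dimension("hobby", ["basketball", "baking"]),
    Dimension("expertise level", ["novice", "expert"]),
    Dimension("learning style", ["experiential", "visual"]),
]
persona = Persona("Sam", {"hobby": "basketball",
                          "expertise level": "novice",
                          "learning style": "experiential"})
print(invalid_traits(persona, dimensions))  # prints []
```

Keeping dimensions separate from personas means the same channel-level vocabulary can describe every persona, which mirrors the idea that dimensions and values are identified first and personas are constructed from them afterwards.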
The goal of persona generation in Proxona is not to create exact replicas of real-world audiences but to offer effective proxies that help creators better target their content. By constructing audience personas based on dimensions and values, we simplify complex audience information, making it more digestible and relatable. With personas derived from LLM-inferred information, creators can engage with these personas by asking questions about their preferences and requesting feedback on their early-stage content.
Inspired by Luminate [48], we enable creators to extend the values under specific dimensions in two ways: (a) adding them manually and (b) getting suggestions from the system (Figure 1 - E). When creators find it difficult to come up with new values, our pipeline suggests new values and recommends ones that are distinct from the current ones.
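One way to picture the "distinct from current ones" requirement is a filter over candidate values. The sketch below is an assumption-laden illustration rather than the system's method: it uses surface string similarity from Python's `difflib` as a stand-in for whatever notion of distinctness the pipeline applies, and the function name `distinct_suggestions` and the `threshold` parameter are hypothetical.

```python
from difflib import SequenceMatcher


def distinct_suggestions(candidates, existing, threshold=0.8):
    """Keep candidates that are not near-duplicates of existing or already kept values.

    Similarity here is lexical (character overlap), a simple stand-in for a
    semantic comparison; `threshold` is a hypothetical tuning parameter.
    """
    kept = []
    for cand in candidates:
        seen = list(existing) + kept
        too_close = any(
            SequenceMatcher(None, cand.lower(), other.lower()).ratio() >= threshold
            for other in seen
        )
        if not too_close:
            kept.append(cand)
    return kept


# "Novice" and "novices" are too close to the existing value "novice".
print(distinct_suggestions(["Novice", "novices", "expert"], ["novice"]))  # ['expert']
```

In practice a semantic comparison (for example, over embeddings) would likely be more appropriate, since two values can be worded differently and still mean the same thing.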
To extract comprehensive and explicit audience characteristics, our pipeline uses an LLM (GPT-4) to observe possible audience characteristics described in each video’s comment data before deriving dimensions and values. By feeding each video’s title, description, and all of its comments into GPT-4, our pipeline generates an audience observation summary for each video (Appendix F.1). Because of the large volume of data, we compress the input through summarization, while ensuring that this process focuses on capturing the essential information related to unique and inherent audience characteristics. This is why we first create an audience observation summary to distill the most relevant insights. Additionally, our pipeline generates a transcript summary of each video to aid the LLM in contextual analysis of the audience from the comments (Figure 3 - A, Appendix F.2). By combining the audience observation and transcript summaries, our pipeline extracts key dimensions and values representing possible audience characteristics for each creator’s channel (Figure 3 - B).
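The two-stage flow described above (per-video summaries first, then channel-level extraction) can be sketched as follows. This is a hedged outline only: the actual prompts are in Appendix F and are not reproduced here, so the prompt strings, the function names, the `LLM` callable interface, and the JSON output format are all assumptions. A stub `fake_llm` stands in for GPT-4 so the sketch runs without any API.

```python
import json
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out


def observe_audience(video: dict, llm: LLM) -> str:
    """Stage A: audience observation summary from title, description, comments."""
    comments = "\n".join(video["comments"])
    return llm("OBSERVE audience characteristics.\n"
               f"Title: {video['title']}\n"
               f"Description: {video['description']}\n"
               f"Comments:\n{comments}")


def summarize_transcript(video: dict, llm: LLM) -> str:
    """Stage A: transcript summary that gives the comments their context."""
    return llm(f"SUMMARIZE this transcript.\n{video['transcript']}")


def extract_dimensions(videos: list[dict], llm: LLM) -> dict[str, list[str]]:
    """Stage B: combine all per-video summaries and extract dimensions and values."""
    notes = []
    for video in videos:
        notes.append(observe_audience(video, llm))
        notes.append(summarize_transcript(video, llm))
    raw = llm("EXTRACT dimensions and values as a JSON object.\n" + "\n".join(notes))
    return json.loads(raw)  # assumes the model returns valid JSON


def fake_llm(prompt: str) -> str:
    """Stub model for demonstration; a real call to an LLM would go here."""
    if prompt.startswith("EXTRACT"):
        return json.dumps({"expertise level": ["novice", "expert"]})
    return "summary"


videos = [{"title": "Sourdough basics", "description": "A starter guide",
           "comments": ["First loaf ever, thanks!"], "transcript": "Mix flour..."}]
print(extract_dimensions(videos, fake_llm))
```

Summarizing each video before extraction keeps the final prompt small, which is the compression step the text motivates by the large volume of comment data.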
Still, opinions on persona consistency varied. Most participants said the characteristics shown as values were well represented in their chats, which helped them further understand their audience personas in specific contexts. On the other hand, some participants mentioned that they observed repeated keywords and responses in some chats with personas, highlighting the need for more ‘humanness’ and ‘caprice’, as in a real-world audience.
We aim to answer our first research question:
• RQ 1: Can Proxona effectively generate relevant, distinct, audience-reflecting personas that provide evidence-based responses?
Relevance was measured to determine whether the pipeline accurately captured audience characteristics specific to each channel, which are crucial for creating relevant personas. Mutual exclusiveness was evaluated to ensure that the dimensions and values were sufficiently distinct and diverse, allowing for the construction of distinct audience personas.
Only Channel A (Baking) exhibited overlapping dimensions (“Culinary Curiosity” vs. “Local Culinary Scene”), showing a slight ambiguity in differentiating these audience interests.
We compared our clustering method (Proxona) against a conventional clustering method (Baseline) that runs k-means clustering on comments without providing the associated value information. Under the baseline condition, the clustering was performed by inputting only the comments, without including the dim-val set that represents the audience characteristics implied by each comment. This approach focused solely on the semantic similarity of the comments themselves.
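The difference between the two conditions can be illustrated with a toy example. The sketch below is not the authors' implementation: it uses bag-of-words counts as a simple stand-in for semantic embeddings, a small deterministic k-means written from scratch, and a hypothetical `value_weight` parameter that controls how strongly the value labels influence the features. The comments and value labels are made up for illustration.

```python
from collections import Counter


def featurize(comments, use_values, value_weight=2.0):
    """Turn (text, values) pairs into vectors; optionally append value features."""
    words = sorted({w for text, _ in comments for w in text.lower().split()})
    values = sorted({v for _, vals in comments for v in vals}) if use_values else []
    vectors = []
    for text, vals in comments:
        counts = Counter(text.lower().split())
        vec = [float(counts[w]) for w in words]
        vec += [value_weight if v in vals else 0.0 for v in values]
        vectors.append(vec)
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain k-means; the first k vectors seed the centroids for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


comments = [
    ("love this recipe", ["novice"]),
    ("great video thanks", ["expert"]),
    ("love this video", ["expert"]),
    ("great recipe thanks", ["novice"]),
]
print(kmeans(featurize(comments, use_values=False), k=2))  # [0, 1, 0, 1]
print(kmeans(featurize(comments, use_values=True), k=2))   # [0, 1, 1, 0]
```

In this toy data, the comments-only condition groups comments by shared wording, while adding the value features groups them by the audience characteristic each comment implies.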