
Does better summary writing actually increase user engagement?

When AI systems generate more informative push notifications, do users engage more? This note explores whether informativeness and engagement always align in real product contexts.

Note · 2026-02-23 · sourced from Social Media

LLM-generated summaries for social network push notifications were objectively more informative and customized than existing templates. They did not improve user engagement. The explanation is structural, not quality-related: a well-summarized notification body contains sufficient information that users do not need to open the notification to understand the content. The optimization target (informativeness) directly undermines the business metric (engagement/clicks).

This is an instance of Goodhart's Law operating through content quality: when you optimize for how informative a message is, you can succeed at informativeness while failing at the behavior the informativeness was supposed to drive. The information was meant to entice users to engage; instead, it satisfied their information need at the notification level.

Two compounding factors emerged from the experiments:

Voice alienation: LLM summarization transformed first-person user voice ("I'm looking for a plumber") into third-person reportage ("neighbor asks about plumbers"). This tonal shift alienated recipients by creating distance from the original social context. The content was more polished but less relational — it sounded like a news brief about a neighbor rather than a neighbor reaching out.

Optimization gap: Without a reward model trained specifically for engagement, or fine-tuning that builds user preferences into content generation, in-context learning alone cannot shortcut established templates that have been refined iteratively over years. The control templates were the product of multiple iteration cycles; the LLM-generated alternatives were one-shot productions. Even when LLMs produce "better" content by linguistic quality metrics, they cannot automatically improve engagement metrics that require alignment with user behavioral patterns.
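
To make the metric split concrete, here is a minimal sketch in Python (hypothetical field names and placeholder counts, not data from the experiment) of how the two quantities live on different surfaces: informativeness is scored offline on the notification text, while the business metric is click-through rate computed from impression and click logs.

```python
from dataclasses import dataclass

@dataclass
class VariantLog:
    """Aggregated notification logs for one A/B variant (hypothetical schema)."""
    variant: str      # e.g. "template" or "llm_summary"
    impressions: int  # times the notification was delivered
    clicks: int       # times the user opened it

def click_through_rate(log: VariantLog) -> float:
    """Engagement metric: clicks per impression."""
    return log.clicks / log.impressions if log.impressions else 0.0

# Placeholder counts for illustration only. The note's claim is that the LLM
# variant can score higher on informativeness (judged offline, e.g. by raters)
# while its CTR does not improve, because the well-written summary already
# satisfies the reader's information need in the notification body.
logs = [
    VariantLog("template", impressions=10_000, clicks=800),
    VariantLog("llm_summary", impressions=10_000, clicks=800),
]
for log in logs:
    print(f"{log.variant}: CTR = {click_through_rate(log):.1%}")
```

Nothing in the informativeness score appears in the CTR computation; a system can move the first number arbitrarily far without moving the second.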

The broader pattern: LLM-generated content is best suited for rapid prototyping of new products; using it directly to improve metrics on mature products that have undergone years of A/B testing often fails. The same dynamic appeared in invitation emails — more informative, more personalized, but not more effective at driving sign-ups. Generic LLM-generated content cannot capture individual preferences without further training.

This connects to the alignment tax discussion: as in Does preference optimization harm conversational understanding?, optimizing for one communication quality (informativeness) erodes the behavioral outcome it was meant to serve (engagement). The mechanism differs — RLHF erodes grounding acts while informativeness optimization eliminates click-through motivation — but the pattern is the same: optimizing a proxy metric degrades the downstream target.


Source: Social Media
