Conversational AI Systems Psychology and Social Cognition

Can chatbots learn new knowledge without losing their personality?

Character chatbots struggle to absorb domain knowledge through fine-tuning because it erases their distinctive personality traits. Can model merging techniques separate and preserve persona while adding factual knowledge?

Note · 2026-04-18 · sourced from Personas Personality
How accurately can language models simulate human personalities? How do you build domain expertise into general AI models?

Character chatbots face a fundamental tension: they need domain knowledge to be useful, but sequential fine-tuning on knowledge datasets causes catastrophic forgetting of persona traits. Chamain (2024) solves this through a two-step model merging approach that exploits the architectural separation between knowledge and personality in transformer layers.

Step one: parameter-wise weight combination of task vectors (the weight deltas of instruction-tuned models relative to the shared base) and character vectors. This integrates factual knowledge without fully overwriting character behavior. Step two: layer-wise merging that takes the deeper layers from the character model, since those layers carry more persona-specific stylistic information. The method retains approximately 80% of task-specific performance while maintaining character portrayal ability.
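The two steps can be sketched with toy state dicts. This is a minimal illustration, not the paper's implementation: real models hold tensors rather than floats, and the parameter names, merge weights, and which layers count as "deep" are all assumptions here.

```python
def task_vector(tuned, base):
    # A task/character vector is the weight delta of a fine-tuned
    # model relative to the shared base model.
    return {k: tuned[k] - base[k] for k in base}

def parameter_wise_merge(base, deltas, weights):
    # Step one: add a weighted sum of the deltas back onto the base,
    # blending knowledge and character in every parameter.
    merged = dict(base)
    for delta, w in zip(deltas, weights):
        for k in merged:
            merged[k] += w * delta[k]
    return merged

def layer_wise_merge(merged, character, deep_layers):
    # Step two: overwrite the deeper layers wholesale with the
    # character model's weights, since they carry more persona style.
    out = dict(merged)
    for k in out:
        if any(k.startswith(f"layers.{i}.") for i in deep_layers):
            out[k] = character[k]
    return out

# Toy two-layer "models": one knowledge-tuned, one character-tuned.
base      = {"layers.0.w": 1.0, "layers.1.w": 1.0}
knowledge = {"layers.0.w": 1.5, "layers.1.w": 1.2}
character = {"layers.0.w": 0.8, "layers.1.w": 2.0}

deltas = [task_vector(knowledge, base), task_vector(character, base)]
step1 = parameter_wise_merge(base, deltas, weights=[0.5, 0.5])
final = layer_wise_merge(step1, character, deep_layers=[1])
```

After step two, the shallow layer holds the blended weights while the deep layer is exactly the character model's, which is the mechanism by which style survives the knowledge injection.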

This is notable because it avoids three expensive alternatives: (1) collecting character-specific training data for every domain, (2) training from scratch, and (3) multi-task learning requiring balanced datasets. Model merging treats persona and knowledge as independently trained capabilities that can be composed post hoc.

The broader implication connects to the question "Can we track and steer personality shifts during model finetuning?": personality and knowledge occupy partially separable subspaces in model parameters, and this separability can be exploited architecturally. Chamain's method works at the weight level while persona vectors work at the activation level, but both depend on the same underlying phenomenon — personality traits are localized enough to be preserved or steered independently of task knowledge.
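The weight-level vs. activation-level contrast can be made concrete. A minimal sketch of activation-level steering, assuming a persona direction `v` has been extracted elsewhere (for instance as a difference of mean hidden states with and without the persona active); the function name and list-of-floats representation are illustrative, not any library's API:

```python
def steer(hidden, persona_vec, alpha):
    # Nudge one layer's hidden state along a persona direction at
    # inference time; alpha controls the strength (and sign) of the trait.
    return [h + alpha * v for h, v in zip(hidden, persona_vec)]

# A toy hidden state pushed along a toy persona direction.
steered = steer([1.0, 2.0], persona_vec=[0.5, -0.5], alpha=2.0)
```

The design trade-off: weight-level merging bakes the persona in once and ships a single model, while activation-level steering leaves the weights untouched and lets the trait be dialed up or down per request.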

Original note title

model merging can integrate domain knowledge into character chatbots without catastrophic forgetting of persona — layer-wise merging preserves style while parameter-wise merging adds knowledge