Can users inject entirely new knowledge into models through prompting alone?

This explores whether a clever enough prompt can actually add knowledge a model never learned — or whether prompting can only rearrange and surface what's already inside.

The corpus answers with an unusually clear no: prompting operates entirely inside the model's existing training distribution, so it can reorganize, surface, or emphasize what's already there, but it cannot supply foundational knowledge the model never absorbed (see "Can prompt optimization teach models knowledge they lack?"). That creates a hard ceiling. No prompt strategy compensates for a missing domain; the best it can do is activate the right corner of what the model already holds.

What makes this more than a definitional point is a second failure the corpus documents: even genuinely new information you place in the prompt can be ignored. When a model's trained associations are strong, parametric knowledge overrides the in-context material, and the model generates outputs that contradict what you just told it. Textual prompting alone often can't override those priors; in the research, changing the behavior required intervening in the model's internal representations, not rewording the prompt (see "Why do language models ignore information in their context?"). So there are really two walls: prompting can't inject knowledge that isn't latent, and sometimes it can't even get the model to honor knowledge that is sitting right in front of it.

The same stubbornness shows up in a different guise when you try to prompt a model into a new persona. Most open models cling to their trained default personality and resist conditioning, with only a few flexible enough to adopt a prompted character (see "Can open language models adopt different personalities through prompting?"). The pattern rhymes: prompts steer most easily toward what the model already leans toward, and meet resistance when asked to override the grain of training.

Here's the twist that reframes the whole question. If prompting can't add knowledge, what is it doing? One line of work suggests the action is on the user's side: prompt engineering is an iterative process of minimizing the gap between the model's output and what the user already expects, so the final result is a co-production of the model's distribution and the user's own anticipated answer (see "How much does the user shape what a model generates?"). You're not teaching the model; you're steering it toward a target you carry. The capability research points the same direction: base models already contain latent abilities (reasoning, for instance) that minimal training or even direct feature-steering can unlock, meaning post-training selects rather than creates (see "Do base models already contain hidden reasoning ability?" and "Can we trigger reasoning without explicit chain-of-thought prompts?"). The recurring lesson across the corpus is that the bottleneck is elicitation, not acquisition.
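
The divergence-minimization framing can be made concrete with a toy example. This is a minimal sketch under stated assumptions, not the cited work's method: the answer distributions, the prompt names, and the choice of KL divergence as the "gap" are all hypothetical stand-ins. It shows only that if a user keeps whichever prompt draft best matches their own expectations, the surviving output mirrors the user's prior.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same answers.

    Assumes q is strictly positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical user prior over three possible answers (A, B, C).
user_prior = [0.7, 0.2, 0.1]

# Hypothetical model output distributions under three prompt drafts.
# In practice these would be estimated by sampling the model; here
# they are made-up numbers for illustration only.
prompt_outputs = {
    "draft_1": [0.20, 0.50, 0.30],
    "draft_2": [0.50, 0.30, 0.20],
    "draft_3": [0.65, 0.25, 0.10],
}

def pick_prompt(prior, outputs):
    """Return the draft whose output distribution is closest to the prior."""
    return min(outputs, key=lambda name: kl_divergence(prior, outputs[name]))

best = pick_prompt(user_prior, prompt_outputs)
print(best, round(kl_divergence(user_prior, prompt_outputs[best]), 4))
```

Nothing in this loop adds information to the model. The refinement step only selects among distributions the model could already produce, which is the sense in which the result is a co-production of model and user.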

So the takeaway is this: when a great prompt seems to have "taught" the model something, what actually happened is that the user found the key to a room already built during training, and the limits of prompting are really the limits of what's behind that door.

Sources (6 notes)

Can prompt optimization teach models knowledge they lack?

Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Can open language models adopt different personalities through prompting?

Research shows most open models fail to adopt prompted personalities, stubbornly retaining their trained ENFJ-like defaults. Only a few flexible models succeed. Combining role and personality conditioning improves results but doesn't fully overcome resistance.

How much does the user shape what a model generates?

Foundation Priors research frames prompt engineering as divergence minimization between synthetic output and user priors. The refinement process systematically steers generation toward what users already expect, making outputs co-productions of model and user subjectivity.

Do base models already contain hidden reasoning ability?

Five independent mechanisms—RL steering, critique fine-tuning, decoding changes, SAE feature steering, and RLVR—all elicit reasoning already present in base model activations. Post-training selects rather than creates reasoning; the bottleneck is elicitation, not capability acquisition.

Can we trigger reasoning without explicit chain-of-thought prompts?

SAE-identified reasoning features can be directly steered to match or exceed chain-of-thought performance across six model families. This reasoning mode activates early in generation and overrides surface-level instructions, suggesting latent reasoning is a fundamental capability independent of explicit prompting.
