Can prompt optimization teach models knowledge they lack?
Explores whether sophisticated prompting techniques can inject new domain knowledge into language models, or whether they are limited to activating knowledge the model already acquired during training.
The knowledge injection survey makes this constraint explicit: prompt optimization "focuses on fully leveraging or guiding the LLM to utilize its internal, pre-existing knowledge." It does not retrieve from external sources. It does not update parameters. It works entirely within the model's existing knowledge distribution.
This is a hard ceiling, not a soft limitation. When a domain requires knowledge that the model was never trained on — proprietary documents, post-training regulations, specialized ontologies, organization-specific processes — no prompting strategy can supply it. The model can reorganize, foreground, or combine what it knows, but it cannot know what it was never trained to know.
The practical consequence shows up in two failure modes. First, models prompted to act as domain experts will confidently apply general-purpose reasoning patterns to domain-specific problems where those patterns don't hold. The prompt activates "medical reasoning" as a behavioral style, not as medical knowledge. Second, prompt performance depends on how thoroughly the domain is represented in pre-training — well-documented domains (clinical guidelines, legal statutes, financial regulations) are more promptable than proprietary or emerging domains.
This makes prompt-only domain specialization a form of retrieval from fixed memory. The memory can be searched more or less skillfully, but it can't be expanded. Every sophisticated prompting technique — few-shot examples, chain-of-thought elicitation, role specification — is fundamentally retrieval from training data, dressed as reasoning.
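To make that concrete, here is a minimal sketch in which all three techniques reduce to string construction over the same frozen model. The `llm` function is a hypothetical stand-in for any text-completion endpoint, not a real library API, and the prompt wordings are illustrative.

```python
def llm(prompt: str) -> str:
    """Hypothetical completion call against a frozen model (stand-in, not a real API)."""
    raise NotImplementedError("plug in any completion endpoint here")

def few_shot(question: str, examples: list[tuple[str, str]]) -> str:
    # Demonstrations locate a task the model already knows; they add no facts.
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return llm(f"{shots}\nQ: {question}\nA:")

def chain_of_thought(question: str) -> str:
    # Elicits step-by-step style; each step still draws on training knowledge.
    return llm(f"Q: {question}\nA: Let's think step by step.")

def role_prompt(question: str, role: str) -> str:
    # Activates a behavioral register ("medical reasoning" as style),
    # not new domain knowledge.
    return llm(f"You are {role}.\n{question}")
```

Nothing in any branch touches parameters or consults an external source; the only thing that varies is how the input string is phrased.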
The implication is that the right question before choosing prompt optimization is not "how should we phrase this prompt?" but "is the required domain knowledge in the model's training distribution?" If yes, prompting is sufficient and efficient. If no, the investment must go into a different injection paradigm — dynamic retrieval, fine-tuning, or adapter layers.
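Stated as code, the gating question becomes a short routing decision. This is an illustration only: the second flag (whether the knowledge changes frequently) is an assumed refinement drawn from the flexibility/cost trade-offs in the related note below, and all names here are invented for the sketch.

```python
from enum import Enum

class Paradigm(Enum):
    PROMPT_OPTIMIZATION = "prompt optimization"
    DYNAMIC_RETRIEVAL = "dynamic retrieval"
    PARAMETER_UPDATE = "fine-tuning or adapter layers"

def choose_paradigm(knowledge_in_training_distribution: bool,
                    knowledge_changes_frequently: bool) -> Paradigm:
    if knowledge_in_training_distribution:
        # The ceiling is not binding: prompting can locate what is already there.
        return Paradigm.PROMPT_OPTIMIZATION
    # Knowledge absent from the weights must be supplied externally.
    if knowledge_changes_frequently:
        # Keep volatile knowledge outside the parameters.
        return Paradigm.DYNAMIC_RETRIEVAL
    # Stable, absent knowledge can justify the cost of a parameter update.
    return Paradigm.PARAMETER_UPDATE
```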
The related note Why do specialized models fail outside their domain? describes the mirror image of this ceiling: fully fine-tuned models can know a domain deeply while losing general coverage. Prompt optimization avoids that cliff problem by not modifying parameters, but only by accepting the ceiling problem instead. Every approach involves a trade-off; this one chooses breadth over depth.
Reynolds & McDonell (2021) provide the upstream mechanism: few-shot prompting is "task location in the model's existing space of learned tasks" — not task learning. Alternative 0-shot prompts that communicate task intention through natural language semiotics match or exceed few-shot performance, confirming that the model already has the capability and the prompt's job is to locate it. Meta-prompt programming further extends this: the LLM itself can be prompted to write task-specific prompts, offloading the location search to the model's own understanding of its capabilities.
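A hedged sketch of that meta-prompt loop, reusing the same hypothetical `llm` stub as above: the model first writes a prompt for the task, then the identical frozen model executes it. The prompt wording is illustrative, not taken from the paper.

```python
def llm(prompt: str) -> str:
    """Hypothetical completion call against a frozen model (same stub as above)."""
    raise NotImplementedError("plug in any completion endpoint here")

def meta_prompt(task_description: str, user_input: str) -> str:
    # Step 1: the model writes the task-specific prompt, i.e. the location
    # search over its existing task space is delegated to the model itself.
    generated_prompt = llm(
        "Write an instruction prompt that would make a language model "
        f"perform this task well: {task_description}\nPrompt:"
    )
    # Step 2: the same frozen model runs the prompt it wrote. No new
    # knowledge has entered the system at any point.
    return llm(f"{generated_prompt}\n\nInput: {user_input}\nOutput:")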
Source: Domain Specialization
Related concepts in this collection
- Why do specialized models fail outside their domain?
  Deep domain optimization creates sharp performance cliffs at domain boundaries. Specialized models generate plausible-sounding but ungrounded responses when queries fall outside their training scope, and often fail to signal their own ignorance.
  (The cost of the alternative: fine-tuning that modifies parameters creates a different failure mode.)
- How do knowledge injection methods trade off flexibility and cost?
  When and how should domain knowledge enter an AI system? This explores the speed, training cost, and adaptability trade-offs across four injection paradigms, and when each approach suits different deployment constraints.
  (This ceiling is specific to the fourth paradigm; the others avoid it at different costs.)
- Why do language models ignore information in their context?
  Explores why language models sometimes override contextual information with prior training associations, and whether providing more context can solve this problem.
  (Related: even when context provides new information, prior training associations can suppress it.)
Original note title: prompt optimization cannot inject new knowledge; it can only activate knowledge the model already contains.