Does refusing explicit knowledge harm AI system performance?
AI systems trained purely on data, without explicit domain knowledge, may sacrifice interpretability, robustness, and fairness. This note explores whether structured knowledge injection could mitigate those tradeoffs.
The AI field's romance with tacit knowledge (learning everything from data while refusing to incorporate explicit causal models, domain rules, or codified expertise) is creating avoidable problems. "Polanyi's Revenge" names the irony: Polanyi's paradox holds that we know more than we can tell. AI's tacit-learning agenda inverts this: we can tell (we have explicit domain knowledge), but AI refuses to hear it.
The pattern is pervasive. Researchers build Rubik's Cube solvers from billions of examples rather than accepting the eight simple rules governing the puzzle. Industry practitioners convert doctrine and standard operating procedures into "data" only to have the knowledge "learned back" from that data at enormous cost. Policy infrastructure for AI relies exclusively on massive datasets even when hard-won explicit knowledge exists.
The costs are direct:
Interpretability: When systems learn their own representations from raw data, there is no reason to believe their reasoning will be interpretable to humans. Explicit knowledge, by contrast, provides the structural vocabulary for explaining decisions — "the system applied rule X to fact Y." Pure tacit learning produces weights that serve no interpretive function.
Bias: Systems that learn from data inherit whatever is statistically dominant in that data, with no explicit signal to override it. Explicit knowledge can include normative corrections that data alone cannot supply ("do not discriminate by X, regardless of correlation").
Robustness: Tacit learners generalize only in the direction of their training distribution. Explicit rules can enforce invariances that the data does not adequately represent: a system that "knows" a causal rule can maintain it even when data is sparse or adversarially constructed. A minimal sketch of this hybrid pattern appears below.
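To make these three costs concrete, here is a minimal sketch, assuming a hybrid design in which an explicit rule layer wraps an otherwise opaque learned model. Everything in it (the `Rule` type, `opaque_model`, the lending-style features) is a hypothetical illustration, not anything from the source notes. The named rules supply the "applied rule X to fact Y" vocabulary for interpretability, the first rule encodes a normative override the data cannot supply, and the second enforces an invariance that holds however sparse the data is.

```python
# A minimal sketch, not from the source: an explicit rule layer wrapped around
# an opaque learned model. All names (Rule, opaque_model, the lending features)
# are hypothetical illustrations.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str                        # named rules give decisions an explainable vocabulary
    applies: Callable[[dict], bool]  # explicit condition over the input facts
    decide: Callable[[dict], str]    # explicit conclusion when the rule fires

def opaque_model(facts: dict) -> str:
    """Stand-in for a tacit learner: a learned score with no interpretive structure."""
    score = 0.7 * facts.get("income", 0.0) - 0.2 * facts.get("zip_risk", 0.0)
    return "approve" if score > 0.5 else "deny"

RULES = [
    # Normative correction that data alone cannot supply: drop a feature that
    # proxies a protected attribute, regardless of its correlation with the label.
    Rule(
        name="no-zip-code-discrimination",
        applies=lambda f: "zip_risk" in f,
        decide=lambda f: opaque_model({k: v for k, v in f.items() if k != "zip_risk"}),
    ),
    # Explicit invariance maintained even where training data is sparse or adversarial.
    Rule(
        name="insufficient-income-denial",
        applies=lambda f: f.get("income", 0.0) < 0.1,
        decide=lambda f: "deny",
    ),
]

def decide(facts: dict) -> tuple[str, str]:
    """Return (decision, explanation). Explicit rules fire first; the tacit model is the fallback."""
    for rule in RULES:
        if rule.applies(facts):
            # Interpretability: the explanation is "applied rule X to fact Y".
            return rule.decide(facts), f"applied rule '{rule.name}' to facts {facts}"
    return opaque_model(facts), "opaque model decision (no rule fired, no explanation available)"

print(decide({"income": 0.9, "zip_risk": 0.8}))  # rule strips the proxy feature, then approves
print(decide({"income": 0.05}))                  # invariance holds regardless of learned weights
```

The ordering is the point of the design: the explicit rules run before the model, so codified knowledge constrains the tacit learner rather than merely annotating its outputs after the fact.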
The civilizational argument is pointed: human progress has been built on codification that is approximate and aspirational, but explicit. The current AI agenda is running in the opposite direction, learning to distrust codification in favor of raw statistical patterns. This is a historically unusual choice, and it has consequences.
This connects directly to domain specialization. "Can prompt optimization teach models knowledge they lack?" confirms this is not a prompting problem: the explicit knowledge must enter at training time. "Can organizing knowledge structures beat raw training data volume?" shows that structured explicit knowledge injection at just 0.3% of corpus size substantially closes the gap.
Source: Philosophy Subjectivity
Related concepts in this collection
- Can prompt optimization teach models knowledge they lack?
  Explores whether sophisticated prompting techniques can inject new domain knowledge into language models, or whether they are limited to activating knowledge already present from training.
  Relevance: confirms the problem. If explicit domain knowledge wasn't in the training data, no prompt can supply it; tacit learning created the deficit.
- Can organizing knowledge structures beat raw training data volume?
  Does structuring domain knowledge into taxonomies during training enable models to learn more efficiently than simply increasing the amount of training data? This challenges assumptions about scaling knowledge injection.
  Relevance: structured explicit knowledge injection is the fix; knowledge organization outperforms raw data volume.
- Does model access level determine which specialization techniques work?
  Different specialization approaches require different levels of access to a model's internals. Understanding this constraint helps practitioners choose realistic techniques for their domain-adaptation goals.
  Relevance: explicit knowledge injection is constrained by access tier; the Polanyi problem is most acute in black-box contexts.
Original note title: AI's rejection of explicit domain knowledge in favor of tacit learning creates interpretability, bias, and robustness problems.