How does training data distribution create asymmetric competence across relation types?
This explores how what is abundant, and what is missing, in training data leaves a model lopsided: fluent at some kinds of tasks or knowledge and unreliable at others. "Relation types" is read broadly here as the categories of competence (procedural vs. factual, structured vs. open-ended, in-distribution vs. out-of-distribution) that training distributes unevenly.
The corpus keeps finding the same shape from different angles: a model is strong at the kinds of tasks its data rewards and weak at the kinds it underrepresents. The sharpest version is the split between knowing-how and knowing-that. One analysis of five million pretraining documents shows that reasoning draws on broad, transferable procedural knowledge spread across many sources, while factual recall depends on narrow, document-specific memorization. A model can therefore be genuinely good at *how to do* a class of problems while being brittle on *the specific facts* a problem needs (see "Does procedural knowledge drive reasoning more than factual retrieval?"). The asymmetry isn't random; it tracks how each kind of competence is represented in the data.
The same lopsidedness shows up the moment you push past the training distribution. Chain-of-thought reasoning degrades predictably under shifts in task, length, or format: models reproduce the *form* of reasoning they saw without the underlying logic, so competence drops off exactly where the data thins out (see "Does chain-of-thought reasoning actually generalize beyond training data?"). What looks like general skill is often distribution-bounded fluency. Different domains don't even shift the model in the same direction. Structured tasks (math, code) drive output entropy *down* while creative tasks drive it *up*, so training them together lets the structured domains' entropy collapse quietly damage open-ended capability. Competence in one relation type actively erodes another unless you sequence them deliberately (see "Does training order reshape how models handle different task types?").
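To make "output entropy goes down vs. up" concrete, here is a minimal sketch that computes Shannon entropy for two next-token distributions. The distributions and the four-token vocabulary are made-up illustrations, not measurements from any of the cited work; the point is only that a peaked distribution (one answer dominates, as in math or code) has low entropy and a flat one (many plausible continuations, as in creative writing) has high entropy.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical next-token distributions over a toy 4-token vocabulary.
structured = [0.97, 0.01, 0.01, 0.01]  # one continuation dominates
creative = [0.25, 0.25, 0.25, 0.25]    # all continuations equally plausible

h_structured = shannon_entropy(structured)
h_creative = shannon_entropy(creative)

print(f"structured: {h_structured:.3f} bits")
print(f"creative:   {h_creative:.3f} bits")
```

If joint training pushes every output distribution toward the peaked shape, the flat distributions that open-ended tasks rely on are the ones that lose out, which is one way to read the entropy-collapse concern above.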
The interesting twist is that the raw capability is often already there: the asymmetry is in what gets *elicited*, not what exists. Base models carry latent reasoning that minimal training merely selects and surfaces rather than creates (see "Do base models already contain hidden reasoning ability?"). So when a model is incompetent at some relation type, it can mean the data never taught the eliciting move, not that the ability is absent. Post-training choices then sculpt which competences come forward, and they can make the asymmetry worse. Training on near-impossible problems teaches degenerate shortcuts that contaminate previously sound capabilities (see "Do overly hard RLVR samples actually harm model capabilities?"). Richer teacher context produces confident, concise traces that students inherit while losing the epistemic caution needed out-of-distribution, buying in-domain polish at the cost of generalization (see "Does richer teacher context hurt student generalization?").
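The source note on hard samples attributes the shortcut problem to group-relative normalization treating rare accidental successes as high-advantage trajectories. The sketch below illustrates that effect under an assumption: it uses the common mean-and-standard-deviation normalization within a group of sampled answers (as in GRPO-style methods), which the note does not spell out. The reward lists and the function name `group_relative_advantages` are hypothetical.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps).

    Assumed form of group-relative normalization; eps avoids division
    by zero when every sample in the group gets the same reward.
    """
    mean = sum(rewards) / len(rewards)
    variance = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(variance)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical binary rewards for 8 sampled answers to one prompt.
near_impossible = [0, 0, 0, 0, 0, 0, 0, 1]  # a single lucky success
moderate = [0, 1, 0, 1, 1, 0, 1, 0]         # half the samples succeed

adv_hard = group_relative_advantages(near_impossible)
adv_moderate = group_relative_advantages(moderate)

lucky = max(adv_hard)        # advantage of the lone success
typical = max(adv_moderate)  # advantage of a success on the easier prompt

print(f"lone success on a near-impossible prompt: {lucky:.2f}")
print(f"success on a moderate prompt:             {typical:.2f}")
```

Under this normalization the lone success receives an advantage of roughly 2.6, against about 1.0 for a success on the moderate prompt. If that lone success came from a guess or a skipped computation, the update reinforces the shortcut more strongly than it would reinforce a sound solution to a problem the model can actually solve.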
The thread worth pulling: competence asymmetry isn't only about facts the model never saw. It's also about *which interactional moves* the training reward distribution suppresses. Preference optimization tuned for single-turn helpfulness rewards confident answers over clarifying questions, cutting grounding behaviors 77.5% below human levels. Models stay competent at sounding helpful while quietly losing the relation type (multi-turn, mutual understanding) that the reward never priced in (see "Does preference optimization harm conversational understanding?"). Across all these, the lesson is one most users never expect: a model's strengths and blind spots are a fairly direct readout of what its training distribution over-weighted and what it left out, and you can often predict where it will fail before you ever test it.
Sources (7 notes)
Analysis of 5 million pretraining documents shows reasoning relies on broad, transferable procedural knowledge from diverse sources, unlike factual recall which depends on narrow, document-specific memorization of target facts.
DataAlchemy experiments show CoT fails systematically under distributional shifts in task, length, and format. Models produce fluent but logically inconsistent reasoning — imitating reasoning form without valid underlying logic.
Omni-Thinker shows structured domains decrease output entropy while creative domains increase it. Scheduling guided by backward transfer (BWT), which trains structured tasks first, yields 6.2% gains over joint training by preventing entropy collapse from damaging open-ended capabilities.
Five independent mechanisms—RL steering, critique fine-tuning, decoding changes, SAE feature steering, and RLVR—all elicit reasoning already present in base model activations. Post-training selects rather than creates reasoning; the bottleneck is elicitation, not capability acquisition.
Training on nearly-impossible problems causes models to learn degenerate shortcuts rather than genuine reasoning, and these shortcuts contaminate pre-existing capabilities. Group-relative normalization treats rare accidental successes as high-advantage trajectories, reinforcing answer repetition and computation-skipping instead of sound reasoning patterns.
Teachers conditioned on correct answers and verifier output produce confident, concise traces that students inherit. This style suppresses uncertainty expression, optimizing in-domain performance while degrading generalization to out-of-distribution problems that require epistemic caution.
RLHF optimizes models for single-turn helpfulness by rewarding confident responses over clarifying questions and understanding checks. This preference alignment systematically reduces grounding acts by 77.5% below human levels, creating an alignment tax where models appear helpful but fail silently in multi-turn contexts.