Reinforcement Learning for LLMs · LLM Reasoning and Architecture

Can we decouple what pretraining and fine-tuning each improve?

Does scaling at different training stages produce distinct capability improvements? This matters because it could reveal whether knowledge and behavioral alignment are truly separate properties we can optimize independently.

Note · 2026-02-22 · sourced from Training Fine Tuning
Related questions: What kind of thing is an LLM really? · How do you build domain expertise into general AI models? · How should researchers navigate LLM reasoning research?

Emulated Fine-Tuning (EFT) provides a principled method for sampling from a distribution that approximates the result of combining pretraining at one scale with fine-tuning at another. The decoupling it reveals: scaling up pretraining tends to improve factuality, while scaling up fine-tuning tends to improve helpfulness.
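In log-probability space, the combination is a per-token ensemble: take the large base model's next-token distribution and add the behavioral delta between a small fine-tuned model and its small base counterpart. A minimal sketch of the core computation, assuming the three models share a vocabulary and you already have their next-token logits (the function name is a hypothetical helper, not from the paper):

```python
import torch

def eft_next_token_logprobs(base_large_logits: torch.Tensor,
                            ft_small_logits: torch.Tensor,
                            base_small_logits: torch.Tensor) -> torch.Tensor:
    """Emulated fine-tuning at one decoding step:
    log pi_eft  is proportional to
    log pi_base_large + (log pi_ft_small - log pi_base_small),
    i.e., large-scale knowledge plus a small-scale behavioral delta."""
    combined = base_large_logits + (ft_small_logits - base_small_logits)
    # Renormalize so the result is a valid next-token distribution.
    # (Per-position normalization constants in the raw logits cancel here.)
    return torch.log_softmax(combined, dim=-1)
```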

The mechanism: pretraining builds knowledge (factual storage distributed across the parameter space), while fine-tuning shapes behavior (how that knowledge is surfaced in response to queries). The two stages operate on different aspects of the model. As explored in "Why does reasoning training help math but hurt medical tasks?", the decoupling has an architectural basis: pretraining enriches lower-layer knowledge, while fine-tuning modifies upper-layer behavior.

A special case, LM up-scaling, avoids resource-intensive fine-tuning of large pretrained models by ensembling them with small fine-tuned models, essentially emulating the result of fine-tuning the large model. This consistently improves helpfulness and factuality across the Llama, Llama-2, and Falcon families without additional training. The practical implication: you can approximate the benefits of fine-tuning a 70B model by fine-tuning a 7B model and combining the two models' output distributions at decoding time, as sketched below.
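Up-scaling is that ensemble applied during decoding: each step queries the large base model and the small base/fine-tuned pair, then samples from the combined distribution. A hedged sketch of one greedy step, reusing eft_next_token_logprobs from above; the model handles are placeholders for any HuggingFace-style causal LMs that share a tokenizer, as the Llama-family pairs do:

```python
import torch

@torch.no_grad()
def upscale_decode_step(input_ids, base_large, ft_small, base_small):
    """One greedy decoding step of LM up-scaling: emulate fine-tuning the
    large model without ever training it."""
    # Next-token logits at the last position from each model.
    l_base_large = base_large(input_ids).logits[:, -1, :]
    l_ft_small = ft_small(input_ids).logits[:, -1, :]
    l_base_small = base_small(input_ids).logits[:, -1, :]
    logprobs = eft_next_token_logprobs(l_base_large, l_ft_small, l_base_small)
    return logprobs.argmax(dim=-1, keepdim=True)  # greedy next token
```

The cost is three forward passes per token, but only the small models ever receive gradient updates.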

EFT also enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training. This bears on "Does preference optimization damage conversational grounding in large language models?": if helpfulness and harmlessness are adjustable at test time, the fixed trade-off imposed by RLHF may be unnecessary.
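Concretely, because the behavioral delta is just an additive term in logit space, deltas from two differently tuned small models can be blended with a decoding-time coefficient. A sketch under the assumption that one small model is tuned for helpfulness and another for harmlessness (both hypothetical):

```python
import torch

def blended_eft_logprobs(base_large_logits, base_small_logits,
                         helpful_small_logits, harmless_small_logits,
                         lam: float = 0.5):
    """Blend two behavioral deltas at test time: `lam` weights the
    helpfulness delta, (1 - lam) the harmlessness delta. No retraining."""
    delta_helpful = helpful_small_logits - base_small_logits
    delta_harmless = harmless_small_logits - base_small_logits
    combined = (base_large_logits
                + lam * delta_helpful
                + (1 - lam) * delta_harmless)
    return torch.log_softmax(combined, dim=-1)
```

Sweeping lam from 0 to 1 traces out the helpfulness/harmlessness frontier at inference time, where RLHF would fix a single operating point at training time.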

The decomposition challenges the assumption that a model's capabilities are monolithic. Factual knowledge and behavioral alignment are not only distinct; they scale differently and can be manipulated independently. This has implications for deployment: rather than training one large, fully fine-tuned model, a pipeline of specialized components (a large pretrained model for knowledge plus a small tuned model for behavior) may be more efficient and more controllable.


Source: Training Fine Tuning

Related concepts in this collection

Concept map
14 direct connections · 179 in 2-hop network · dense cluster


Original note title: "scaling fine-tuning improves helpfulness while scaling pretraining improves factuality — these are decoupled training-stage effects"