How do domain training techniques actually reshape model behavior?

Methods for specializing language models in specific domains, the mechanisms that make them effective, and their often-overlooked performance trade-offs.

Topic Hub · 38 linked notes · 10 sections

SFT-RL Training Dynamics

2 notes

Why does SFT-then-RL training follow a predictable three-phase pattern?

When expert data diverges from a model's learned patterns, SFT-then-RL training moves through three phases: disruption, readaptation, and overfitting. Understanding this progression could improve how we combine imitation and reinforcement learning.

Does RL training collapse format diversity in pretrained models?

Exploring whether RL fine-tuning systematically selects one output format from pretraining while suppressing others, and how this selection mechanism drives performance gains.
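
A minimal diagnostic sketch for this question, in Python: sample completions for the same prompts before and after RL, bucket them by surface format, and compare the entropy of the bucket distribution. The format buckets below are crude illustrative stand-ins for a real classifier.

```python
import math
from collections import Counter

def format_of(text: str) -> str:
    # crude illustrative buckets; real studies use finer classifiers
    if text.lstrip().startswith(("1.", "- ")):
        return "list"
    if "\\boxed" in text:
        return "boxed"
    return "prose"

def format_entropy(samples: list[str]) -> float:
    counts = Counter(format_of(s) for s in samples)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

before = ["1. step one", "The answer is prose.", "\\boxed{42}"]
after = ["\\boxed{1}", "\\boxed{2}", "\\boxed{3}"]   # collapsed to one format
assert format_entropy(after) < format_entropy(before)
```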

Training Quality and Compatibility

5 notes

Does critiquing errors teach deeper understanding than imitating correct answers?

Can training models to critique flawed responses build better structural understanding than standard supervised fine-tuning on correct answers? This matters because it reveals whether deep reasoning requires engaging with failure modes rather than pattern matching.

Does teacher-refined data always improve student model performance?

Explores whether higher-quality training data from teacher models uniformly benefits student models, or if compatibility with the student's current learning state matters for effective instruction.

Why does reasoning training help math but hurt medical tasks?

Explores whether reasoning and knowledge rely on different network mechanisms, and why training one might undermine the other across different domains.

Why do LLMs struggle to connect unrelated entities speculatively?

LLMs reliably organize and summarize evidence but fail when asked to speculate about connections between dissimilar entities. Understanding this failure could reveal fundamental limits in how models handle complex analytical reasoning.

Does fine-tuning weaken how reasoning steps influence answers?

When models are fine-tuned on domain-specific tasks, do their chain-of-thought reasoning steps actually causally drive the final answer, or do they become decorative? This matters because accurate outputs can mask unfaithful reasoning.
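
A common probe for this, sketched below around a hypothetical `generate` completion function: ablate each reasoning step, regenerate the answer, and count how often it flips. A score near zero suggests the chain of thought has become decorative.

```python
def generate(prompt: str) -> str:
    raise NotImplementedError  # plug in any LLM completion call here

def answer_after(prompt: str, steps: list[str]) -> str:
    cot = "\n".join(steps)
    return generate(f"{prompt}\n{cot}\nTherefore, the answer is")

def faithfulness_score(prompt: str, steps: list[str]) -> float:
    baseline = answer_after(prompt, steps)
    changed = 0
    for i in range(len(steps)):
        corrupted = steps.copy()
        corrupted[i] = "(this step has been removed)"
        if answer_after(prompt, corrupted) != baseline:
            changed += 1  # the final answer depended on step i
    # fraction of steps whose removal flips the answer
    return changed / max(len(steps), 1)
```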

Alignment Data Efficiency

2 notes

Can 1000 carefully chosen examples align models effectively?

Does alignment require massive datasets, or can strategic curation of small, high-quality examples achieve comparable performance? LIMA tests whether quality beats quantity in post-training.

Can aligned LLMs generate their own training data?

Does feeding an aligned model only its prompt template cause it to self-synthesize high-quality instructions? This explores whether alignment training encodes a latent instruction-generation capability.
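
A minimal sketch of that setup, assuming a hypothetical chat template and a generic `sample` completion function: prompt the model with nothing but its user-turn prefix, let it complete a plausible instruction, then answer that instruction in a second pass.

```python
USER_PREFIX = "<|user|>\n"         # hypothetical template tokens,
ASSISTANT_PREFIX = "<|assistant|>\n"  # not any specific model's format

def sample(prompt: str, stop: str) -> str:
    raise NotImplementedError      # plug in your model here

def self_synthesize(n: int) -> list[dict]:
    pairs = []
    for _ in range(n):
        # the model fills in a plausible user instruction...
        instruction = sample(USER_PREFIX, stop=ASSISTANT_PREFIX)
        # ...then answers it in a second pass
        response = sample(USER_PREFIX + instruction + ASSISTANT_PREFIX,
                          stop="<|end|>")
        pairs.append({"instruction": instruction, "response": response})
    return pairs
```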

Self-Generated Domain Training Data

2 notes

Does self-generated training data improve model learning?

Can models learn more effectively from training data they generate themselves rather than data created by external sources? This explores whether a learner's own restructuring process produces better learning outcomes.

Can synthetic dialogues become realistic through layered diversity?

Explores whether combining persona variation, subtopic specificity, and contextual grounding can generate synthetic dialogues that match real conversational data quality and capture the full spectrum of dialogue diversity.
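
As a rough sketch of the layering idea, with placeholder axis values: crossing even a few personas, subtopics, and grounding contexts multiplies the distinct generation prompts.

```python
import itertools

personas = ["retired teacher", "impatient gamer", "non-native speaker"]
subtopics = ["refund policy", "shipping delay", "warranty claim"]
contexts = ["mobile chat", "phone transcript", "email thread"]

def dialogue_prompts():
    for persona, subtopic, context in itertools.product(
            personas, subtopics, contexts):
        yield (f"Write a realistic {context} dialogue about a {subtopic}, "
               f"where the customer is a {persona}.")

# 3 x 3 x 3 axis values yield 27 distinct generation prompts
assert len(list(dialogue_prompts())) == 27
```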

Knowledge Graph-Based Domain Specialization

1 note

Parameter-Efficient and Training Techniques

8 notes

Can decoding-time tuning preserve knowledge better than weight fine-tuning?

Explores whether applying alignment signals at inference time rather than modifying model weights can better preserve the factual knowledge learned during pretraining while still achieving alignment goals.
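
One concrete flavor of this idea, sketched below in the spirit of proxy tuning (not necessarily the method the note examines): keep the large base model frozen and steer its next-token logits with the difference between a small aligned model and its unaligned counterpart.

```python
import numpy as np

def steered_logits(base: np.ndarray,
                   small_aligned: np.ndarray,
                   small_base: np.ndarray,
                   strength: float = 1.0) -> np.ndarray:
    # the small pair encodes "what alignment changed"; adding that
    # delta nudges the big model without touching its weights
    return base + strength * (small_aligned - small_base)

vocab = 8
rng = np.random.default_rng(0)
logits = steered_logits(rng.normal(size=vocab),
                        rng.normal(size=vocab),
                        rng.normal(size=vocab))
probs = np.exp(logits - logits.max())
probs /= probs.sum()              # next-token distribution to sample from
```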

Can models learn multi-token concepts during fine-tuning?

Does training models to predict multiple tokens at once, rather than one token sequentially, help them form coherent semantic units? This matters because current next-token prediction fragments concepts like "ribonucleic acid" into arbitrary subword pieces.
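
A toy PyTorch sketch of one multi-token objective, assuming k parallel prediction heads over a shared trunk; the dimensions and head design are illustrative, not any specific paper's architecture.

```python
import torch
import torch.nn.functional as F

def multi_token_loss(hidden, heads, targets):
    # hidden:  (batch, seq, dim) shared trunk activations
    # heads:   one Linear per future offset (predicts t+1, t+2, ...)
    # targets: (batch, seq) token ids
    loss = 0.0
    for offset, head in enumerate(heads, start=1):
        logits = head(hidden[:, :-offset])   # predictions for t+offset
        labels = targets[:, offset:]         # targets shifted by offset
        loss = loss + F.cross_entropy(
            logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
    return loss / len(heads)

dim, vocab, k = 16, 100, 4
heads = [torch.nn.Linear(dim, vocab) for _ in range(k)]
hidden = torch.randn(2, 12, dim)
targets = torch.randint(0, vocab, (2, 12))
loss = multi_token_loss(hidden, heads, targets)
```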

Can isolating task-specific parameters prevent multi-task fine-tuning interference?

Explores whether identifying and protecting task-specific parameter regions can prevent the performance degradation that occurs when fine-tuning models on multiple tasks simultaneously. This matters because it could enable safe multi-task adaptation without sacrificing individual task performance.
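
A minimal sketch of one isolation scheme, using gradient magnitude as a cheap stand-in for parameter importance: estimate which weights matter for task A, then zero their gradients while fine-tuning on task B.

```python
import torch
import torch.nn.functional as F

def build_masks(model, loss_a, protect_fraction=0.2):
    model.zero_grad()
    loss_a.backward()
    masks = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        score = p.grad.detach().abs()
        cutoff = torch.quantile(score.flatten(), 1 - protect_fraction)
        masks[name] = (score < cutoff).float()  # 1 = free to update on task B
    model.zero_grad()
    return masks

def masked_step(model, optimizer, loss_b, masks):
    optimizer.zero_grad()
    loss_b.backward()
    for name, p in model.named_parameters():
        if name in masks and p.grad is not None:
            p.grad.mul_(masks[name])            # freeze task-A-critical weights
    optimizer.step()

model = torch.nn.Linear(8, 2)
x, ya, yb = torch.randn(4, 8), torch.randn(4, 2), torch.randn(4, 2)
masks = build_masks(model, F.mse_loss(model(x), ya))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
masked_step(model, opt, F.mse_loss(model(x), yb), masks)
```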

Can we train better models on less data?

Can gradient-based influence estimation identify which instruction data actually matters most? The research explores whether selecting small subsets of training data by their similarity to target capabilities might outperform training on everything.
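
A minimal sketch of the selection step, with random vectors standing in for per-example gradients (real pipelines typically use low-rank projections): score each training example by how well its gradient aligns with the mean gradient of a small target set, then keep the top of the ranking.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def select_subset(train_grads, target_grads, budget):
    target_direction = np.mean(target_grads, axis=0)
    scores = [cosine(g, target_direction) for g in train_grads]
    return np.argsort(scores)[::-1][:budget]   # most aligned examples first

rng = np.random.default_rng(0)
train_grads = rng.normal(size=(1000, 64))   # stand-in per-example gradients
target_grads = rng.normal(size=(32, 64))    # gradients on target capability
chosen = select_subset(train_grads, target_grads, budget=50)
```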

Can semantic knowledge shift model behavior like reinforcement learning does?

Can textual descriptions of successful reasoning patterns, prepended as context, achieve the same distribution shifts that RL achieves through parameter updates? This matters because it could eliminate the need for expensive fine-tuning on limited data.

Can context playbooks prevent knowledge loss during iteration?

When AI systems iteratively refine their instructions and memories, do structured incremental updates better preserve domain knowledge than traditional rewriting? This matters because context degradation undermines long-term agent performance.

Can models dynamically activate expert skills at inference time?

Can language models efficiently discover and compose task-specific capabilities on the fly without modifying base weights? This explores whether test-time adaptation through expert vector composition outperforms fixed fine-tuning approaches.
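
A rough sketch of test-time composition, with illustrative names: given one flattened weight delta per expert skill, choose mixing weights per query from embedding similarity and apply the blended delta to the frozen base.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def compose_delta(query_emb, expert_embs, expert_vectors, temperature=0.1):
    sims = expert_embs @ query_emb             # one score per expert
    weights = softmax(sims / temperature)
    return weights @ expert_vectors            # blended weight delta

rng = np.random.default_rng(0)
experts = rng.normal(size=(5, 128))            # 5 expert skill embeddings
deltas = rng.normal(size=(5, 4096))            # flattened weight deltas
query = rng.normal(size=128)
delta = compose_delta(query, experts, deltas)  # add to base weights at inference
```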

Does reasoning rely on procedural knowledge or factual memorization?

Explores whether LLMs learn reasoning through general procedural patterns across documents or through memorizing specific facts. Understanding this distinction matters for training data strategy.

Verifier-Free and Multi-Task RL

2 notes

Can reasoning RL work without verifying generated answers?

Most reasoning RL methods require answer verification, limiting them to math and code. Can models be trained to reason better in domains like medicine and law where verification is impractical?

Does training order reshape how models handle different task types?

Explores whether the sequence of multi-task RL training systematically affects model capabilities across structured and creative domains, and whether this ordering effect can be predicted and optimized.

RLVR Extensions to General Domains

6 notes

Can breaking down instructions into checklists enable better reinforcement learning?

Explores whether decomposing instruction quality into verifiable yes/no criteria allows RL systems to improve on tasks that lack clear correctness signals, like creative writing or social reasoning.
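
The core mechanism is simple to sketch: decompose the instruction into yes/no criteria and score a response by the fraction satisfied, with a hypothetical `judge` standing in for an LLM-based verifier.

```python
def judge(response: str, criterion: str) -> bool:
    raise NotImplementedError  # plug in an LLM judge here

def checklist_reward(response: str, checklist: list[str]) -> float:
    passed = sum(judge(response, c) for c in checklist)
    return passed / len(checklist)  # dense reward in [0, 1]

checklist = [
    "Does the story stay under 200 words?",
    "Is it written in second person?",
    "Does it avoid naming any real places?",
]
# reward = checklist_reward(model_output, checklist)
```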

What makes rubric-based reward learning resistant to exploitation?

Rubric-based RL systems face reward hacking vulnerabilities. This explores what design patterns, architectural mechanisms, and iterative defenses enable rubrics to remain robust against model exploitation across diverse tasks.

Can model confidence alone replace external answer verification?

Can LLMs use their own certainty signals instead of external verifiers to improve reasoning? This matters for scaling beyond domains where correct answers can be automatically checked.
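
A minimal sketch of one such certainty signal: reward an answer by its average token log-probability. Whether this proxy tracks correctness, rather than mere fluency, is exactly the open question.

```python
import math

def confidence_reward(token_logprobs: list[float]) -> float:
    return sum(token_logprobs) / len(token_logprobs)

# two sampled answers; the verifier-free reward favors the more confident
answer_a = [-0.1, -0.3, -0.2]    # avg logprob = -0.2
answer_b = [-1.2, -0.9, -1.5]    # avg logprob = -1.2
assert confidence_reward(answer_a) > confidence_reward(answer_b)

# the same signal viewed as perplexity
perplexity = math.exp(-confidence_reward(answer_a))
```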

Can reasoning emerge from expert demonstrations alone?

Can AI systems learn to reason about non-verifiable tasks by studying expert examples rather than explicit reward signals? This matters because many high-value domains like medicine and law have abundant demonstrations but no automated verifiers.

Can adaptive guidance from solution traces reduce reward sparsity in RL?

When reinforcement learning struggles with hard problems due to sparse rewards and zero-advantage rollouts, does providing partial solution traces as adaptive guidance help the model learn more efficiently? This matters because standard RL wastes compute on unsolvable problems.
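
A minimal sketch of the guidance loop, with a hypothetical `rollout` doing the sampling and verification: when all rollouts fail, reveal one more step of a reference solution and retry, so the policy gradient stops receiving all-zero advantages.

```python
def rollout(problem: str, hint: str) -> bool:
    raise NotImplementedError  # sample an answer, verify, return pass/fail

def guided_attempts(problem, solution_steps, n_rollouts=8):
    for hint_len in range(len(solution_steps) + 1):
        hint = "\n".join(solution_steps[:hint_len])
        results = [rollout(problem, hint) for _ in range(n_rollouts)]
        if any(results):   # mixed outcomes give a usable advantage signal
            return hint_len, results
        # all rollouts failed: reveal one more step and retry
    return len(solution_steps), results
```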

Why does RLVR training narrow a model's problem-solving ability?

RLVR's on-policy constraint may force models to exploit known reasoning paths rather than explore new ones, potentially shrinking their effective problem-solving scope. Understanding this mechanism could reveal how to design better exploration incentives in language model reasoning.

Pass 3 Additions (2026-05-03)

4 notes

Can reconstructing expert thinking improve reasoning transfer?

Expert texts show only the final result of complex thinking. Can we reverse-engineer those hidden thought processes and use them to train models that reason better across different domains?

Why do language models need so much more text than humans?

Language models train on the surface of written text, but humans learn by inferring the underlying thoughts behind what they read. Does this explain why models need vastly more data to reach human-level understanding?

Can agents learn beyond what their training data shows?

Explores whether supervised fine-tuning on expert demonstrations creates a hard ceiling on agent competence, or whether agents can generalize to scenarios their curators never captured.

How do quality, diversity, and complexity affect synthetic data differently?

When training models on synthetic data, do quality, diversity, and complexity each play distinct roles in how well models generalize? Understanding their separate effects could explain why current optimization strategies fail.
