
What five design choices compose a world model?

World models are often presented as monolithic systems, but they actually involve five distinct design decisions—data preparation, representation, reasoning architecture, training objective, and decision integration—that can each fail independently. Understanding this decomposition helps diagnose why world model proposals fall short.

Note · 2026-05-03 · sourced from World Models

World model proposals often present themselves as monolithic: a video generator, a latent dynamics model, a foundation model. The Critiques of World Models essay argues that this hides a structural fact: a world model (WM) is a composition of five distinct design choices, and any of them can be misaligned with the others. Treating the WM as a single thing makes it impossible to diagnose why it fails, because the failure could lie in any of the five layers. The decomposition also resolves the ambiguity flagged in the note "Do LLMs actually have world models or just facts?"

The five aspects:
(1) Identifying and preparing training data with the desired world information: what observations does the model see, and do they actually contain the structure needed for the intended downstream tasks?
(2) Adopting a general representation space for the latent world state with possibly richer meaning than the observation data in plain sight: does the latent representation expose the right invariances for reasoning, or does it merely reconstruct the input?
(3) Designing an architecture that allows effective reasoning over the representations: does the model support compositional, counterfactual, and hierarchical operations, or only single-step prediction?
(4) Choosing an objective that properly guides the model training: does the loss target the simulation-of-possibilities goal, or does it reward only observation reconstruction?
(5) Determining how to use the world model in a decision-making system: how do the outputs of the WM feed into action selection, planning, or policy?
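One way to make the five aspects concrete is to write them as separate, swappable slots of a single object. The sketch below is a toy under stated assumptions, not anything taken from the essay: the class `WorldModelDesign`, its field names, the 1-D "position plus action" world, and the greedy one-step decision rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical toy types: an observation is a float, an action is an int.
Transition = Tuple[float, int, float]  # (observation, action, next observation)


@dataclass
class WorldModelDesign:
    data: Sequence[Transition]             # (1) training data
    encode: Callable[[float], float]       # (2) observation -> latent state
    step: Callable[[float, int], float]    # (3) reasoning: latent transition
    loss: Callable[[float, float], float]  # (4) training objective
    goal: float                            # (5) target used by the decision rule

    def mean_loss(self) -> float:
        """Average objective over the data, computed in latent space."""
        errors = [
            self.loss(self.step(self.encode(obs), action), self.encode(next_obs))
            for obs, action, next_obs in self.data
        ]
        return sum(errors) / len(errors)

    def choose(self, obs: float, actions: Sequence[int]) -> int:
        """(5) Greedy one-step planning: pick the action whose predicted
        latent state lands closest to the encoded goal."""
        z = self.encode(obs)
        target = self.encode(self.goal)
        return min(actions, key=lambda a: abs(self.step(z, a) - target))


# Toy instance: position moves by the action taken; every slot is trivial.
toy = WorldModelDesign(
    data=[(0.0, 1, 1.0), (1.0, 1, 2.0), (2.0, -1, 1.0)],
    encode=lambda o: o,
    step=lambda z, a: z + a,
    loss=lambda pred, target: (pred - target) ** 2,
    goal=3.0,
)

print(toy.mean_loss())              # 0.0
print(toy.choose(1.0, [-1, 0, 1]))  # 1
```

Because each slot is a separate argument, any one of them can be replaced without touching the others, which is also why any one of them can be the point of failure.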

A WM that gets some of these right and fails on the others exhibits a coherent kind of failure: a video generator with stunning reconstruction quality (strong on 1, 2, and 4) but no architecture for counterfactual queries (3) and no integration with decision-making (5) is not a world model in the functional sense, however impressive its outputs. Conversely, a model with rich representations but poor data coverage cannot simulate what its data did not expose.

The design pattern this exposes: when evaluating a proposal claiming to be a world model, decompose the claim into the five aspects and check each. Most of the disagreement in the WM literature is about which aspects matter and how they should be ordered, not about whether to build a WM at all. The five-aspect frame makes those disagreements explicit rather than letting them remain folded into vague terminology.
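The evaluation pattern can be sketched as a small checklist. This is a minimal illustration under assumptions: the `Aspect` enum, the `audit` helper, and the boolean scoring are hypothetical simplifications, and the example entry encodes the video generator case described above rather than any measured result.

```python
from enum import Enum
from typing import Dict, List


class Aspect(Enum):
    DATA = 1
    REPRESENTATION = 2
    ARCHITECTURE = 3
    OBJECTIVE = 4
    DECISION_INTEGRATION = 5


def audit(claims: Dict[Aspect, bool]) -> List[Aspect]:
    """Return the aspects a proposal fails or leaves unaddressed.

    An aspect missing from `claims` counts as unaddressed, so a proposal
    cannot pass by staying silent about one of the five.
    """
    return [aspect for aspect in Aspect if not claims.get(aspect, False)]


# The video generator example: strong on 1, 2, and 4; missing 3 and 5.
video_generator = {
    Aspect.DATA: True,
    Aspect.REPRESENTATION: True,
    Aspect.ARCHITECTURE: False,
    Aspect.OBJECTIVE: True,
    Aspect.DECISION_INTEGRATION: False,
}

gaps = audit(video_generator)
print([aspect.name for aspect in gaps])
# ['ARCHITECTURE', 'DECISION_INTEGRATION']
```

A real assessment would replace the booleans with evidence per aspect; the point of the sketch is only that the five checks are made separately and reported separately.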


Source: World Models

Original note title

building a world model decomposes into five inseparable design choices: data, representation, architecture, objective, and decision-system integration