What inductive biases help networks segregate entities from raw inputs?
This explores the architectural and training 'inductive biases' (a network's built-in tendencies) that let it pull discrete structure such as modules, concepts, and entities out of raw input instead of treating everything as one undifferentiated blob.
This reads the question as asking which built-in tendencies push a network to carve structure out of raw, unlabeled input, separating the parts rather than smearing them together. A caveat worth stating up front: this collection is centered on language models, not on the object-centric vision work where 'entity segregation from pixels' is usually studied, so it can only answer the question indirectly. But several notes converge on the deeper principle, which is that segregation is something networks *acquire as a bias*, not something you have to hand-wire.
The sharpest piece of evidence is that networks segregate structure on their own. Pruning experiments show that neural networks spontaneously decompose compositional tasks into isolated subnetworks: ablate one and only its corresponding function breaks (see 'Do neural networks naturally learn modular compositional structure?'). That's entity segregation at the level of *function* rather than perception: the network keeps the parts separable. Crucially, the same note finds that pretraining makes this modular structure far more consistent, so the strongest inductive bias here isn't an architecture choice, it's prior exposure.
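To make the ablation logic concrete, here is a minimal hand-wired sketch, not the pruning method from the note and not a trained model. The toy network, the two tasks `f` and `g`, and the helper `broken_tasks` are all hypothetical names chosen for illustration. It only shows what "ablate one subnetwork and only its function breaks" looks like when the subnetworks really are disjoint.

```python
# Hand-wired toy, not a trained model: two tasks live in disjoint hidden units.
# Units 0-1 serve task f(x) = 2x; units 2-3 serve task g(x) = x + 3.

def forward(x, mask):
    """Run the toy network; mask[i] = 0 ablates hidden unit i."""
    h = [x * mask[0], x * mask[1], x * mask[2], 3.0 * mask[3]]
    f_out = h[0] + h[1]  # task f reads only units 0-1
    g_out = h[2] + h[3]  # task g reads only units 2-3
    return f_out, g_out

def broken_tasks(mask, probe=2.0):
    """Name the tasks whose output changes when `mask` is applied."""
    intact = forward(probe, [1, 1, 1, 1])
    ablated = forward(probe, mask)
    names = ("f", "g")
    return [n for n, a, b in zip(names, intact, ablated) if a != b]

print(broken_tasks([0, 0, 1, 1]))  # ablate f's units -> ['f']
print(broken_tasks([1, 1, 0, 0]))  # ablate g's units -> ['g']
```

In a real network the subnetworks would have to be discovered (for example by learning a mask) rather than assumed, and the separation would be approximate rather than exact.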
Architecture still tilts the odds, though. Depth specifically buys you compositional separation: deep-and-thin sub-billion-parameter models beat balanced ones because layers let the network *compose abstract concepts* step by step rather than spreading capacity sideways across width (see 'Does depth matter more than width for tiny language models?'). Segregating entities is exactly this kind of hierarchical build-up, with primitives at the bottom and composed objects higher up, and depth is the bias that supports it.
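The depth-versus-width trade-off is easier to see at a fixed parameter budget. The sketch below uses the common rough estimate of about 12·d² weights per transformer block (4·d² for attention projections plus 8·d² for an MLP with 4x expansion), ignoring embeddings, biases, and normalization. The two configurations are hypothetical shapes picked to land on a similar budget; they are not taken from the note or from any published model.

```python
def approx_block_params(d_model, n_layers, mlp_ratio=4):
    """Rough weight count for a stack of standard transformer blocks.

    Assumption: full multi-head attention and a two-matrix MLP;
    embeddings, biases, and layer norms are ignored.
    """
    attn = 4 * d_model * d_model             # Q, K, V, and output projections
    mlp = 2 * mlp_ratio * d_model * d_model  # up- and down-projection
    return n_layers * (attn + mlp)

# Hypothetical shapes with nearly the same budget.
deep_thin = approx_block_params(d_model=576, n_layers=30)
balanced = approx_block_params(d_model=912, n_layers=12)

print(deep_thin)  # 119439360
print(balanced)   # 119771136
```

Both shapes spend roughly 120M block weights, but the deep-and-thin one gets 30 sequential composition steps where the balanced one gets 12, which is the extra capacity for step-by-step build-up that the paragraph above describes.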
The representational side adds a twist you might not expect: a network's tendency to use *dense* vs. *sparse* codes is itself learned from data familiarity, not fixed by the architecture. Models develop dense activations for familiar inputs and fall back to sparse ones for unfamiliar inputs, with no task-specific fine-tuning (see 'Is representational sparsity learned or intrinsic to neural networks?'). Since sparsity is one of the classic levers for forcing a network to allocate distinct units to distinct entities, this says the very property you'd lean on for segregation is itself a moving, experience-shaped target.
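One simple way to put a number on "dense" versus "sparse" is the fraction of units that stay silent for a given input. The sketch below is an assumption about how one might measure this, since the note does not say which metric it used. The function name `sparsity`, the threshold `eps`, and both activation vectors are made up for illustration.

```python
def sparsity(activations, eps=1e-6):
    """Fraction of units whose activation magnitude is at most eps."""
    if not activations:
        raise ValueError("need at least one activation")
    silent = sum(1 for a in activations if abs(a) <= eps)
    return silent / len(activations)

# Made-up activation vectors, not measurements from any model.
familiar = [0.9, -0.4, 0.7, 0.3]    # every unit participates
unfamiliar = [0.0, 0.0, 1.2, 0.0]   # one unit carries the response

print(sparsity(familiar))    # 0.0
print(sparsity(unfamiliar))  # 0.75
```

Tracking a statistic like this over the course of training, separately for in-distribution and out-of-distribution inputs, is one way to check whether sparsity shifts with familiarity as the paragraph above claims.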
The honest synthesis: the corpus argues that the bias that matters most for pulling entities apart is *prior training* — pretraining sharpens modularity, familiarity reshapes sparsity — with depth as the architectural assist for hierarchical composition. What it does *not* contain is dedicated object-centric / perceptual-binding work (slot attention, scene decomposition from pixels), so if your real question is about vision-style segregation, this collection answers the principle but not that specific literature.
Sources (3 notes)
Pruning experiments reveal that neural networks implement compositional subroutines in isolated subnetworks, with ablations affecting only their corresponding function. Pretraining substantially increases the consistency and reliability of this modular structure across architectures and domains.
MobileLLM shows deep-and-thin architectures yield 2.7–4.3% accuracy gains over balanced designs at 125M–350M scale by composing abstract concepts through layers rather than spreading parameters across width.
During pretraining, neural networks develop dense activations for familiar training data and default to sparse representations for unfamiliar inputs. This trend emerges without task-specific fine-tuning and reflects how models consolidate knowledge through exposure.