Can activation sparsity patterns guide the selection of in-context learning demonstrations?

This explores whether the sparsity of a model's internal activations — how few or many neurons fire on a given input — can be used as a signal to pick or order the examples you put in a prompt for in-context learning (ICL).

The corpus has a surprisingly direct answer, plus a web of supporting ideas around why it should work. The most on-the-nose result is a method that uses last-layer activation sparsity to order few-shot examples from sparse (treated as harder) to dense (treated as easier), getting solid gains with no external difficulty labels at all (see "Can representation sparsity order few-shot demonstrations effectively?"). So the literal answer is yes: sparsity isn't just a diagnostic, it's an actionable selection-and-ordering signal.
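The ordering step can be sketched in a few lines. This is a minimal illustration, not the method's actual implementation: the notes do not say how sparsity is measured, so the Hoyer measure used here is an assumption, as are the hypothetical function names `hoyer_sparsity` and `order_sparse_to_dense` and the toy activation vectors. In practice each vector would be the model's last-layer activations for one demonstration.

```python
import math

def hoyer_sparsity(vec):
    """Hoyer sparsity in [0, 1]: 0 for a uniform vector, 1 for a one-hot vector.

    Assumption: the source does not name a sparsity measure; this is one common choice.
    """
    n = len(vec)
    l1 = sum(abs(x) for x in vec)
    l2 = math.sqrt(sum(x * x for x in vec))
    if n < 2 or l2 == 0.0:
        return 0.0  # measure is undefined here; treat as fully dense
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

def order_sparse_to_dense(demos, activations):
    """Sort demos from most sparse (treated as harder) to most dense (treated as easier)."""
    scored = [(hoyer_sparsity(a), d) for d, a in zip(demos, activations)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]

# Toy last-layer activation vectors, made up for illustration only.
demos = ["demo A", "demo B", "demo C"]
activations = [
    [0.5, 0.5, 0.5, 0.5],  # dense: every unit equally active
    [0.0, 0.0, 2.0, 0.0],  # sparse: a single active unit
    [1.0, 0.0, 1.0, 0.0],  # in between
]
ordered = order_sparse_to_dense(demos, activations)
print(ordered)
```

Because the only inputs are activation vectors, nothing in this sketch needs a difficulty label, which is the property the note highlights.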

What makes this more than a one-off trick is *why* sparsity carries information about difficulty. Two notes supply the mechanism. Networks learn dense activations for inputs they've seen a lot and fall back to sparse activations for unfamiliar inputs (see "Is representational sparsity learned or intrinsic to neural networks?"). That pattern intensifies under pressure: hidden states sparsify systematically as tasks get harder or drift out-of-distribution, acting as a selective filter rather than a breakdown (see "Do language models sparsify their activations under difficult tasks?"). Read together, these say sparsity is a learned, readable proxy for 'how unfamiliar/hard is this input to me', which is exactly the latent variable a curriculum over demonstrations wants to sort on. The sparsity-guided method is essentially harvesting that signal.
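One way to check the proxy claim on your own data is to ask whether sparsity scores rise with an independent difficulty rating. The sketch below computes a Spearman rank correlation in plain Python. The helper names `average_ranks` and `spearman` are hypothetical, and the sparsity scores and difficulty ratings are invented placeholders, not results from any of the notes.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)

# Placeholder data: per-input sparsity scores and human difficulty ratings.
sparsity = [0.10, 0.40, 0.35, 0.80]
difficulty = [1, 2, 3, 4]
rho = spearman(sparsity, difficulty)
print(rho)
```

A strongly positive value on real data would be consistent with the sparsity-as-difficulty reading; a value near zero would suggest the signal does not transfer to that model or task.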

But here's the twist a curious reader might not expect: *which* demonstrations you pick may matter less than how you arrange and frame them. One study shows that simply moving an identical demo block from the start of the prompt to the end can swing accuracy by up to 20% and flip nearly half the predictions, a spatial bias entirely independent of content (see "How much does demo position alone affect in-context learning accuracy?"). And for sequential or decision-making tasks, what the model needs isn't well-chosen isolated examples at all, but full or partial same-environment trajectories ('burstiness') to generalize (see "Why do trajectories matter more than individual examples for in-context learning?"). So sparsity-based selection lives inside a larger design space where ordering and structure are co-equal levers.
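The position effect is easy to measure alongside any selection scheme. The sketch below builds two prompts that differ only in where the demo block sits, then compares accuracy and the fraction of flipped predictions between the two runs. Everything here is an illustrative assumption: the helper names, the simplification of "start" and "end" to before or after the query, and the hard-coded predictions, which stand in for real model outputs.

```python
def build_prompt(demos, query, position="start"):
    """Place an identical demo block before or after the query."""
    block = "\n".join(demos)
    if position == "start":
        return f"{block}\n\n{query}"
    if position == "end":
        return f"{query}\n\n{block}"
    raise ValueError("position must be 'start' or 'end'")

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def flip_rate(preds_a, preds_b):
    """Fraction of items whose prediction differs between two runs."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and equal length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

demos = ["Review: great film -> pos", "Review: dull plot -> neg"]
prompt_start = build_prompt(demos, "Review: loved it ->", position="start")
prompt_end = build_prompt(demos, "Review: loved it ->", position="end")

# Placeholder predictions standing in for model outputs under each layout.
labels = ["pos", "neg", "pos", "pos"]
preds_start = ["pos", "neg", "pos", "neg"]
preds_end = ["pos", "pos", "neg", "neg"]

acc_delta = accuracy(preds_start, labels) - accuracy(preds_end, labels)
flips = flip_rate(preds_start, preds_end)
print(acc_delta, flips)
```

Running a sparsity-ordered demo set through both layouts separates the gain that comes from the ordering itself from the gain or loss that comes purely from where the block is placed.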

There's also a cautionary thread worth pulling. Work on instruction tuning found models do nearly as well on semantically empty or wrong instructions as on correct ones; what transfers is knowledge of the output space, not the content (see "Does instruction tuning teach task understanding or output format?"). That raises a sharp question for any demonstration-selection scheme: are your sparsity-chosen examples teaching the task, or just steering format? The sparsity-as-difficulty story is compelling precisely because it claims to track genuine input familiarity rather than surface form, but it's the kind of claim that deserves the skeptical eye these format studies invite.

If you want the deeper rabbit hole, sparsity isn't only a selection signal: deliberately training *weights* to be sparse produces clean, human-interpretable circuits where neurons map to single concepts (see "Can sparse weight training make neural networks interpretable by design?"). That hints at a longer arc: the same property that lets you rank demonstrations by difficulty might eventually let you see *why* a given demonstration helps, turning prompt curation from an empirical art into something you can inspect.

Sources (7 notes)

Can representation sparsity order few-shot demonstrations effectively?

Sparsity-Guided Curriculum In-Context Learning uses last-layer activation sparsity to order demonstrations from sparse (harder) to dense (easier), yielding considerable performance improvements. This approach requires no external difficulty labels and works across diverse in-context learning tasks.

Is representational sparsity learned or intrinsic to neural networks?

During pretraining, neural networks develop dense activations for familiar training data and default to sparse representations for unfamiliar inputs. This trend emerges without task-specific fine-tuning and reflects how models consolidate knowledge through exposure.

Do language models sparsify their activations under difficult tasks?

As task difficulty increases, LLM hidden states become substantially sparser in a localized, systematic way that correlates with task unfamiliarity and reasoning load. This sparsification acts as a selective filter stabilizing performance under OOD shift rather than a failure mode.

How much does demo position alone affect in-context learning accuracy?

Repositioning an identical demo block from prompt start to end swings accuracy by up to 20% and flips nearly half of predictions. This spatial effect operates independently of demo content and spans multiple task types.

Why do trajectories matter more than individual examples for in-context learning?

In-context learning for sequential decision-making requires full or partial trajectories from the same environment level, not just isolated examples. This structural property—trajectory burstiness—allows models to generalize across vastly different tasks without weight updates.

Does instruction tuning teach task understanding or output format?

Models trained on semantically empty or deliberately incorrect instructions perform comparably to those trained on full, correct instructions, with a reported 43% vs. a random baseline of 42.6%. The semantic content of instructions appears largely irrelevant; what transfers is knowledge of the output space.

Can sparse weight training make neural networks interpretable by design?

Training transformers with sparse weights creates compact, human-interpretable circuits where neurons correspond to simple concepts with clear connections. Ablation studies confirm these circuits are necessary and sufficient for task performance, though scaling beyond tens of millions of parameters while maintaining interpretability remains unsolved.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether activation sparsity can guide in-context learning (ICL) demonstration selection. The question remains open: does sparsity genuinely identify task-relevant or difficulty-ranked examples, or does the effect dissolve under newer models, training regimes, or evaluation harnesses?

What a curated library found — findings span 2023–2026, treat as dated claims:
- Activation sparsity at the last layer orders few-shot demos from sparse (harder) to dense (easier), yielding gains without external labels (2024–2025).
- Networks learn dense activations for familiar inputs; sparsity emerges under OOD shift or task hardness as an adaptive filter, suggesting sparsity proxies unfamiliarity (2024–2026).
- Demo *position* in the prompt can swing accuracy by up to 20% and flip nearly half of predictions, independent of content; sequential tasks require full or partial same-environment trajectories, not isolated examples (2025).
- Instruction tuning teaches output-space distribution, not task semantics; models succeed on semantically empty/wrong instructions, raising whether sparsity-chosen demos teach tasks or steer format (2023).
- Weight sparsity produces interpretable disentangled circuits; this hints sparsity might eventually enable inspection of *why* a demo helps (2025–2026).

Anchor papers (verify; mind their dates):
- arXiv:2305.11383 (2023) — Do Models Really Learn to Follow Instructions?
- arXiv:2312.03801 (2023) — Generalization to New Sequential Decision Making Tasks with ICL
- arXiv:2507.22887 (2025) — Where to show Demos in Your Prompt: A Positional Bias of In-Context Learning
- arXiv:2603.03415 (2026) — Farther the Shift, Sparser the Representation

Your task:
(1) RE-TEST EACH CONSTRAINT. For sparsity-guided selection: does it survive on frontier models (o1, Claude 3.5 Sonnet, etc.) and on longer contexts or agentic setups (multi-turn, memory, caching)? Separate the durable claim ('sparsity correlates with model uncertainty') from the perishable one ('sparsity alone outperforms random selection'). Address the format-steering critique: does sparsity-ranked selection teach task structure or merely steer output distribution?
(2) Surface contradicting or superseding work from the last ~6 months. Does newer ICL scholarship on context length, retrieval, or dynamical systems flatten the sparsity signal, or deepen it?
(3) Propose 2 research questions that assume the regime may have shifted: (a) Can sparsity-guided selection survive or improve after post-training weight updates (e.g., RL, DPO)? (b) Does sparsity guidance transfer across model families, or is it an artifact of training?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
