INQUIRING LINE

Why do non-experts default to familiar chart types despite domain complexity?

This reads the question not as a data-viz problem but as a familiarity-over-complexity problem — why people fall back on what they've seen most often rather than what the situation actually calls for — and tests it against what the corpus has learned about the same reflex in machines.


This explores why familiarity, not difficulty, governs which tool a non-expert reaches for. The collection's most direct evidence comes from an unexpected place: studies of how reasoning systems break. The cleanest finding is that breakdowns are driven by instance-level *unfamiliarity*, not task-level *complexity* (Do language models fail at reasoning due to complexity or novelty?). Systems succeed when the case in front of them resembles cases they've already absorbed, and stumble when it doesn't, regardless of how hard the problem 'should' be. Read sideways, that's a precise account of the chart-defaulting reflex: a bar chart isn't chosen because the data is simple; it's chosen because it's the most-rehearsed instance the person has on hand. Complexity in the domain doesn't change the inventory of familiar moves.

The collection also explains why the familiar move can feel competent while being wrong for the job. Chain-of-thought reasoning degrades predictably once you step outside the training distribution, producing output that's fluent and well-formed but logically disconnected from the actual structure of the problem: it imitates the *form* of reasoning without the substance (Does chain-of-thought reasoning actually generalize beyond training data?). A familiar chart applied to an ill-fitting domain is exactly this: the right shape, the wrong logic. The 'Potemkin understanding' pattern names the same trap: a correct-looking explanation paired with failed application, a surface that tracks patterns rather than genuine grasp (How do LLMs fail to know what they seem to understand?). Defaulting to the familiar produces something that looks like analysis without being it.

What's quietly hopeful in the corpus is that the fix is rarely 'try harder' or 'know more'; it's *restructuring the workflow*. LLMs turn out to be far better forecasters than they appear, but only when the process separates numerical from contextual reasoning; monolithic, all-at-once prompting hides the capability entirely (Can LLMs actually forecast time series better than we think?). Likewise, apparent reasoning 'collapses' often turn out to be execution bottlenecks, not knowledge gaps: the system knows the move but can't carry it out under load (Are reasoning model collapses really failures of reasoning?). Translate that to a non-expert: the default chart isn't a knowledge failure, it's a structure failure. Given a scaffold that pulls apart 'what am I comparing' from 'how do I show it,' the better choice becomes reachable.
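One way to picture such a scaffold is a two-step lookup in which the comparison must be named before any chart is chosen. The sketch below is only illustrative: the `Comparison` categories, the `CHART_FOR` mapping, and the `choose_chart` helper are hypothetical names invented here, and the pairing of comparisons with chart types is an assumption rather than anything the collection's notes studied.

```python
from dataclasses import dataclass
from enum import Enum


class Comparison(Enum):
    """Step 1: 'what am I comparing?' (hypothetical categories for this sketch)."""
    MAGNITUDE_ACROSS_CATEGORIES = "magnitude across categories"
    CHANGE_BETWEEN_TWO_POINTS = "change between two points in time"
    PART_OF_WHOLE = "share of a whole"


# Step 2: 'how do I show it?' The mapping is an illustrative assumption.
CHART_FOR = {
    Comparison.MAGNITUDE_ACROSS_CATEGORIES: "bar chart",
    Comparison.CHANGE_BETWEEN_TWO_POINTS: "slope chart",
    Comparison.PART_OF_WHOLE: "pie chart",
}


@dataclass
class ChartDecision:
    comparison: Comparison
    chart: str


def choose_chart(comparison: Comparison) -> ChartDecision:
    """Require the comparison to be stated first, then look up a chart for it."""
    return ChartDecision(comparison=comparison, chart=CHART_FOR[comparison])


decision = choose_chart(Comparison.CHANGE_BETWEEN_TWO_POINTS)
print(f"{decision.comparison.value} -> {decision.chart}")
```

The design point is that the chart is never an input: the person only answers the first question, so the most-rehearsed chart cannot be reached for by habit.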

There's even a name for the in-between failure: wandering and underthinking, that is, abandoning a promising path early and falling back to a safer, more rehearsed one (Why do reasoning models abandon promising solution paths?). The non-expert who briefly considers a slope chart and retreats to a pie chart is underthinking in exactly this sense: a viable path exists but gets abandoned for the familiar before it's explored.

One honest caveat: this library is about AI and language models, not human visualization practice, so none of these notes studied chart choice directly. But the mechanism they converge on plausibly transfers: behavior is shaped by the distribution of what you've already seen, not by the demands of the case in front of you, and the remedy is better *process structure* rather than more effort. The takeaway: the chart-defaulting habit and an LLM's distribution-bounded reasoning look like the same phenomenon wearing different clothes.


Sources (6 notes)

Do language models fail at reasoning due to complexity or novelty?

Large reasoning models (LRMs) don't break at complexity thresholds but at instance-novelty boundaries. Models fit instance-based patterns rather than generalizable algorithms, so a reasoning chain of any length succeeds when the model was trained on similar instances.

Does chain-of-thought reasoning actually generalize beyond training data?

DataAlchemy experiments show CoT fails systematically under distributional shifts in task, length, and format. Models produce fluent but logically inconsistent reasoning — imitating reasoning form without valid underlying logic.

How do LLMs fail to know what they seem to understand?

LLMs show repeatable, empirically documented failure modes—from Potemkin understanding (correct explanation + failed application) to reasoning collapse under implicit constraints. These failures reveal gaps between statistical pattern-tracking and actual epistemic competence.

Can LLMs actually forecast time series better than we think?

LLMs have stronger intrinsic forecasting ability than recognized, but only when workflows separate numerical reasoning from contextual reasoning. Monolithic prompting obscures this capability; structured decomposition surfaces it.

Are reasoning model collapses really failures of reasoning?

Models confined to text-only generation cannot execute multi-step procedures at scale, even when they know the underlying algorithm. Tool-enabled models solve problems beyond the supposed reasoning cliff, suggesting the bottleneck is procedural execution bandwidth.

Why do reasoning models abandon promising solution paths?

Reasoning LLMs exhibit two reinforcing failures: wandering (invalid exploration) and underthinking (premature path-switching). Decoding-level interventions like thought-switching penalties improve accuracy without fine-tuning, suggesting viable solutions exist but are abandoned prematurely.
