How do LLMs lose information when translating natural language to formal logic?

This explores what gets dropped or distorted when an LLM rewrites a sentence as formal logic — the corpus points less at sloppy syntax and more at a deeper mismatch between how these models reason and what formalization demands.

The corpus reframes the loss as semantic, not syntactic. The most direct finding is that LLMs reliably produce well-formed logical expressions that are *wrong about meaning* (Can large language models translate natural language to logic faithfully?). The errors aren't random: they cluster at scope ambiguity, quantifier precision, and predicate granularity, exactly the places where everyday language is loose and logic demands commitment. "Everyone loves someone" has two readings: each person loves some, possibly different, person, or one particular person is loved by all. English shrugs, first-order logic forces a choice, and that's where information leaks out.
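To make the scope ambiguity concrete, here is a minimal sketch that evaluates both readings of "Everyone loves someone" on a toy model. The names `people` and `loves` and the two function names are illustrative assumptions, not taken from any cited note.

```python
# Hypothetical toy model: three people and a "loves" relation chosen for illustration.
people = ["ana", "ben", "cyd"]
loves = {("ana", "ben"), ("ben", "cyd"), ("cyd", "ana")}


def everyone_loves_someone_or_other(domain, rel):
    """Reading 1: for every x there exists a y such that loves(x, y)."""
    return all(any((x, y) in rel for y in domain) for x in domain)


def someone_is_loved_by_everyone(domain, rel):
    """Reading 2: there exists a y such that for every x, loves(x, y)."""
    return any(all((x, y) in rel for x in domain) for y in domain)


reading_1 = everyone_loves_someone_or_other(people, loves)
reading_2 = someone_is_loved_by_everyone(people, loves)

# Same English sentence, same model, different truth values.
print(reading_1, reading_2)  # True False
```

The two formulas use the same predicate and the same quantifiers and differ only in quantifier order, so a translation can be perfectly well-formed and still commit to the reading the speaker did not intend.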

Why does this happen? A second thread suggests the loss is baked into how these models reason at all. When you strip semantic content away and leave only the formal rules, LLM performance collapses: the models lean on token associations and parametric commonsense, not symbolic manipulation (Do large language models reason symbolically or semantically?). So when a model formalizes a sentence, it's not translating structure; it's pattern-matching surface meaning and hoping the symbols line up. Mechanistic work on syllogisms shows the same contamination from the inside: a content-independent reasoning circuit exists, but extra attention heads encoding world knowledge bias the output toward what's *plausible* over what's *valid*, and that bias grows with scale (How do language models perform syllogistic reasoning internally?). The information lost in translation is partly information the model overwrites with its own priors.
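The distinction between *valid* and *plausible* can be sketched with a brute-force validity check. This is an illustrative assumption about how to test validity, not the method used in the cited mechanistic work. It treats "All X are Y" in the modern sense (no existential import) and enumerates every model of three one-place predicates.

```python
from itertools import combinations

# Each individual is described by three booleans: (is_A, is_B, is_C).
# For one-place predicates, a model is fixed by which of these 8 types are inhabited.
TYPES = [(a, b, c) for a in (False, True) for b in (False, True) for c in (False, True)]
A, B, C = 0, 1, 2


def all_are(model, i, j):
    """'All I are J': every individual with property i also has property j."""
    return all(t[j] for t in model if t[i])


def is_valid(premises, conclusion):
    """Valid means no model makes every premise true and the conclusion false."""
    for size in range(1, len(TYPES) + 1):
        for model in combinations(TYPES, size):
            if all(p(model) for p in premises) and not conclusion(model):
                return False
    return True


# All A are B, all B are C, therefore all A are C.
barbara_valid = is_valid(
    [lambda m: all_are(m, A, B), lambda m: all_are(m, B, C)],
    lambda m: all_are(m, A, C),
)

# All A are B, all C are B, therefore all A are C (undistributed middle).
undistributed_middle_valid = is_valid(
    [lambda m: all_are(m, A, B), lambda m: all_are(m, C, B)],
    lambda m: all_are(m, A, C),
)

print(barbara_valid, undistributed_middle_valid)  # True False
```

Nothing in the check depends on what A, B, and C stand for, which is the property the content-biased attention heads are described as violating.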

The surprising turn is that *more* formalization makes the loss worse, not better. Full symbolic conversion strips away semantic nuance the logic can't carry, while pure language lacks structure. The sweet spot is partial abstraction that enriches natural language with selective symbolic scaffolding rather than replacing it, which buys 4–8% accuracy gains (Why does partial formalization outperform full symbolic logic?). In other words, the act of fully formalizing is itself a lossy compression step. A complementary fix is to stop asking the LLM to be both translator and reasoner: let it draft the symbolic form, then hand execution to a deterministic solver that returns machine-checkable error messages, which catches translation mistakes better than the model critiquing itself (Can symbolic solvers fix how LLMs reason about logic?).
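The draft-then-check loop can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Logic-LM's implementation. Python's own boolean expression syntax stands in for the logic, the deterministic checker only verifies well-formedness and vocabulary (a real solver would also run inference), and `scripted_translator` is a hypothetical stand-in for an LLM call.

```python
import ast


def check_formula(formula, known_symbols):
    """Deterministic check. Returns error messages; an empty list means the draft passed."""
    try:
        tree = ast.parse(formula, mode="eval")
    except SyntaxError as exc:
        return [f"SYNTAX_ERROR at column {exc.offset}: {exc.msg}"]
    unknown = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id not in known_symbols
    }
    return [f"UNKNOWN_SYMBOL: {name}" for name in sorted(unknown)]


def formalize_with_feedback(sentence, translate, known_symbols, max_rounds=3):
    """Ask the translator for drafts, feeding checker errors back until one passes."""
    feedback = []
    for _ in range(max_rounds):
        draft = translate(sentence, feedback)
        feedback = check_formula(draft, known_symbols)
        if not feedback:
            return draft
    return None  # no draft passed within the round budget


def scripted_translator(sentence, feedback):
    # Hypothetical stand-in for an LLM: the first draft uses an undeclared symbol,
    # and the second draft is produced once error feedback is supplied.
    return "not rain or wet" if feedback else "not raining or wet"


symbols = {"rain", "wet"}
first_errors = check_formula("not raining or wet", symbols)
result = formalize_with_feedback("If it rains, the grass is wet.", scripted_translator, symbols)
print(first_errors, result)
```

A check like this catches only malformed or out-of-vocabulary drafts. A well-formed formula that commits to the wrong reading passes unchanged, so a loop of this kind reduces translation errors without removing the semantic ones.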

There's a quieter form of loss worth naming too: the information that was never in the sentence to begin with. Faithful formalization requires surfacing unstated preconditions, and LLMs systematically fail to enumerate them unless explicitly forced, with accuracy jumping from 30% to 85% when prompted to make the implicit explicit (Do language models fail at identifying unstated preconditions?). Logic is unforgiving about assumptions that language leaves tacit.

If you want the broadest framing, the autoformalization failure is one instance of a documented family of epistemic failure modes, including "Potemkin understanding," where a model explains a concept correctly but cannot apply it, suggesting explanation and execution run on functionally disconnected pathways (Can LLMs understand concepts they cannot apply?; How do LLMs fail to know what they seem to understand?). The information lost in translation, on this reading, is a symptom of a wider gap between tracking statistical patterns and actually manipulating meaning.

Sources (8 notes)

Can large language models translate natural language to logic faithfully?

LLMs generate well-formed logical expressions that are semantically incorrect, with errors clustering at scope ambiguity, quantifier precision, and predicate granularity. The asymmetry suggests LLMs understand formal language better than they can generate it.

Do large language models reason symbolically or semantically?

When semantic content is decoupled from reasoning tasks, LLM performance collapses even with correct rules in context. Models rely on parametric commonsense and token associations rather than formal logical manipulation, constraining reasoning to training distribution semantics.

How do language models perform syllogistic reasoning internally?

LLMs implement a content-independent three-stage reasoning mechanism—recitation, middle-term suppression, mediation—that works across architectures. However, additional attention heads encoding world knowledge systematically bias conclusions toward semantically plausible rather than logically valid answers, with contamination increasing at larger scales.

Why does partial formalization outperform full symbolic logic?

QuaSAR and Logic-of-Thought both achieve 4-8% accuracy gains by enriching natural language with selective symbolic elements rather than replacing it. Full formalization loses semantic information; pure language lacks structure. Augmentation preserves both.

Can symbolic solvers fix how LLMs reason about logic?

Logic-LM divides cognitive labor by having LLMs formulate symbolic representations while deterministic solvers execute inference and provide machine-verifiable error messages. This structured feedback loop catches translation errors better than LLM self-critique, improving faithful reasoning without requiring perfect formalization.

Do language models fail at identifying unstated preconditions?

LLMs struggle not from lacking world knowledge but from failing to bring background conditions forward as relevant constraints. Prompting that forces explicit enumeration of preconditions raises accuracy from 30% to 85%, revealing the frame problem persists in statistical systems.

Can LLMs understand concepts they cannot apply?

Models can explain concepts accurately, fail to apply them, and recognize the failure—a triple pattern incompatible with human cognition. This indicates functionally disconnected explanation and execution pathways rather than simple knowledge gaps.

How do LLMs fail to know what they seem to understand?

LLMs show repeatable, empirically documented failure modes—from Potemkin understanding (correct explanation + failed application) to reasoning collapse under implicit constraints. These failures reveal gaps between statistical pattern-tracking and actual epistemic competence.
