How do retrieval systems handle feedback expressed as negations rather than preferences?

This explores how retrieval systems cope with feedback phrased as what a user *doesn't* want ("not this") rather than what they do — and why negation is awkward for systems built on similarity matching.

Negative feedback such as "doesn't look right" or "not for me" lands on machinery that was built to chase positive matches. The corpus suggests the tension is real and structural: embeddings measure association and similarity, not absence, so a negation has no natural home in retrieval space (Where do retrieval systems fail and why?). You can point a query *toward* "romantic"; there's no clean way to point it *away* from something. That mismatch forces a design choice, and the corpus shows two distinct answers.
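A toy sketch can make the mismatch concrete. It assumes a bag-of-words count vector as a stand-in for an embedding (real dense encoders differ in detail, and the document text is invented for illustration). The point is only that adding "not" to a query dilutes the match with a "romantic" document rather than reversing it.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (an assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical document a user might want to avoid.
doc = embed("romantic candlelit dinner spot")

positive = cosine(embed("romantic"), doc)
negated = cosine(embed("not romantic"), doc)

# The negated query still scores above zero against the document:
# "not" weakens the match but never pushes the query away from it.
print(f"romantic: {positive:.3f}  not romantic: {negated:.3f}")
```

In this toy setup the negated query remains positively associated with the very document it was meant to exclude, which is the behaviour the paragraph above describes.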

The first strategy is translation: convert the negation into a positive before it ever touches the retriever. A few-shot LLM can take "doesn't look good for a date" and rewrite it as "prefer more romantic," turning a complaint into a retrievable preference without any fine-tuning (Can language models bridge the gap between critique and preference?). This is elegant precisely because it sidesteps the representation problem — the retriever keeps doing what it's good at (finding similar things), and the LLM absorbs the burden of inverting the critique into a direction worth searching.
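The translation step might be sketched as follows. This is a minimal illustration, not the method from the cited note: `build_prompt`, `critique_to_query`, the instruction wording, and the second few-shot pair are all hypothetical, and the LLM is passed in as a plain `complete` callable so that no particular client library is assumed.

```python
from typing import Callable, List, Tuple

# Few-shot pairs of (critique, positive preference).
# The first pair comes from the text; the second is a made-up illustration.
FEW_SHOT: List[Tuple[str, str]] = [
    ("doesn't look good for a date", "prefer more romantic"),
    ("too loud to talk", "prefer quieter"),
]


def build_prompt(critique: str, examples: List[Tuple[str, str]] = FEW_SHOT) -> str:
    """Assemble a few-shot prompt that ends where the model should continue."""
    blocks = ["Rewrite each critique as a positive preference."]
    for example_critique, preference in examples:
        blocks.append(f"Critique: {example_critique}\nPreference: {preference}")
    blocks.append(f"Critique: {critique}\nPreference:")
    return "\n\n".join(blocks)


def critique_to_query(critique: str, complete: Callable[[str], str]) -> str:
    """Turn a negative critique into a positive query string for the retriever.

    `complete` is any function mapping a prompt to the model's continuation;
    which model or API sits behind it is left open here.
    """
    return complete(build_prompt(critique)).strip()
```

The returned string can then be embedded and searched like any other query, so the retriever itself stays untouched.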

The second strategy keeps the negation as a negation and uses it as contrast. In agentic RAG, training with *both* positive and negative step feedback — via DPO, which directly pits good retrieval chains against bad ones — outperforms methods that only reward final answers or learn from a single direction (Does supervising retrieval steps outperform final answer rewards?). Here negative feedback isn't a problem to be converted away; it's the more informative half of the signal. "This retrieval step was wrong" tells the system something a stream of "rights" cannot.
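To show why contrast is what the training signal consumes, here is a sketch of the standard DPO loss for a single pair of retrieval chains. The function and parameter names are illustrative, the log-probabilities are assumed to be already summed over each chain, and `beta=0.1` is an arbitrary choice rather than a value from the cited work.

```python
import math


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (good chain, bad chain) pair.

    loss = -log(sigmoid(beta * margin)), where the margin is how much more the
    policy favours the good chain over the bad one than the reference model does.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    x = beta * margin
    # -log(sigmoid(x)) equals softplus(-x); computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

The loss depends only on the difference between the two chains, so a rejected retrieval step contributes gradient in its own right instead of being discarded. With no margin at all, the loss sits at log 2, and it falls as the policy separates good chains from bad ones.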

What's quietly important is that not all negative feedback means the same thing. Annotation responses decompose into genuine preferences, non-attitudes, and constructed-on-the-spot preferences — and treating them uniformly contaminates the models trained on them (Do all annotation responses measure the same underlying thing?). A "no" born of a real stable dislike is a different object from a "no" the user invented because you asked. Systems that flatten negations into a single thumbs-down inherit that noise. And there's a feedback-loop hazard underneath all of it: ranking systems that learn naively from observed behavior converge on degenerate equilibria that just amplify their own past choices unless selection bias is modeled explicitly (Why do ranking systems need to model selection bias explicitly?) — meaning a system that mishandles negative signal doesn't just lose information, it can entrench its own mistakes.
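One way to act on the first point is to check whether a "no" survives being asked more than once. The heuristic below is purely illustrative: the cited note says only that the three response types are distinguishable by consistency across measurement conditions, so the function name, the rule, and the `min_answered` threshold are all assumptions.

```python
from typing import List, Optional


def classify_response(answers: List[Optional[bool]], min_answered: float = 0.5) -> str:
    """Label one user's reaction to one item, collected under several framings.

    Each entry is True (liked), False (disliked), or None (no opinion given).
    The rule and the threshold are illustrative assumptions, not a published method.
    """
    given = [a for a in answers if a is not None]
    if not answers or len(given) / len(answers) < min_answered:
        return "non-attitude"  # mostly no opinion: little real signal
    if all(a == given[0] for a in given):
        return "genuine"  # the same answer however it was asked
    return "constructed"  # the answer flips with the framing
```

A system could then weight "genuine" dislikes fully and discount the other two types, instead of flattening all three into one thumbs-down.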

The less obvious takeaway: "handling negations" splits into two philosophies that disagree on whether negation is a bug or a feature. Recommendation-style retrieval tends to *erase* the negation by transforming it into a preference; agentic, training-time retrieval tends to *preserve* it as the sharper learning signal. The split is not a matter of taste. One approach operates at query time over a frozen index, where negation can't be represented; the other operates at training time, where contrast is exactly what gradient methods consume.

Sources (5 notes)

Where do retrieval systems fail and why?

RAG systems fail at three structural levels: adaptive triggering (fixed intervals waste context), semantic-task mismatch (embeddings measure association, not relevance), and mathematical limits (embedding dimension constrains representable document sets). These require fundamentally different retrieval approaches, not tuning.

Can language models bridge the gap between critique and preference?

Few-shot LLM prompting can convert natural negative feedback like "doesn't look good for a date" into positive preferences like "prefer more romantic," enabling retrieval systems to find better-matching recommendations without fine-tuning.

Does supervising retrieval steps outperform final answer rewards?

Fine-grained feedback on intermediate retrieval steps significantly boosts agentic RAG performance compared to final-answer-only rewards. DPO trained with both positive and negative step feedback outperforms PPO and single-direction training by directly contrasting good and bad retrieval chains.

Do all annotation responses measure the same underlying thing?

Behavioral science reveals that annotations contain genuine preferences, non-attitudes, and constructed preferences—distinguishable by consistency across measurement conditions. Treating them uniformly contaminates reward model training and downstream alignment.

Why do ranking systems need to model selection bias explicitly?

YouTube's multi-objective ranker uses MMoE for conflicting objectives and a shallow position tower to remove selection bias from training data. Without both mechanisms, models converge on degenerate equilibria that amplify their own past decisions.
