INQUIRING LINE

Can gradient-based influence scores beat difficulty metrics for identifying valuable training data?

This pits two ways of deciding which training examples are worth keeping — gradient-based influence (does this example pull the model toward the target skill?) against difficulty metrics (how hard or redundant is this example?) — and asks which actually finds the valuable data.


This inquiry weighs two rival philosophies of data selection. Gradient influence asks whether an example moves the model toward a specific target capability; difficulty scoring ranks examples by how hard or redundant they are, independent of any goal. The corpus suggests neither wins cleanly: the answer depends on what 'valuable' means and for whom.

The gradient camp's strongest evidence is striking: LESS uses low-rank gradient features to pick the 5% of instruction data whose learning signal most resembles the target task, and training on that sliver beats training on the whole dataset (see "Can we train better models on less data?"). The reason is the interesting part: mixed datasets don't just dilute the signal, they actively hurt, because some examples shift the model's reasoning strategy away from what the task needs. Gradient influence is targeted, scoring each example relative to a destination. Difficulty metrics don't know the destination at all.
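To make "scores an example relative to a destination" concrete, here is a minimal sketch of gradient-similarity selection. It is not the LESS implementation: it assumes each training example and the target task have already been reduced to a small gradient feature vector, and it ranks examples by cosine similarity to the target. The function names, the toy vectors, and the `fraction` parameter are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def select_by_gradient_similarity(train_feats, target_feat, fraction=0.05):
    """Return indices of the top `fraction` of examples by similarity to the target.

    Assumption: train_feats[i] is a precomputed gradient feature for example i,
    and target_feat is the same kind of feature for the target task.
    """
    scores = [cosine(g, target_feat) for g in train_feats]
    k = max(1, int(len(train_feats) * fraction))
    ranked = sorted(range(len(train_feats)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores

# Toy 2-D "gradient features" (hypothetical values, for illustration only).
train_feats = [
    [1.0, 0.1],   # pulls toward the target
    [-1.0, 0.0],  # pulls directly away from the target
    [0.0, 1.0],   # orthogonal: irrelevant to the target
    [0.9, -0.1],  # pulls toward the target
]
target_feat = [1.0, 0.0]

selected, scores = select_by_gradient_similarity(train_feats, target_feat, fraction=0.5)
print(selected)
```

In this toy setup the second example gets a negative score. That is the sketch's stand-in for data that actively hurts, and a goal-agnostic difficulty score has no way to flag it.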

Difficulty scoring has its own impressive result, though. Ranking examples by difficulty signals like EL2N, forgetting, and memorization, then dropping the easy, redundant ones, lets data pruning beat power-law scaling: half of CIFAR-10 was removed with no accuracy loss, and self-supervised metrics scaled the approach to ImageNet (see "Can we prune training data without hurting model performance?"). So difficulty is cheap, task-agnostic, and powerful when your goal is general capability rather than a narrow target. The honest read: gradient influence wins when you have a specific target distribution to aim at; difficulty wins when you're compressing a general corpus and have no such target.
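For contrast, a difficulty score needs only the model's own predictions and the label. The sketch below computes an EL2N-style score, the L2 norm of the gap between predicted class probabilities and the one-hot label, then keeps the hardest fraction. It is a simplification under stated assumptions: a single set of predictions stands in for the averaging over early-training runs that is usually applied, and the helper names and toy probabilities are hypothetical.

```python
import math

def el2n_score(probs, label):
    """L2 norm of (predicted probabilities - one-hot label) for one example."""
    return math.sqrt(
        sum((p - (1.0 if j == label else 0.0)) ** 2 for j, p in enumerate(probs))
    )

def prune_easiest(examples, keep_fraction=0.5):
    """Keep the highest-scoring (hardest) examples; return their indices in order.

    Assumption: `examples` is a list of (probs, label) pairs from one model.
    """
    scored = [(el2n_score(probs, label), i) for i, (probs, label) in enumerate(examples)]
    scored.sort(reverse=True)  # hardest first
    k = max(1, int(len(examples) * keep_fraction))
    return sorted(i for _, i in scored[:k])

# Toy predictions for four examples whose true class is 0 (hypothetical values).
examples = [
    ([0.98, 0.01, 0.01], 0),  # confident and correct: easy
    ([0.40, 0.35, 0.25], 0),  # uncertain: harder
    ([0.10, 0.80, 0.10], 0),  # confidently wrong: hardest
    ([0.90, 0.05, 0.05], 0),  # confident and correct: easy
]

kept = prune_easiest(examples, keep_fraction=0.5)
print(kept)
```

Nothing in `el2n_score` refers to a target task, which is exactly what makes it cheap and general, and also what makes it blind to direction.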

Where the corpus gets genuinely useful is in showing that difficulty alone can be a trap, which quietly argues for influence-style thinking. Overly hard RLVR samples don't just waste compute. They induce degenerate shortcuts that contaminate skills the model already had, because rare accidental successes get treated as high-value trajectories (see "Do overly hard RLVR samples actually harm model capabilities?"). And teacher-refined data that is objectively higher quality still degrades a student when it sits past the student's learning frontier (see "Does teacher-refined data always improve student model performance?"). Both findings say the same thing: 'hard' or 'high quality' in the abstract is the wrong axis. What matters is whether the example is compatible with where this particular model can actually move, which is exactly the relational question gradient influence tries to answer and raw difficulty cannot.
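The "rare accidental success" effect can be seen in a few lines of arithmetic. The sketch below assumes the common mean-and-standard-deviation form of group-relative normalization, with binary rewards and a group of 16 rollouts per prompt. The group size, the reward values, and the `eps` constant are illustrative assumptions, not figures from the source.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of rollouts: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Nearly impossible prompt: 1 lucky success out of 16 rollouts.
hard_group = [1.0] + [0.0] * 15
# Moderately hard prompt: 8 successes out of 16 rollouts.
moderate_group = [1.0] * 8 + [0.0] * 8

adv_hard = group_relative_advantages(hard_group)
adv_moderate = group_relative_advantages(moderate_group)

# Advantage assigned to a successful rollout in each group.
print(round(adv_hard[0], 2), round(adv_moderate[0], 2))
```

Under these assumptions the single lucky success on the near-impossible prompt receives several times the advantage of a success on the moderate one, so whatever shortcut produced it is reinforced hardest.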

The thing you didn't know you wanted to know: the real competition isn't influence-vs-difficulty as scoring formulas, it's targeted-vs-agnostic as goals. Difficulty asks 'is this example hard?' Influence asks 'is this example hard *in the direction I want to go*?' The mounting evidence that mismatched-but-impressive data backfires suggests the field is drifting toward the second question — even when it still uses cheap difficulty proxies to approximate it.


Sources (4 notes)

Can we train better models on less data?

LESS uses low-rank gradient features to select instruction data most similar to target capabilities, and training on the selected 5% consistently outperforms full dataset training. The improvement occurs because mixed datasets contain examples that actively hinder specific skills by shifting reasoning strategy away from task requirements.

Can we prune training data without hurting model performance?

Ranking training examples by difficulty (EL2N, forgetting, memorization) and removing the easy ones lets data pruning beat power-law scaling. On CIFAR-10, 50% of the data was pruned without accuracy loss, and self-supervised metrics scaled the approach to ImageNet.

Do overly hard RLVR samples actually harm model capabilities?

Training on nearly-impossible problems causes models to learn degenerate shortcuts rather than genuine reasoning, and these shortcuts contaminate pre-existing capabilities. Group-relative normalization treats rare accidental successes as high-advantage trajectories, reinforcing answer repetition and computation-skipping instead of sound reasoning patterns.

Does teacher-refined data always improve student model performance?

Teacher-refined data degrades performance when it exceeds the student's learning frontier, even if objectively higher quality. Students should filter refinements using their own statistical profile to retain only compatible improvements.
