6.5 Avoiding Overfitting and False Patterns
Key Takeaways
- Overfitting happens when a candidate builds a rule around incidental details from too few examples.
- False patterns often come from visual salience, English assumptions, or answer-choice attraction.
- The best rule is usually the simplest one that explains all examples and transfers cleanly.
- Error review should identify whether the miss came from overfitting, underchecking, or feature confusion.
The danger of a rule that is too personal
Overfitting means building a rule that explains one example but fails the wider pattern. In visual-symbolic practice it usually happens when a candidate latches onto a vivid detail. If the first image is a red triangle above a box, you might decide the label means red, triangle, above, or box. Only comparison across several examples tells you which rule is real.
A false pattern can feel convincing because the human brain is built to find order. That strength becomes a weakness when evidence is thin. With one label and one image, many rules are possible. When the same label appears with different objects but the same relation, the relation becomes the strong candidate. When a new example breaks your rule, revise it immediately rather than defending it.
Favor the simplest sufficient rule
A simple rule is not automatically correct, but it is the right starting point. Prefer the rule with the fewest unsupported assumptions. If "ka" precedes labels for three cups, three keys, and three stars, then "ka = plural" is simpler than "ka = three portable objects shown on the left." The longer rule fits the examples but adds details the evidence never required, and those extra details are exactly what break on the transfer item.
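The preference for the fewest unsupported assumptions can be sketched in code. This is a minimal illustration, not a description of how any real test item is built: the feature encoding (`count`, `portable`, `side`), the transfer item, and the function names are all assumptions invented here. Each candidate rule is a list of conditions, and the simplest sufficient rule is the fitting rule with the fewest conditions.

```python
# Hypothetical feature encoding of the three "ka" examples (invented for illustration).
examples = [
    {"object": "cup", "count": 3, "portable": True, "side": "left"},
    {"object": "key", "count": 3, "portable": True, "side": "left"},
    {"object": "star", "count": 3, "portable": True, "side": "left"},
]

SHORT_RULE = "ka = plural"
LONG_RULE = "ka = three portable objects shown on the left"

# Each rule is a list of conditions; every condition is one assumption.
candidate_rules = {
    SHORT_RULE: [lambda ex: ex["count"] > 1],
    LONG_RULE: [
        lambda ex: ex["count"] == 3,
        lambda ex: ex["portable"],
        lambda ex: ex["side"] == "left",
    ],
}


def fits_all(conditions, items):
    """True if every condition holds for every item."""
    return all(cond(item) for item in items for cond in conditions)


def simplest_sufficient(rules, items):
    """Among rules that fit all items, return the one with the fewest conditions."""
    fitting = {name: conds for name, conds in rules.items() if fits_all(conds, items)}
    if not fitting:
        return None
    return min(fitting, key=lambda name: len(fitting[name]))


# Hypothetical transfer item: two houses shown on the right.
transfer_item = {"object": "house", "count": 2, "portable": False, "side": "right"}

print(simplest_sufficient(candidate_rules, examples))
print(fits_all(candidate_rules[SHORT_RULE], [transfer_item]))
print(fits_all(candidate_rules[LONG_RULE], [transfer_item]))
```

Both rules fit the three examples, so fit alone cannot separate them. Counting conditions picks the short rule, and only the short rule still holds on the invented transfer item.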
Overfitting signals
| Signal | What it suggests | Better response |
|---|---|---|
| Rule uses many visual details | Too specific | Strip to the changing feature |
| Rule fits only one example | Weak evidence | Find another contrast |
| Rule ignores a contradiction | Confirmation bias | Revise the rule |
| Rule depends on English wording | Outside assumption | Follow the item's own system |
| Rule cannot handle the new item | Poor transfer | Rebuild from the examples |
Answer choices actively encourage overfitting. A choice may reuse the symbol from the first example so it looks familiar, but familiarity is not proof. Ask whether that symbol tracks the tested feature across the whole set; if not, it is a distractor planted to catch the over-anchored test-taker.
Underfitting is the opposite trap
Underfitting uses a rule that is too broad. If you decide "pa means position" but the item distinguishes above from below, your rule is not specific enough to choose an answer. Good reasoning lands in the middle: specific enough to discriminate among the choices, broad enough to transfer across irrelevant changes. On the DLAB this balance helps strong candidates push from the 95 floor toward the 110 threshold needed for Category IV languages, because higher-difficulty items punish both extremes.
Build an error log
Keep a three-column error log: my rule, the contradiction, the corrected rule. For example: "I thought pa = circle; contradiction: pa also appeared with a square; corrected rule: pa = above." Recording only the right answer teaches nothing; recording the failed inference and its correction trains revision, which is the heart of fresh-rule reasoning. Finally, keep practice honest: because public official detail on DLAB item design is limited, use original invented-symbol drills rather than chasing supposed real DLAB items.
The learning target is the reasoning process, not memorized protected content, and overfitting control can be trained with any artificial system you build yourself.
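The three-column error log described above can be kept in any format. As one possible sketch, here it is as a small Python structure. The class name `ErrorLogEntry`, its field names, and the `is_complete` check are assumptions made for illustration; the single entry is the "pa" example from the text.

```python
from dataclasses import dataclass


@dataclass
class ErrorLogEntry:
    """One row of the three-column log: failed rule, contradiction, corrected rule."""
    my_rule: str
    contradiction: str
    corrected_rule: str

    def is_complete(self) -> bool:
        # An entry that records only the right answer skips the failed inference,
        # so all three columns must be filled in.
        return all(col.strip() for col in (self.my_rule, self.contradiction, self.corrected_rule))


def format_log(entries):
    """Render entries as plain-text rows, one per miss."""
    return "\n".join(
        f"{e.my_rule} | {e.contradiction} | {e.corrected_rule}" for e in entries
    )


log = [
    ErrorLogEntry(
        my_rule="pa = circle",
        contradiction="pa also appeared with a square",
        corrected_rule="pa = above",
    )
]

print(format_log(log))
```

The completeness check is the only logic here, and it enforces the point made above: an entry with just the corrected rule is rejected because it leaves out the inference that failed.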
The three engines of false patterns
Most wrong inferences trace to one of three sources. The first is visual salience: the brightest or biggest element grabs your hypothesis before any evidence supports it. The second is English transfer: you unconsciously assume the invented language orders words, marks plurals, or places adjectives the way English does. Because the DLAB deliberately constructs grammars that violate English habits, this assumption is punished by design. The third is answer-choice gravity: a distractor that reuses a familiar symbol pulls you toward it through mere recognition.
Naming these three engines in your error log lets you see which one keeps tripping you, and the fix differs for each: slow your first glance, suspend English defaults, and test choices against evidence rather than familiarity.
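Tallying the engines takes only a few lines once each miss is tagged. The sketch below assumes a hypothetical list of tagged misses; the data is invented purely to show the mechanics and does not reflect any real results.

```python
from collections import Counter

ENGINES = ("visual salience", "English transfer", "answer-choice gravity")

# Hypothetical tags copied from an error log, one per missed item.
misses = [
    "English transfer",
    "visual salience",
    "English transfer",
    "answer-choice gravity",
    "English transfer",
]


def tally(tags):
    """Count misses per engine, rejecting tags that are not one of the three engines."""
    unknown = [t for t in tags if t not in ENGINES]
    if unknown:
        raise ValueError(f"unrecognized engine tags: {unknown}")
    return Counter(tags)


counts = tally(misses)
most_frequent_engine, _ = counts.most_common(1)[0]
print(most_frequent_engine)
```

With this invented data the most frequent engine is English transfer, so the matching fix from the text would be to suspend English defaults.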
A worked overfitting trap
Consider examples where a red triangle is "ko" and a red circle is "ko," while a blue triangle is "mu." A hasty reader sees "ko" with a triangle in the first example and decides "ko = triangle." But "ko" also labels a circle, so the real constant is red. The blue triangle confirms it: shape changed within "ko" but color did not, and "mu" arrives exactly when color changes. The overfit "triangle" rule survives one example and dies on the second. The lesson is mechanical: never lock a rule until at least one example would have broken a wrong version of it. If your rule has not yet survived a potential falsifier, it is a guess.
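The same elimination can be run mechanically. The sketch below is one way to encode it, assuming a hypothetical representation of each example as a label plus a feature dictionary; `surviving_rules` is an illustrative name, not an established method. A candidate rule survives only if it holds for every example with the label and for no example with a different label.

```python
# The three examples from the text, in a hypothetical (label, features) encoding.
examples = [
    ("ko", {"color": "red", "shape": "triangle"}),
    ("ko", {"color": "red", "shape": "circle"}),
    ("mu", {"color": "blue", "shape": "triangle"}),
]


def surviving_rules(label, items):
    """Return the feature=value rules that every `label` example shares
    and that no differently labeled example satisfies."""
    same = [feats for lab, feats in items if lab == label]
    other = [feats for lab, feats in items if lab != label]
    if not same:
        return set()
    # Start with everything true of the first example, then let each
    # further example with the same label act as a potential falsifier.
    candidates = set(same[0].items())
    for feats in same[1:]:
        candidates &= set(feats.items())
    # A contrasting label removes any rule that also fits the contrast item.
    return {(k, v) for k, v in candidates if all(f.get(k) != v for f in other)}


print(surviving_rules("ko", examples[:1]))  # first example only: two rules still alive
print(surviving_rules("ko", examples))      # all three examples
```

With only the first example, both "ko = red" and "ko = triangle" remain, which is why a rule at that stage is still a guess. With all three examples, only `("color", "red")` survives.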
Calibrating between over- and underfitting
The sweet spot is the simplest rule that still discriminates among the answer choices. If the choices force a decision between above and below, "position" is too vague; if they merely test whether position matters at all, "above" may be more than you need. Let the answer set tell you how fine-grained your rule must be. This calibration is exactly what lets strong candidates clear the higher score bands, because the hardest items are engineered so that both a too-broad and a too-narrow rule fail.
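One way to check granularity is to count how many answer choices a rule selects. The sketch below assumes a hypothetical answer set and a hypothetical encoding in which a rule maps each feature to its allowed values. A rule that selects several choices is too broad, one that selects none is too narrow, and one that selects exactly one is fine-grained enough.

```python
# Hypothetical answer choices for an item whose label is "pa".
choices = {
    "A": {"relation": "above", "shape": "square"},
    "B": {"relation": "below", "shape": "square"},
    "C": {"relation": "below", "shape": "circle"},
}


def matches(rule, choice):
    """A rule maps each feature to the set of values it allows."""
    return all(choice.get(feature) in allowed for feature, allowed in rule.items())


def calibrate(rule, answer_set):
    """Classify a rule by how many answer choices it selects."""
    hits = sorted(name for name, feats in answer_set.items() if matches(rule, feats))
    if len(hits) == 1:
        return "discriminates", hits
    if not hits:
        return "too narrow", hits
    return "too broad", hits


too_broad = {"relation": {"above", "below"}}             # "pa = position"
calibrated = {"relation": {"above"}}                     # "pa = above"
too_narrow = {"relation": {"above"}, "shape": {"circle"}}  # "pa = circle above"

for rule in (too_broad, calibrated, too_narrow):
    print(calibrate(rule, choices))
```

In this invented answer set, "pa = position" selects all three choices, "pa = above" selects only A, and "pa = circle above" selects nothing, so the answer set itself shows how specific the rule needs to be.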
What is overfitting in visual-symbolic reasoning?
Practice-style: You thought "pa" meant circle, but a square above a box is also labeled with "pa." What correction is most reasonable?
Which error-log entry is the most useful for improving on missed items?