Computer Adaptive Testing
Key Takeaways
- CAT selects each next item from a running estimate of your ability, updated after every answer, and targets a question you have about a 50% chance of answering correctly.
- You cannot skip, mark for review, or return to a previous question — every answer is final.
- A harder-looking item often signals you are doing well; difficulty is not a sign you are failing.
- The algorithm estimates ability continuously and stops at exactly 100 items for the MLS form.
How The Adaptive Engine Works
Computer adaptive testing (CAT) does not deliver a fixed, pre-printed form. The engine maintains a running estimate of your ability and updates it after every response. Answer an item correctly and the next item is harder; miss one and the next is easier. The engine targets items where you have roughly a 50% chance of answering correctly, because items matched to your ability level carry the most measurement information. After 100 items the engine has a precise ability estimate and converts it to a scaled score.
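One common way to make "most measurement information" precise is the one-parameter logistic (Rasch) model. The source does not say which model the BOC uses, so treat this as an illustrative assumption; the symbols $\theta$ (candidate ability) and $b$ (item difficulty) are introduced here for the sketch only.

$$
P(\text{correct} \mid \theta, b) = \frac{1}{1 + e^{-(\theta - b)}}, \qquad I(\theta) = P\,(1 - P)
$$

Under this model the information $P(1 - P)$ peaks at $P = 0.5$, which occurs exactly when $b = \theta$. An item far too easy ($P = 0.95$) contributes only $0.95 \times 0.05 \approx 0.05$, compared with $0.25$ for an item pitched at the candidate's level.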
Because the next item depends on your last answer, you cannot skip, change, or revisit any question. There is no "mark for review" panel and no back button. This is the single biggest behavioral difference from school exams, and it has concrete consequences:
- Read each stem carefully the first time — there is no second pass.
- Never leave an item blank hoping to return; you cannot.
- Commit to your best answer, then mentally close that item and focus forward.
- An unanswered item at the time limit counts against you, so always answer rather than run out the clock.
Why Hard Questions Are A Good Sign
Many candidates panic when items feel relentlessly difficult and conclude they are failing. The opposite is usually true. Because CAT pushes toward items you have about a 50% chance of getting right, a well-prepared candidate is fed progressively harder content as their ability estimate climbs. Sustained difficulty often means the engine has placed you at a high ability level. Conversely, a string of easy items can mean the estimate dropped. The correct mindset is to ignore perceived difficulty entirely — you cannot infer your score mid-exam, and trying to do so wastes time and breaks concentration.
CAT Versus A Fixed-Form Score
| Feature | Fixed-Form Test | MLS CAT |
|---|---|---|
| Item selection | Same for everyone | Tailored to your ability |
| Skip / return | Usually allowed | Not allowed — answers are final |
| What drives the score | Raw number correct | Difficulty of items you answer correctly |
| Two candidates' forms | Identical | Unique to each candidate |
| Best response to a hard item | May skip and return | Answer it; it may signal high ability |
Common trap: assuming there is a fixed "number correct" passing threshold. There is not. Two candidates can answer the same number of items correctly and receive different scaled scores because they answered items of different difficulty. A candidate who answers 60 hard items right can outscore one who answers 70 easy items right. This is why third-party practice platforms that report "you got 72%" cannot predict your BOC outcome: they have no access to the BOC item-difficulty calibration. Use practice scores to find weak content areas, never to forecast pass/fail.
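The 60-hard-versus-70-easy comparison can be sketched numerically. This is a minimal illustration assuming a Rasch-style model and a maximum-likelihood ability estimate; the function names and the difficulty values (+1.0 for "hard", -1.0 for "easy") are hypothetical, and nothing here reflects the BOC's actual calibration or scoring procedure.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct answer under a one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_ability(difficulties, responses, iterations=50):
    """Maximum-likelihood ability estimate via Newton-Raphson.

    difficulties: item difficulties on the logit scale (made-up values)
    responses: 1 for correct, 0 for incorrect, in the same order
    """
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_p(theta, b) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information
    return theta

# Candidate A: 60 of 100 hard items correct (difficulty +1.0, assumed).
theta_a = estimate_ability([1.0] * 100, [1] * 60 + [0] * 40)
# Candidate B: 70 of 100 easy items correct (difficulty -1.0, assumed).
theta_b = estimate_ability([-1.0] * 100, [1] * 70 + [0] * 30)

print(f"A: raw 60, ability {theta_a:.2f}")
print(f"B: raw 70, ability {theta_b:.2f}")
```

With these invented difficulties, candidate A ends up with the higher ability estimate despite the lower raw count, which is the pattern the paragraph above describes.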
The only official outcome metric is the scaled score, covered in the scoring section.
How The Engine Decides Your Ability Estimate
Think of CAT as a continuous narrowing of a confidence interval around your true ability. The first item is drawn near the middle of the difficulty pool. Each subsequent response moves the estimate up or down and shrinks the uncertainty band. By item 100 the band is tight enough for the BOC to convert your ability estimate to a single scaled score. The MLS exam is fixed at 100 items rather than variable-length: every candidate answers the same number of items, and precision comes from matching item difficulty to ability, not from adding more items.
Two consequences follow that change exam-day behavior:
- Early items matter disproportionately. Because the estimate starts near the middle, the first 10-15 answers set the trajectory of difficulty you will see. Read those carefully; do not warm up by guessing.
- A wrong answer is recoverable. One miss nudges the estimate down and feeds you a slightly easier item; answering that correctly pushes back up. No single item decides pass/fail, so a missed early question should not trigger panic that cascades into more misses.
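The narrowing of the uncertainty band can be simulated in a few lines. This is a rough sketch under several assumptions that are not from the source: a Rasch-style response model, a standard-normal starting belief, a grid-based posterior mean as the estimate, and an idealized pool that always has an item exactly at the current estimate. The simulated candidate's `true_ability` of 1.0 and the `seed` are arbitrary.

```python
import math
import random

GRID = [g / 10.0 for g in range(-40, 41)]  # candidate ability values, -4.0 to 4.0

def rasch_p(theta, b):
    """Probability of a correct answer under a one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def posterior_mean_sd(weights):
    """Mean and standard deviation of the belief stored on GRID."""
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def simulate_cat(true_ability, n_items=100, seed=0):
    rng = random.Random(seed)
    # Start from a standard-normal belief: the estimate begins mid-scale.
    weights = [math.exp(-0.5 * t * t) for t in GRID]
    history = []
    for _ in range(n_items):
        estimate, _ = posterior_mean_sd(weights)
        difficulty = estimate  # idealized pool: an item exactly at the estimate
        correct = rng.random() < rasch_p(true_ability, difficulty)
        # Bayes update: weight each grid point by the likelihood of this response.
        weights = [
            w * (rasch_p(t, difficulty) if correct else 1.0 - rasch_p(t, difficulty))
            for t, w in zip(GRID, weights)
        ]
        history.append(posterior_mean_sd(weights))
    return history

history = simulate_cat(true_ability=1.0)
for n in (1, 10, 50, 100):
    mean, sd = history[n - 1]
    print(f"after item {n:3d}: estimate {mean:+.2f}, uncertainty {sd:.2f}")
```

In this toy setup the uncertainty column shrinks as items accumulate, while any single response moves the estimate by a smaller amount later in the exam than it does near the start.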
Behaviors That Help Versus Hurt Under CAT
| Situation | Helpful Response | Harmful Response |
|---|---|---|
| An item feels very hard | Reason it out; it may signal a high estimate | Conclude you are failing and rush the rest |
| You realize you misread item 12 | Accept it; move on, recover on later items | Obsess over it through items 13-20 |
| A run of easy items appears | Stay careful; answer accurately | Assume the exam is "giving up" on you |
| Time is tight near item 90 | Answer every remaining item, best guess | Leave items blank to "save" the score |
Worked example of the recoverability principle: a candidate misses items 8 and 9 (two stains they confused). The estimate dips and items 10-12 are easier; they answer those correctly, the estimate climbs, and by item 30 the difficulty is back at a high level. Their final scaled score reflects the overall pattern of difficulty mastered, not those two early stumbles. Contrast a candidate who, rattled by the same two misses, second-guesses every subsequent answer and misses several more — that compounding behavior, not the original two errors, is what actually lowers the score. CAT rewards steady, forward-only focus.
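The two candidates in this worked example can be traced with a small sketch. It assumes a Rasch-style model, a standard-normal starting belief, and a next item pitched exactly at the current estimate; the response patterns are invented to mirror the story (both miss items 8 and 9, then one stays steady and one unravels) and do not come from real exam data.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # ability values from -4.0 to 4.0

def p_correct(theta, b):
    """Probability of a correct answer under a one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def trajectory(responses):
    """Return the ability estimate after each response (1 = correct, 0 = miss)."""
    weights = [math.exp(-0.5 * t * t) for t in GRID]  # standard-normal belief
    estimate, path = 0.0, []
    for correct in responses:
        difficulty = estimate  # next item is pitched at the current estimate
        weights = [
            w * (p_correct(t, difficulty) if correct else 1.0 - p_correct(t, difficulty))
            for t, w in zip(GRID, weights)
        ]
        estimate = sum(t * w for t, w in zip(GRID, weights)) / sum(weights)
        path.append(estimate)
    return path

# Both candidates answer items 1-7 correctly, then miss items 8 and 9.
first_nine = [1, 1, 1, 1, 1, 1, 1, 0, 0]
# Steady candidate: recovers on items 10-12 and keeps answering mostly correctly.
steady = trajectory(first_nine + [1, 1, 1] + [1, 0, 1, 1] * 4 + [1, 1])
# Rattled candidate: second-guesses and misses most of the remaining items.
rattled = trajectory(first_nine + [1, 0, 0] + [0, 1, 0, 0] * 4 + [0, 1])

for n in (7, 9, 12, 30):
    print(f"item {n:2d}: steady {steady[n - 1]:+.2f}, rattled {rattled[n - 1]:+.2f}")
```

Both trajectories are identical through item 9, so any gap at item 30 comes entirely from what happened after the two misses.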
Midway through the exam, the questions feel increasingly difficult. What does this most likely indicate?
Why can two candidates answer the same raw number of questions correctly yet receive different scaled scores?
Which exam-day behavior is REQUIRED by the MLS CAT format?