
The Base Rate Fallacy: When a Positive Test Isn't Good Evidence

Morgan Voss

A veterinary clinic has a diagnostic test for a rare feline condition. The test is 99% accurate: it correctly identifies cats with the condition 99% of the time, and correctly clears healthy cats 99% of the time. Your cat tests positive. How worried should you be?

Most people, including many clinicians, answer: very worried. After all, the test is almost never wrong. But the correct answer depends on a number that the test's accuracy says nothing about: how common the condition is in the first place.

Running the Numbers

Suppose the condition affects 1% of cats. Imagine testing 10,000 cats.

Of those 10,000, roughly 100 have the condition. The test correctly identifies 99 of them. One is missed (a false negative).

The remaining 9,900 are healthy. The test correctly clears 9,801 of them. But 1% of 9,900 cats, that is 99 cats, test positive despite being perfectly healthy. These are false positives.

Now count the positive tests: 99 true positives plus 99 false positives, totaling 198 positive results. Of those 198, only 99 are genuine cases. The probability that a positive test actually indicates the condition is $99/198 = 0.50$.

A positive result from a 99% accurate test, applied to a population where the condition is 1% prevalent, is still only 50/50. The rare base rate has done something that feels impossible: it made half the positive tests wrong.
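The counting argument above can be replayed in a few lines of Python. This is only a sketch: the variable names are illustrative, and the counts are rounded to whole cats as in the text.

```python
# Counting sketch for a hypothetical population of 10,000 cats.
population = 10_000
prevalence = 0.01      # 1% of cats have the condition
sensitivity = 0.99     # share of sick cats the test flags
specificity = 0.99     # share of healthy cats the test clears

sick = round(population * prevalence)
healthy = population - sick

true_positives = round(sick * sensitivity)
false_negatives = sick - true_positives
true_negatives = round(healthy * specificity)
false_positives = healthy - true_negatives

total_positives = true_positives + false_positives
share_genuine = true_positives / total_positives

print(f"{total_positives} positives, {true_positives} genuine "
      f"({share_genuine:.0%} of positives)")
```

Changing `prevalence` while leaving the other two numbers alone is the quickest way to see how the split between true and false positives moves.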

Bayes' Theorem

The formal version of this calculation is Bayes' theorem:

$$P(\text{disease} \mid \text{positive}) = \frac{P(\text{positive} \mid \text{disease}) \cdot P(\text{disease})}{P(\text{positive})}$$

Define the terms:

  • Sensitivity: $P(\text{positive} \mid \text{disease}) = 0.99$
  • Prevalence: $P(\text{disease}) = 0.01$
  • Specificity: $P(\text{negative} \mid \text{no disease}) = 0.99$, so $P(\text{positive} \mid \text{no disease}) = 0.01$

The denominator requires the law of total probability:

$$P(\text{positive}) = P(\text{positive} \mid \text{disease}) \cdot P(\text{disease}) + P(\text{positive} \mid \text{no disease}) \cdot P(\text{no disease})$$

$$= 0.99 \times 0.01 + 0.01 \times 0.99 = 0.0099 + 0.0099 = 0.0198$$

Substituting:

$$P(\text{disease} \mid \text{positive}) = \frac{0.99 \times 0.01}{0.0198} = \frac{0.0099}{0.0198} = 0.50$$

The formula confirms what the counting argument showed. This quantity is called the positive predictive value (PPV), and it depends on prevalence in a way that test accuracy alone cannot capture.
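For readers who prefer code to notation, here is a minimal sketch of the same calculation as a Python function. The function name and argument names are illustrative, not standard; the body is just Bayes' theorem with the denominator expanded by the law of total probability.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(disease | positive) for a test applied to a population."""
    false_positive_rate = 1.0 - specificity            # P(positive | no disease)
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1.0 - prevalence))  # P(positive)
    return sensitivity * prevalence / p_positive


# The cat example: 99% sensitivity, 99% specificity, 1% prevalence.
ppv = positive_predictive_value(0.99, 0.99, 0.01)
print(f"PPV = {ppv:.2f}")
```

Because of floating-point rounding the result may differ from 0.5 in the last few digits, which is why the example formats it to two decimal places.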

Prevalence Changes Everything

The PPV is not fixed by the test. Vary the prevalence and the result shifts dramatically.

At 10% prevalence, the same test gives $P(\text{disease} \mid \text{positive}) \approx 0.92$. At 0.1% prevalence, it drops to about $0.09$: nine out of every ten positives are false alarms. The test hasn't changed. The population being screened has.

This is why mass screening programs for rare conditions must account for the base rate carefully. A test that performs well in a high-risk population can be nearly useless in a general population screen, not because the test is bad, but because healthy animals now vastly outnumber sick ones, so false positives dominate the pool of positive results.
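A short sweep makes the dependence on prevalence concrete. The sketch below holds the test fixed at 99% sensitivity and 99% specificity and varies only the prevalence; the helper `ppv` and the chosen prevalence values are illustrative.

```python
def ppv(sensitivity, specificity, prevalence):
    """P(disease | positive), computed from true- and false-positive mass."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test, three different populations.
prevalences = [0.10, 0.01, 0.001]
results = {p: ppv(0.99, 0.99, p) for p in prevalences}

for p, value in results.items():
    print(f"prevalence {p:.1%}: PPV = {value:.2f}")
```

The three rows correspond to the 10%, 1%, and 0.1% cases discussed above.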

The Prosecutor's Fallacy

A related error goes by the name the prosecutor's fallacy. It involves confusing $P(\text{positive} \mid \text{no disease})$ with $P(\text{no disease} \mid \text{positive})$. The first is the false positive rate. The second is what actually matters for deciding whether a positive result is meaningful.

In courtroom arguments, this error appears as: "The probability of this evidence occurring by chance is only 1 in a million, therefore the defendant is almost certainly guilty." That conclusion requires knowing the prior probability of guilt, which is separate from the probability of the evidence. Treating the two as interchangeable is the fallacy.

The cat testing scenario makes the structure visible. A 1% false positive rate does not mean a positive test result has only a 1% chance of being wrong. It means the test generates a false positive on 1% of healthy animals. What fraction of positive results are false depends on how many healthy animals are being tested relative to sick ones.
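Computing the two conditional probabilities side by side shows how far apart they can be. This sketch reuses the cat example's numbers (99% sensitivity, 99% specificity, 1% prevalence); the variable names are illustrative.

```python
sensitivity, specificity, prevalence = 0.99, 0.99, 0.01

# P(positive | no disease): the false positive rate, a property of the test.
p_pos_given_healthy = 1.0 - specificity

# P(positive): total probability of a positive across sick and healthy cats.
p_pos = (sensitivity * prevalence
         + p_pos_given_healthy * (1.0 - prevalence))

# P(no disease | positive): the share of positives that are false alarms.
p_healthy_given_pos = p_pos_given_healthy * (1.0 - prevalence) / p_pos

print(f"P(positive | no disease) = {p_pos_given_healthy:.2f}")
print(f"P(no disease | positive) = {p_healthy_given_pos:.2f}")
```

The first quantity stays near 0.01 whatever population is tested; the second moves with the prevalence.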

What This Changes

For anyone interpreting test results, the lesson is that sensitivity and specificity describe the test. Positive predictive value describes the test applied to a specific population. These are different things. Quoting a test's accuracy without stating the prevalence gives only half the information needed.

The math is not complicated. The step that requires deliberate effort is remembering to ask for the base rate at all. It is easy to look at a positive result from a highly accurate test and conclude the diagnosis is settled. It often isn't. The prior probability of the condition is load-bearing, and ignoring it leads to conclusions that the numbers don't actually support.
