
The Hot Hand

Morgan Voss

Your cat successfully leaps to a high shelf three times in a row. It is tempting to say she is in the zone. Cats do have good days; coordination and confidence vary. But random sequences produce streaks. Flipping a fair coin long enough will give you runs of five or six heads, and the coin has no zone to be in.

The question is not whether a streak occurred. The question is whether the success rate during a streak is actually higher than the baseline, or whether the streak is exactly what the baseline probability would predict.

The Gilovich, Vallone, and Tversky Study

In 1985, Thomas Gilovich, Robert Vallone, and Amos Tversky published a study asking whether basketball players shoot better following made shots than following misses. Belief in the hot hand was nearly universal among players, coaches, and fans. The data, they argued, did not support it.

They analyzed sequences of free throw attempts and field goal attempts from NBA players. If the hot hand is real, a player's probability of making a shot should be measurably higher when the previous shot was made. What they found was that conditional probabilities showed no consistent pattern. The probability of a made shot following a made shot was not reliably higher than the probability of a made shot following a miss.
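The comparison described here can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `conditional_hit_rates` and the example sequence are made up, and shots are assumed to be recorded as 1 (made) or 0 (missed). It is the naive estimator, the same one the sampling bias section later takes issue with.

```python
def conditional_hit_rates(shots):
    """Return (P(hit | previous hit), P(hit | previous miss)) for a 0/1 sequence.

    A rate is None when the conditioning event never occurs.
    """
    # Pair each shot with the one before it and split by the earlier outcome.
    after_hit = [cur for prev, cur in zip(shots, shots[1:]) if prev == 1]
    after_miss = [cur for prev, cur in zip(shots, shots[1:]) if prev == 0]

    def rate(outcomes):
        return sum(outcomes) / len(outcomes) if outcomes else None

    return rate(after_hit), rate(after_miss)


# Made-up sequence for illustration only (1 = made, 0 = missed).
example_shots = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
p_after_hit, p_after_miss = conditional_hit_rates(example_shots)
print(p_after_hit, p_after_miss)
```

In this toy sequence the rate after a hit is lower than the rate after a miss, which says nothing about real players; ten shots are far too few to separate the two.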

This was a striking result, and an uncomfortable one. It suggested that human pattern recognition was systematically detecting a signal that was not there.

What Random Sequences Actually Look Like

The intuition behind hot hand belief is partly that long streaks feel unlikely. They feel like they require an explanation beyond chance. This intuition is wrong in a specific, calculable way.

For a sequence of $n$ independent Bernoulli trials with success probability $p$, the expected length of the longest run of successes grows logarithmically with $n$. For 100 coin flips with $p = 0.5$, the expected length of the longest run of heads is roughly $\log_2(100) \approx 6.6$ (finer approximations put it nearer 6). A run of six or seven heads in 100 flips is not extraordinary. It is what you should expect.
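A quick simulation is an easy way to check the scale of that number. The sketch below is illustrative only: the function names, the seed, and the trial count are arbitrary choices, and it uses nothing beyond Python's standard `random` module.

```python
import random


def longest_run(flips, target=1):
    """Length of the longest consecutive run of `target` in `flips`."""
    best = current = 0
    for flip in flips:
        current = current + 1 if flip == target else 0
        best = max(best, current)
    return best


def mean_longest_run(n_flips=100, p=0.5, n_trials=5000, seed=0):
    """Monte Carlo estimate of the expected longest run of successes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        flips = [1 if rng.random() < p else 0 for _ in range(n_flips)]
        total += longest_run(flips)
    return total / n_trials


estimate = mean_longest_run()
print(f"mean longest run of heads in 100 flips: {estimate:.2f}")
```

Changing `p` to a shooter's base rate shows how quickly long runs become routine when the success probability is above one half.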

More precisely, the expected number of runs of length $k$ or more in $n$ trials is:

$$E[\text{runs of length} \geq k] \approx (n - k + 1)(1 - p) \cdot p^k$$

At $n = 100$, $p = 0.5$, $k = 6$: this gives roughly 0.74 expected runs of six or more heads, or about 1.5 if runs of tails are counted as well. They are going to happen. Calling one of them a hot hand is naming ordinary variation.
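The arithmetic can be checked directly. The sketch below evaluates three related quantities: the approximation as written, an exact version of the same count (the derivation is in the docstring and is my own working, not taken from the studies), and a window count that drops the $(1 - p)$ factor and therefore counts overlapping stretches separately. All function names are illustrative.

```python
def expected_runs_exact(n, p, k):
    """Exact expected number of distinct success runs of length >= k.

    A run starts at trial 1 with probability p**k, and at each of the
    n - k later start positions with probability (1 - p) * p**k, because
    the preceding trial must be a failure.
    """
    if k > n:
        return 0.0
    return p**k + (n - k) * (1 - p) * p**k


def expected_runs_approx(n, p, k):
    """The approximation from the text: (n - k + 1)(1 - p) p^k."""
    return (n - k + 1) * (1 - p) * p**k


def expected_windows(n, p, k):
    """Expected number of length-k windows that are all successes.

    Overlapping windows are counted separately, so one long run
    contributes several windows.
    """
    return (n - k + 1) * p**k


n, p, k = 100, 0.5, 6
print(expected_runs_exact(n, p, k))    # 0.75
print(expected_runs_approx(n, p, k))   # 0.7421875
print(expected_windows(n, p, k))       # 1.484375
```

The gap between the first two numbers is the edge effect at the start of the sequence. The third is about twice as large because, for a fair coin, the average qualifying run spans roughly two overlapping windows.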

The Sampling Bias Correction

The Gilovich et al. result held for decades until Joshua Miller and Adam Sanjurjo identified a subtle but significant problem in the analysis, in a paper published in 2018.

The standard approach to testing the hot hand is to look at sequences like HHTHHTH and ask: among shots following two made shots (HH), what fraction are also made? The intuition is that this estimates the conditional probability $P(\text{hit} \mid \text{previous two hits})$.

It does not. There is a sampling bias. When you condition on a streak of hits within a finite sequence and then look at the next shot, you are systematically more likely to be looking at a shot that ended the streak. The streak-ending shot is overrepresented relative to a random draw from all shots following a streak.

This bias causes the naive estimator to understate the hot hand effect. Miller and Sanjurjo showed that when the analysis is corrected for this bias, the original Gilovich et al. data actually contains statistically significant evidence for the hot hand in some contexts.
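The bias can be seen exactly in tiny cases by brute force. The sketch below enumerates every equally likely fair-coin sequence of length `n`, computes the naive proportion of hits following a streak of `k` hits in each, and averages over the sequences where that proportion is defined. The function names are hypothetical, and this is an illustration of the effect rather than Miller and Sanjurjo's own procedure.

```python
from fractions import Fraction
from itertools import product


def proportion_after_streak(seq, k):
    """Fraction of hits among trials that immediately follow k straight hits.

    Returns None when no trial in `seq` follows such a streak.
    """
    followers = [
        seq[i] for i in range(k, len(seq)) if all(seq[i - k:i])
    ]
    if not followers:
        return None
    return Fraction(sum(followers), len(followers))


def expected_naive_estimate(n, k):
    """Average the naive estimate over all 2**n equally likely fair-coin
    sequences, skipping sequences where it is undefined."""
    values = [
        proportion_after_streak(seq, k)
        for seq in product((0, 1), repeat=n)
    ]
    defined = [v for v in values if v is not None]
    return sum(defined) / len(defined)


print(expected_naive_estimate(3, 1))   # 5/12
```

For three flips of a fair coin, the average proportion of heads following a head is 5/12, not 1/2, even though every flip is independent. A player whose observed rate after hits merely equals the base rate is therefore shooting better after hits than an independent process would predict.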

The claim is not that the hot hand definitely exists or is large. It is that the original analysis was subtly flawed, and the corrected analysis is less conclusive in the negative direction than was thought.

Why Pattern Recognition Overshoots

Even granting some hot hand effect, the psychological point stands. Human pattern recognition reliably over-detects streaks. A sequence that looks clustered to an observer will often be statistically indistinguishable from a random sequence with the same base rate.

Part of this is the availability of narrative. A made shot following two made shots is memorable in a way that a made shot following a miss is not. The streak gets noticed; the regression to base rate does not. Over time, this selective attention builds a mental model that overweights streak evidence.

The same process governs the interpretation of the cat's shelf-jumping streak. Three successes in a row is consistent with a cat whose baseline success rate is 70%, and also with one whose rate is 50%. A sample of three tells you very little about which it is. The streak is real. The inference from the streak to elevated ability requires much more data than the streak itself provides.
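The arithmetic behind that claim is short, assuming each jump is independent with a fixed success probability $p$ (a simplification, since real cats get tired or bored):

$$P(\text{3 of 3} \mid p = 0.7) = 0.7^3 = 0.343, \qquad P(\text{3 of 3} \mid p = 0.5) = 0.5^3 = 0.125$$

The likelihood ratio is $0.343 / 0.125 \approx 2.7$, which favors the higher rate only mildly. A 50% cat lands any given three jumps in a row one time in eight.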

What the Research Shows

The hot hand question is not fully resolved. The evidence for a small hot hand effect in some settings is credible. The inference that a streak signals dramatically elevated ability is not.

What the research establishes clearly is that the intuitive standard for what a random sequence looks like is calibrated wrong. Streaks feel like they need explaining. They often don't. Before invoking a mechanism, it is worth asking whether the streak falls within the range of what would be expected from the known base rate. Usually, it does.
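One way to make that check concrete is to compute how likely a streak of a given length is under the base rate alone. The sketch below does this exactly with a small dynamic program, assuming independent trials with a constant success probability. The function name `prob_run_at_least` is made up for illustration.

```python
def prob_run_at_least(n, p, k):
    """Exact probability of at least one success run of length >= k
    in n independent trials with success probability p.

    state[j] is the probability that no such run has occurred yet and
    the sequence currently ends in exactly j consecutive successes.
    """
    if k <= 0:
        return 1.0
    state = [1.0] + [0.0] * (k - 1)
    reached = 0.0
    for _ in range(n):
        new_state = [0.0] * k
        new_state[0] = sum(state) * (1 - p)   # a failure resets the run
        for j in range(k - 1):
            new_state[j + 1] = state[j] * p   # a success extends the run
        reached += state[k - 1] * p           # run of k completed
        state = new_state
    return reached


print(prob_run_at_least(3, 0.5, 2))   # 0.375
```

For the cat, `prob_run_at_least(3, 0.5, 3)` is 0.125, and the probability of seeing three in a row somewhere only grows as the number of attempts `n` increases.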
