
Anecdote and Data

Morgan Voss

Your cat refuses to eat a particular brand of food. She approaches the bowl, sniffs it once, and walks away. This is real information about one cat. It is a sample of size one from a distribution you don't have access to.

It does not establish that cats generally refuse this food. It does not establish that the food is bad. It establishes that one data point exists, and you have no way to know from this alone whether it is representative or an outlier. This is the problem with anecdotes: not that they are false, but that they are insufficient on their own, and the degree of insufficiency depends on context.

A Sample of One

A sample of size one has a mean: the observation itself. It does not have a sample variance. With a single data point, there is no basis for estimating spread, and therefore no basis for constructing a confidence interval. Formally, the interval is unbounded.

$$\bar{x} = x_1, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \text{ is undefined for } n = 1$$

The estimate exists. Its precision is entirely unknown. You know the cat rejected this food today. You don't know whether she rejects it 90% of the time or whether today was an anomaly.
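A minimal Python sketch of the same point, assuming the rejection is encoded as 0 and an accepted meal as 1 (that encoding is an illustration, not something the cat supplied). The standard library computes the mean of one observation but refuses to compute a sample variance.

```python
import statistics

# Hypothetical encoding: 0 = walked away, 1 = ate the food.
observations = [0]

# The mean of a single observation is the observation itself.
mean = statistics.mean(observations)

# The sample variance divides by n - 1, so it is undefined for n = 1.
# The statistics module signals this by raising StatisticsError.
try:
    variance = statistics.variance(observations)
except statistics.StatisticsError:
    variance = None

print("mean:", mean)
print("variance:", variance)
```

The error is the library's way of saying what the formula says: there is an estimate, and nothing to measure its spread with.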

This does not make the observation worthless. It makes it underspecified. Whether it is useful depends on what else you know and what question you are trying to answer.

How the Prior Changes the Calculation

At $n = 1$, Bayesian updating gives us the clearest account of what the observation is worth. Bayes' theorem tells us that the posterior probability is proportional to the likelihood of the observation times the prior:

$$P(\theta \mid x) \propto P(x \mid \theta) \cdot P(\theta)$$

When the prior $P(\theta)$ is strong and concentrated, a single contrary observation barely moves it. When the prior is weak and spread across many possibilities, a single observation can shift the posterior substantially.

If you already know from manufacturer data that this brand is accepted by 95% of cats in trials, one rejection is mildly informative. The posterior probability that your cat is one of the rejectors shifts upward from the 5% base rate, but a single observation does not overturn the prior. If you have no prior at all about this food's acceptance rate, the single rejection is the entirety of your evidence, and it anchors the posterior heavily.
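One way to make the contrast concrete is a Beta prior on the acceptance rate, updated by a single rejection. This is a sketch under assumptions of my own choosing: the strong prior is modeled as Beta(95, 5), as if the trial data were worth 100 observations, and "no prior at all" is approximated by a uniform Beta(1, 1). The function names are illustrative.

```python
def update_beta(alpha, beta, accepted, rejected):
    """Conjugate update: add observed counts to the Beta parameters."""
    return alpha + accepted, beta + rejected


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed priors on the acceptance rate (both are illustrative choices).
strong_prior = (95.0, 5.0)  # roughly "95% of 100 trial cats accepted"
weak_prior = (1.0, 1.0)     # uniform stand-in for "no prior at all"

# One observation: the cat rejected the food.
strong_post = update_beta(*strong_prior, accepted=0, rejected=1)
weak_post = update_beta(*weak_prior, accepted=0, rejected=1)

strong_shift = beta_mean(*strong_prior) - beta_mean(*strong_post)
weak_shift = beta_mean(*weak_prior) - beta_mean(*weak_post)

print(f"strong prior: {beta_mean(*strong_prior):.3f} -> {beta_mean(*strong_post):.3f}")
print(f"weak prior:   {beta_mean(*weak_prior):.3f} -> {beta_mean(*weak_post):.3f}")
```

Under these assumptions the strong prior's mean moves by less than one percentage point, while the uniform prior's mean drops from one half to one third. The same observation, two very different updates.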

This is why dismissing anecdotes entirely is also a mistake. "That's just one person's experience" is accurate about the size of the sample. It is not accurate about the information content, which depends on the prior. In domains where base rates are unknown and formal studies don't exist, a sample of one may be all there is.

When Anecdotes Are Genuinely Informative

Several conditions push the informational value of a single observation upward.

If the event is rare and there is no better data, one observation of an unusual outcome is meaningful evidence. A single documented case of a medication interaction that was thought impossible raises the probability that the interaction is possible at all: not from zero to certainty, but from near zero to something clearly nonzero. That's a significant update.
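The size of that update can be sketched with a two-hypothesis Bayes calculation. Every number below is invented for illustration: a prior of 0.001 that the interaction is possible, a 0.5 chance of seeing a documented case if it is, and a 0.01 chance of a spurious report if it is not.

```python
def posterior_possible(prior, p_report_if_possible, p_report_if_impossible):
    """Posterior probability that the interaction is possible, given one report."""
    numerator = prior * p_report_if_possible
    evidence = numerator + (1.0 - prior) * p_report_if_impossible
    return numerator / evidence


# All three inputs are hypothetical values chosen only to show the mechanics.
posterior = posterior_possible(
    prior=0.001,
    p_report_if_possible=0.5,
    p_report_if_impossible=0.01,
)

print(f"posterior: {posterior:.4f}")
```

With these made-up inputs the probability rises from 0.1% to a little under 5%: far from certainty, but roughly a fiftyfold increase. A prior of exactly zero would not move at all under this rule, which is why "thought impossible" is modeled here as a very small prior rather than zero.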

Source matters too. A veterinary nutritionist who reports that a cat refused a food after an initial period of normal acceptance is providing a more informative signal than the same observation from someone who has owned one cat for six months. The expert has calibrated priors. They know what baseline variation looks like and can tell you whether the rejection is unusual.

The harder case is an anecdote that contradicts a strong consensus. One contrary observation raises the probability that the consensus is wrong. It does not establish that it is. The appropriate response is further investigation, not adoption and not dismissal.

The Mistake in Both Directions

Overclaiming from anecdotes is the more frequently discussed error. A personal experience of dramatic weight loss on a particular diet does not establish that the diet works at the population level. The anecdote may be genuine and the general claim may still be false. Selection effects alone can produce a pool of enthusiastic positive anecdotes for any intervention: the people who had no effect, or a negative one, are not the ones writing testimonials.
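The selection effect is easy to reproduce in a small simulation. The sketch below assumes a diet with a true average effect of zero, outcomes drawn from a normal distribution with a 3 kg standard deviation, and testimonials written only by people who lost more than 2 kg. All of those numbers, and the function name, are hypothetical.

```python
import random
import statistics


def simulate_testimonials(n_people=10_000, threshold_kg=2.0, seed=0):
    """Simulate weight lost (kg) on a diet whose true mean effect is zero."""
    rng = random.Random(seed)
    # Positive values mean weight lost; the diet itself does nothing on average.
    outcomes = [rng.gauss(0.0, 3.0) for _ in range(n_people)]
    # Only people with a dramatic loss bother to write a testimonial.
    testimonials = [kg for kg in outcomes if kg > threshold_kg]
    return outcomes, testimonials


outcomes, testimonials = simulate_testimonials()

print(f"population mean loss:  {statistics.mean(outcomes):.2f} kg")
print(f"testimonial mean loss: {statistics.mean(testimonials):.2f} kg")
print(f"share who wrote in:    {len(testimonials) / len(outcomes):.0%}")
```

Every testimonial in this simulation is true. The pool of testimonials still describes a diet that does not exist.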

But the opposite error is also real. Dismissing anecdotal evidence because it does not come from a randomized controlled trial ignores that the anecdote still carries information. It is evidence. It may be weak evidence. It may be non-representative. But your estimate should move, even if only slightly, when a credible person reports a clear observation.

The distinction that matters is between "not worthless" and "sufficient." An anecdote can be not worthless, can genuinely shift your probability estimates, without being sufficient evidence for a strong claim. These are different thresholds, and conflating them produces both overclaiming and reflexive dismissal.

What to Do With One Data Point

The practical posture is to treat the anecdote as a weak update and ask what it would take to strengthen it. One cat refusing one food is a data point. Ten cats from different households refusing the same food is a pattern worth taking seriously. A peer-reviewed feeding trial is a different category of evidence altogether.
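The step from one cat to ten can be sketched as a narrowing posterior. This assumes a uniform Beta(1, 1) prior on the food's acceptance rate and treats each household as an independent observation of the same rate. Both are simplifying assumptions, and the function name is illustrative.

```python
import math


def beta_posterior_summary(rejections, acceptances, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean and standard deviation of the acceptance rate."""
    a = prior_alpha + acceptances
    b = prior_beta + rejections
    mean = a / (a + b)
    # Standard deviation of a Beta(a, b) distribution.
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


one_mean, one_sd = beta_posterior_summary(rejections=1, acceptances=0)
ten_mean, ten_sd = beta_posterior_summary(rejections=10, acceptances=0)

print(f"one cat:  mean {one_mean:.3f}, sd {one_sd:.3f}")
print(f"ten cats: mean {ten_mean:.3f}, sd {ten_sd:.3f}")
```

Ten independent rejections pull the estimated acceptance rate much lower and shrink its uncertainty to about a third of the single-cat value. If the ten cats shared a household, or a bad batch, the independence assumption would fail and the narrowing would be overstated.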

The sample of one tells you that the outcome is possible and has occurred at least once. That is real information. It is not sufficient to act on if the action is consequential and better evidence is obtainable. It may be sufficient to motivate further inquiry.

Your cat eventually got a different brand. Whether that was the right statistical decision is unclear. She ate it, which is a second data point, and the estimate updated accordingly.
