
How Charts Mislead Without Lying

Morgan Voss

A chart of your cat's weight over two months shows a dramatic upward slope. The line climbs steeply from one side of the panel to the other. It looks like a crisis. The y-axis runs from 4.1 kg to 4.4 kg. The actual change is 300 grams over two months. Noticeable, maybe worth a conversation with a vet, but not the trajectory the chart implies.

The data is accurate. The chart is not lying. It is, however, doing real work to direct your interpretation, and that work is not neutral.

The Truncated Axis

The slope a reader sees depends on the range shown on the vertical axis, not on the size of the change itself: for a fixed panel height, the drawn slope is inversely proportional to the axis range. Compress that range and any change looks dramatic. Expand it and the same data looks flat.

The weight chart above spans 0.3 kg of variation. If the axis started at zero, the cat's weight curve would barely deviate from horizontal. Starting at 4.1 kg and ending at 4.4 kg makes the range appear to be the entire story, when it is roughly 7% of the cat's actual weight.
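The arithmetic behind that impression can be sketched in a few lines. The helper name `panel_fraction` is made up for this illustration and is not part of any plotting library; it computes what share of the panel height a change occupies for a given axis range, using the cat numbers above.

```python
def panel_fraction(low, high, axis_min, axis_max):
    """Share of the vertical panel height covered by the change from low to high."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return (high - low) / (axis_max - axis_min)


# Cat weights from the example: 4.1 kg rising to 4.4 kg.
truncated = panel_fraction(4.1, 4.4, axis_min=4.1, axis_max=4.4)
zero_based = panel_fraction(4.1, 4.4, axis_min=0.0, axis_max=4.4)

print(f"truncated axis:  {truncated:.0%} of panel height")
print(f"zero-based axis: {zero_based:.0%} of panel height")
```

On the truncated axis the change fills the whole panel; on a zero-based axis the same 300 grams covers about 7% of it. Neither number is wrong. They answer different questions.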

This is not always dishonest. If the meaningful variation in a phenomenon is genuinely small relative to zero, forcing the axis to include zero destroys the signal. Stock prices, body temperatures, and manufacturing tolerances all have this property. A chart of human body temperature starting at 0°C would be useless.

The question to ask is not "does this axis start at zero" but "what is the natural baseline for this variable, and does the axis choice obscure meaningful context." Those are different questions, and the answer depends on what's being measured.

Dual Y-Axes

Any two trends can be made to look correlated by adjusting their axes independently. This is the dual y-axis problem.

Take two entirely unrelated time series. Plot them on the same chart with separate vertical scales. Adjust each scale until the lines move together visually. The chart will appear to show a strong relationship. There is no statistical content here. The appearance of correlation was designed in.
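A minimal sketch of why this works, assuming each y-axis is auto-scaled to its own series' minimum and maximum (a common default, but an assumption here). The two series are invented for illustration, and `screen_positions` is a hypothetical helper, not a library function.

```python
def screen_positions(values):
    """Map values to panel heights in [0, 1], as an axis fitted to min/max would."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    return [(v - lo) / (hi - lo) for v in values]


# Two made-up, unrelated series on very different scales (illustrative data only).
ice_cream_sales = [120, 135, 150, 170, 185, 200]   # units per week
server_count = [3.0, 3.4, 3.7, 4.3, 4.6, 5.0]      # thousands of machines

left_axis = screen_positions(ice_cream_sales)
right_axis = screen_positions(server_count)

# Largest vertical distance between the two drawn lines, as a share of panel height.
max_gap = max(abs(a - b) for a, b in zip(left_axis, right_axis))
print(f"largest on-screen gap between the lines: {max_gap:.1%} of panel height")
```

Once each series is stretched to fill its own axis, the units and magnitudes disappear from the picture. Any two series that both rise over the window will sit nearly on top of each other, whatever they measure.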

Dual axes are not always deceptive. When the two series are causally related and the reader knows this, plotting them together at different scales can be legitimate. The problem is that the chart provides no information about whether the scale alignment is meaningful or chosen for effect. Two trends that happen to share a time axis get a free correlation upgrade.

The check is simple: ask whether the scales are independently adjustable, and if so, whether the chosen alignment has a justification beyond visual appearance.

Cherry-Picked Time Windows

The trend visible in a chart depends heavily on where it starts and ends. A company's stock price may look like steady growth over 18 months and like a collapsing failure over three years, depending on the window chosen.

This is most common in advocacy contexts. A policy effect can be made to look dramatic by starting the chart at exactly the right moment. An inconvenient trend can be made to look like noise by zooming out far enough. The data at every point in the chart is accurate. The selection of which data to show is where the argument lives.
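The effect of the window can be shown with a single invented series. The prices below are made up for illustration: a fall from 100 to 40 over the first 18 months, then a recovery to 60 over the next 18. `percent_change` is a hypothetical helper.

```python
def percent_change(series, start, end):
    """Percent change from series[start] to series[end]."""
    return (series[end] - series[start]) / series[start] * 100


# Illustrative monthly prices, months 0 through 36 (not real data).
decline = [100 - 60 * m / 18 for m in range(19)]      # months 0..18: 100 down to 40
recovery = [40 + 20 * m / 18 for m in range(1, 19)]   # months 19..36: up to 60
prices = decline + recovery

last_18_months = percent_change(prices, 18, 36)
full_three_years = percent_change(prices, 0, 36)

print(f"last 18 months:   {last_18_months:+.0f}%")
print(f"full three years: {full_three_years:+.0f}%")
```

The same series reads as a 50% gain or a 40% loss depending only on where the chart begins. Every plotted point is accurate in both versions.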

The practical defense is to ask what came before and after. If the time window begins at a convenient inflection point, that inflection point needs explanation. It should not just be where the story starts.

Area Encoding

Bubble charts encode a quantity as the area of a circle. This is appropriate. The problem arises when the radius, rather than the area, is made proportional to the value.

The area of a circle is $A = \pi r^2$. If country A has a GDP twice that of country B, the correct representation gives country A a circle with $\sqrt{2}$ times the radius, so that its area is twice as large. If instead the radius is doubled, the area becomes four times as large. The visual difference between the two countries is exaggerated by a factor of two.

$$\frac{A_{\text{wrong}}}{A_{\text{correct}}} = \frac{\pi (2r)^2}{\pi (\sqrt{2} \, r)^2} = \frac{4\pi r^2}{2\pi r^2} = 2$$

This error is common enough to be a known failure mode of data visualization tools. The reader perceives area; the chart encodes radius. The result systematically overstates differences between large and small values.
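A short sketch of the two encodings, using the twofold GDP example above. The function names and the reference radius of 10 are assumptions made for illustration.

```python
import math


def radius_for_value(value, reference_value, reference_radius):
    """Radius that keeps circle area proportional to value (area encoding)."""
    return reference_radius * math.sqrt(value / reference_value)


def circle_area(radius):
    return math.pi * radius ** 2


radius_b = 10.0                                        # country B, arbitrary units
radius_a_correct = radius_for_value(2.0, 1.0, radius_b)  # area encoding
radius_a_wrong = radius_b * 2.0                          # radius encoding

correct_ratio = circle_area(radius_a_correct) / circle_area(radius_b)
wrong_ratio = circle_area(radius_a_wrong) / circle_area(radius_b)

print(f"area encoding:   A looks {correct_ratio:.1f}x the size of B")
print(f"radius encoding: A looks {wrong_ratio:.1f}x the size of B")
```

When checking a tool, it helps to know which quantity its size parameter controls. In matplotlib, for instance, the `s` argument of `scatter` is documented as marker size in points squared, so passing values proportional to the data gives area encoding.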

When reading a bubble chart, it is worth asking whether the documentation states what is encoded in the size. Area grows with the square of the radius, so the two encodings diverge more as the ratio between values grows, and the choice is not always disclosed.

Reading Charts Critically

None of these techniques require malice to appear. Default chart settings produce truncated axes routinely. Dual axes get used because they are available. Time windows get chosen because the analyst started with the data she had. The result in each case is a chart that communicates something its creator may not have explicitly intended.

The practical posture is skepticism about scale choices, not about charts in general. Before reading the trend, read the axes. Note the range. Ask whether a different range would change the visual impression and, if so, whether the chosen range is justified by the data's natural scale. For bubble charts, find the legend. For dual axes, ask whether the scales were set before or after the lines started matching.

The chart is not lying. But it has opinions, and they are worth identifying before you adopt them.
