Tagged: undergraduate
Articles tagged with undergraduate in Stats and Cats.
- Anecdote and Data
What personal experience can and cannot tell you. Anecdote is not worthless: it is a sample of one from an unknown distribution. The weight it deserves depends on what else is known.
- Florence Nightingale's Rose Diagrams
How Nightingale used polar area charts to make mortality data legible to people who would not read a table, and why the design choices were deliberate and effective.
- How Charts Mislead Without Lying
Truncated axes, dual y-axes, cherry-picked time windows, and area encoding errors. What to look for before trusting a chart.
- Law of Large Numbers vs. the Gambler's Fallacy
The law of large numbers is a theorem. The gambler's fallacy is a mistake. They sound related and are easy to confuse. They say opposite things.
- The Multiple Comparisons Problem
Run enough tests at a 0.05 threshold and something will almost certainly look significant by chance. What the family-wise error rate means, why it matters, and what to do about it.
- Simpson's Paradox: When Subgroups Disagree With the Aggregate
How a trend that holds within every subgroup can reverse when those groups are combined, and why the Berkeley admissions data remains the clearest illustration.
- Statistical Power: Why Small Studies Often Find Nothing
Power is the probability of detecting an effect that actually exists. A study that finds nothing may simply have been too small to find anything. Here's what determines power and why it matters before collecting data.
- Survivorship Bias: The Sample You're Not Seeing
When you study only the outcomes that made it through a filter, you are not studying outcomes. You are studying a selection process. The WWII plane problem and why it matters wherever data gets filtered.
- The Birthday Problem: Why 23 People Is Enough
In a room of 23 people, the probability that at least two share a birthday exceeds 50%. The math is clean; the intuition resists it. Here's what's actually being counted.
- The Hot Hand
Streaks in basketball shooting data and whether they reflect genuine elevated performance or expected clustering in random sequences. A case study in what random actually looks like.
- The Monty Hall Problem: Why You Should Always Switch
The conditional probability problem that has produced more confident wrong answers than almost any other. Switching wins with probability 2/3, and the host's knowledge is why.
- Variance and Standard Deviation: Why Spread Matters
Two distributions with identical means can behave entirely differently. Variance and standard deviation quantify that difference, and understanding the mechanics behind them reveals what they actually capture.
- What a p-value Actually Measures
A p-value is not the probability the null hypothesis is true, not a measure of effect size, and not a verdict on whether a finding is real. Here is what it is.
- The Binomial Distribution: Counting Successes in Fixed Trials
How the binomial distribution models the number of successes in a fixed number of independent trials, and why the formula looks the way it does.
- The Poisson Distribution: Modeling Rare Events at a Known Rate
The Poisson distribution models counts of independent events occurring at a constant rate. One parameter does everything, and that turns out to be enough.
- The Central Limit Theorem: Why Averages Behave
Individual observations can follow nearly any distribution. Average enough of them together, and the result converges toward normal. Here's why that happens and why it matters.
- Regression to the Mean: Why Exceptional Performance Doesn't Last
Extreme outcomes tend to be followed by more ordinary ones. This is not a psychological phenomenon. It is a mathematical one, with real implications for how we evaluate causes and interventions.
- Type I and Type II Errors: The Trade-Off You Can't Avoid
False positives and false negatives cannot both be minimized at once. At a fixed sample size, the threshold that reduces one will increase the other. Where it gets set is a choice, and it matters.
- What Confidence Intervals Actually Tell You
A 95% confidence interval does not mean a 95% probability that the true value is inside it. Here is what the statement actually means, and why the distinction is worth getting right.
- Discrete vs Continuous Distributions: PMF, PDF, and CDF
The difference between discrete and continuous probability distributions, explained through the PMF, PDF, and CDF, with cat examples that are doing actual work.
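
Two of the figures quoted in the list above (23 people for the birthday problem, 2/3 for switching in Monty Hall) can be checked with exact arithmetic. The sketch below is illustrative only and is not taken from the articles themselves; the function names are hypothetical, and it assumes 365 equally likely birthdays and the standard three-door game in which the host always opens a goat door.

```python
from fractions import Fraction


def birthday_collision_probability(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely, independent birthdays (no leap years).
    """
    p_all_distinct = Fraction(1)
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= Fraction(days - k, days)
    return 1 - p_all_distinct


def monty_hall_switch_win_probability():
    """Exact chance of winning by switching in the standard three-door game.

    Enumerates the car's position for a fixed first pick; the host
    knowingly opens a goat door that is not the contestant's pick.
    """
    doors = {0, 1, 2}
    first_pick = 0
    wins = 0
    for car in doors:
        host_options = doors - {first_pick, car}
        opened = min(host_options)  # which goat door is opened does not change the result
        switched_to = (doors - {first_pick, opened}).pop()
        wins += switched_to == car
    return Fraction(wins, len(doors))


if __name__ == "__main__":
    print(float(birthday_collision_probability(22)))
    print(float(birthday_collision_probability(23)))
    print(monty_hall_switch_win_probability())
```

Using `Fraction` keeps the results exact, so the 50% crossing between 22 and 23 people is not an artifact of floating-point rounding.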
