"Correlation does not imply causation" is one of the most important principles in statistics. Yet it is violated constantly in news articles, social media, and even published research. Understanding this distinction can save you from costly mistakes in business decisions, policy-making, and scientific conclusions.

What is Correlation?

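Two variables are correlated if they tend to change together: when one goes up, the other tends to go up (positive correlation) or down (negative correlation). The strength of a linear relationship is measured by the Pearson correlation coefficient r ∈ [−1, +1], where values near ±1 indicate a strong linear association and values near 0 indicate little or none.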
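To make the coefficient concrete, here is a minimal Python sketch that computes r from its definition: the covariance of the two variables divided by the product of their standard deviations. The function name pearson_r and the small dataset are illustrative assumptions, not taken from any particular tool.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))   # unnormalised covariance
    var_x = sum(a * a for a in dx)             # unnormalised variances
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 70, 72]

r_pos = pearson_r(hours, score)                 # strong positive correlation
r_neg = pearson_r(hours, [-s for s in score])   # same strength, opposite sign
print(round(r_pos, 3), round(r_neg, 3))
```

Flipping the sign of one variable flips the sign of r but not its magnitude, which is why r describes both direction and strength. Note that it only captures linear association; a strong curved relationship can still give r near 0.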

What is Causation?

A causes B if changing A directly produces a change in B — not just that they happen to vary together. The direction matters (A → B), the timing matters (A must precede B), and alternative explanations must be ruled out.

Famous Spurious Correlations

These real correlations illustrate why correlation alone proves nothing:

Why Variables Can Be Correlated Without Causation

1. Confounding Variable (Common Cause)

A third variable C causes both A and B. Example: Physical fitness level (C) causes both lower resting heart rate (A) and longer lifespan (B). Heart rate and lifespan are then correlated, but in this simplified picture neither causes the other; the association comes from their shared cause.
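A small simulation can show how a common cause produces correlation by itself. The sketch below assumes a made-up world in which fitness drives both heart rate and lifespan, and heart rate has no effect on lifespan at all. The coefficients, noise levels, and helper names are invented for illustration and are not estimates from real data. It compares the raw correlation with the correlation that remains after the linear effect of fitness is removed from both variables (a partial correlation).

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fit on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 5000

# C: the confounder (standardised fitness score, hypothetical units)
fitness = [rng.gauss(0, 1) for _ in range(n)]
# A and B each depend on C plus independent noise; neither depends on the other.
heart_rate = [-0.8 * c + rng.gauss(0, 0.6) for c in fitness]
lifespan = [0.8 * c + rng.gauss(0, 0.6) for c in fitness]

raw_r = pearson_r(heart_rate, lifespan)
partial_r = pearson_r(residuals(heart_rate, fitness),
                      residuals(lifespan, fitness))
print(f"raw r = {raw_r:.2f}, r after controlling for fitness = {partial_r:.2f}")
```

Under these assumptions the raw correlation is clearly negative, while the correlation after controlling for fitness is close to zero. In real data the confounder is often unmeasured, which is why this check cannot rule out confounding in general.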

2. Reverse Causation

You think A causes B, but actually B causes A. Example: Depression and social isolation are correlated. Does isolation cause depression, or does depression cause isolation? Often the influence runs in both directions.

3. Coincidental Correlation

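Pure chance, especially in small datasets or when you search through many pairs. If you test 100 pairs of truly unrelated variables at α = 0.05, you expect about 5 to come out "significant" by chance alone. With 100 variables there are 4,950 possible pairs, so the expected number of chance hits is closer to 250.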
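The expected number of chance findings depends on how many tests are run, not on how many variables there are. The sketch below counts the pairs among 100 variables, then generates pure noise and counts how many pairs pass a 5% significance test. The sample size of 50, the seed, and the Fisher z approximation used for the test are assumptions chosen to keep the example within the standard library; an exact test would use the t distribution.

```python
import math
import random

n_vars, n_obs = 100, 50
alpha = 0.05
z_crit = 1.96   # two-sided 5% critical value of the standard normal

n_pairs = math.comb(n_vars, 2)   # 100 * 99 / 2 = 4950 distinct pairs
expected = alpha * n_pairs       # about 247.5 false positives if nothing is related

def unit_centre(xs):
    """Centre a column and scale it to unit length, so a dot product gives r."""
    m = sum(xs) / len(xs)
    norm = math.sqrt(sum((x - m) ** 2 for x in xs))
    return [(x - m) / norm for x in xs]

rng = random.Random(1)   # fixed seed so the run is repeatable
# Independent noise columns: every true correlation is zero by construction.
data = [unit_centre([rng.gauss(0, 1) for _ in range(n_obs)])
        for _ in range(n_vars)]

false_pos = 0
for i in range(n_vars):
    for j in range(i + 1, n_vars):
        r = sum(a * b for a, b in zip(data[i], data[j]))
        # Fisher z approximation: atanh(r) * sqrt(n - 3) is roughly N(0, 1) when rho = 0
        z = math.atanh(r) * math.sqrt(n_obs - 3)
        if abs(z) > z_crit:
            false_pos += 1

print(f"{n_pairs} pairs, {expected:.1f} expected by chance, {false_pos} observed")
```

The observed count will vary with the seed but should land near the expected value, even though none of the variables are related. This is why a correction for multiple comparisons, such as Bonferroni (testing each pair at α divided by the number of tests), matters when scanning many pairs.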

How to Establish Causation

The gold standard is a randomised controlled experiment (RCT):

When experiments are impossible (ethics, cost, scale), researchers use:

Practical Implications

Before acting on a correlation:

Use our Pearson Correlation Calculator to measure correlation strength, and always pair it with critical thinking about causation.