What Is a P-Value? Plain English Explanation with Examples

The p-value is one of the most used, and most misunderstood, concepts in statistics. It appears in research papers, clinical trials, and data science projects, yet most people who use p-values daily cannot accurately define what they mean. This guide explains p-values clearly, with no confusing jargon.

The Simple Definition

A p-value is the probability of getting results at least as extreme as your observed data, assuming the null hypothesis is true.

In plain English: "If there really were no effect, how likely is it that I would see data this unusual just by chance?"

A small p-value (close to 0) means your data would be very unlikely if the null hypothesis were true, which counts as evidence against the null hypothesis.

A Simple Example

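Imagine you flip a coin 20 times and get 16 heads. You want to test whether the coin is fair, that is, whether the probability of heads is 0.5. (In this example, p = 0.5 denotes the probability of heads, not the p-value.)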

Null hypothesis H₀: The coin is fair (p = 0.5)

Question: If the coin really is fair, how likely is it to get 16 or more heads out of 20?

Answer: P(X ≥ 16) ≈ 0.006, or about 0.6%

This is your (one-tailed) p-value. Since 0.006 < 0.05, you reject H₀: data this lopsided would be surprising if the coin were fair. With a fair coin, 16 or more heads would turn up in only about 6 of every 1,000 such 20-flip experiments.
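The tail probability in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name binomial_upper_tail is an illustrative choice, not part of any package.

```python
from math import comb

def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 16 or more heads in 20 flips of a fair coin
p_value = binomial_upper_tail(20, 16)
print(f"P(X >= 16) = {p_value:.4f}")  # prints 0.0059
```

The sum adds the probabilities of 16, 17, 18, 19, and 20 heads, which is exactly the "at least as extreme" part of the definition.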

What Does p < 0.05 Mean?

The threshold α = 0.05 is a convention proposed by Ronald Fisher in the 1920s. It means:

You accept a 5% chance of falsely concluding there is an effect when there is none (Type I error)
If you ran 100 experiments where H₀ is true, about 5 would give p < 0.05 by chance alone
p < 0.05 does NOT mean there is a 95% probability your conclusion is correct

P-value range	Common interpretation	Decision (α = 0.05)
p < 0.001	Extremely significant	Reject H₀ strongly
0.001 ≤ p < 0.01	Highly significant	Reject H₀
0.01 ≤ p < 0.05	Statistically significant	Reject H₀
0.05 ≤ p < 0.10	Borderline / marginal	Fail to reject H₀ (with caution)
p ≥ 0.10	Not significant	Fail to reject H₀
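The statement that about 5 in 100 true-null experiments give p < 0.05 can be checked by simulation. The sketch below is an illustration under stated assumptions: each hypothetical experiment draws 30 values from a normal distribution whose mean really is 0 (so H₀ is true), and a two-sided z-test with known standard deviation 1 is applied. The sample size, seed, and function name are arbitrary choices.

```python
import math
import random

def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for a z-test of H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(0)  # fixed seed so the run is repeatable
n_experiments = 10_000
false_positives = 0
for _ in range(n_experiments):
    # H0 is true by construction: the data really have mean 0
    sample = [random.gauss(0.0, 1.0) for _ in range(30)]
    if z_test_p_value(sample) < 0.05:
        false_positives += 1

rate = false_positives / n_experiments
print(f"Share of true-null experiments with p < 0.05: {rate:.3f}")
```

The printed share should land close to 0.05. Every one of those "significant" results is a Type I error, because no effect exists in the simulated data.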

5 Common P-Value Misconceptions

Misconception 1: p = 0.03 means there is a 97% chance the result is real

Wrong. The p-value says nothing about the probability that your conclusion is correct. It only says how unusual your data would be if H₀ were true. The probability that H₀ is true given your data (the posterior probability) requires Bayesian analysis.

Misconception 2: p > 0.05 means the null hypothesis is true

Wrong. Failing to reject H₀ just means you do not have enough evidence against it. Your study might be underpowered (too small a sample) to detect a real but small effect. "Absence of evidence is not evidence of absence."

Misconception 3: A smaller p-value means a bigger effect

Wrong. The p-value depends on both effect size and sample size. A very large study can produce p < 0.0001 for a completely trivial effect. Always report effect sizes (Cohen's d, R², odds ratio) alongside p-values.

Misconception 4: p < 0.05 means the finding is practically important

Wrong. Statistical significance ≠ practical significance. A drug that reduces blood pressure by 0.5 mmHg might be statistically significant in a massive trial but completely clinically irrelevant.

Misconception 5: You can "accept" H₀ if p > 0.05

Wrong. You either reject H₀ or fail to reject it; you never accept it. The test is asymmetric: a significant result counts as evidence against H₀, while a non-significant result is inconclusive.

P-Values in Different Statistical Tests

Most statistical tests produce a p-value in the same way: a test statistic is computed and compared with its distribution under H₀. The test statistic varies but the interpretation is the same:

T-test: t-statistic → p-value from t-distribution
ANOVA: F-statistic → p-value from F-distribution
Chi-square test: χ²-statistic → p-value from chi-square distribution
Regression: F-statistic (overall) or t-statistic (each coefficient)
Correlation: t-statistic → p-value testing if r = 0

The Replication Crisis and P-Values

Many published findings with p < 0.05 have failed to replicate. Reasons include: p-hacking (testing many hypotheses until one is significant), publication bias (only publishing significant results), and inadequate sample sizes.

Modern best practices go beyond the p-value: report confidence intervals, effect sizes, and consider pre-registration of hypotheses to prevent data dredging.

The Formal Definition of P-Value

The p-value is the probability of observing a test statistic at least as extreme as the one calculated from your sample data, assuming the null hypothesis is true. This definition is dense, so let us unpack it carefully. The null hypothesis (H₀) is your default assumption — for instance, that a drug has no effect, or that two groups have equal means.

When you calculate a p-value of 0.03, it means: if the null hypothesis were actually true, there would be only a 3% probability of getting results at least as extreme as yours by random chance alone. This is considered unlikely enough to cast doubt on the null hypothesis.

How P-Values Are Calculated

P-values come from comparing your test statistic to a theoretical probability distribution. Different tests use different distributions. A t-test uses the t-distribution, ANOVA uses the F-distribution, and chi-square tests use the chi-square distribution. Each distribution assigns probabilities to ranges of test statistic values.

For a two-tailed test, the p-value covers both extreme tails of the distribution, that is, values more extreme in either direction. For a one-tailed test, only one tail is considered. Two-tailed tests are more conservative and are the default choice unless you have a strong directional hypothesis, stated before looking at the data.
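To make the one-tailed versus two-tailed distinction concrete, here is a small sketch for a test statistic that follows the standard normal distribution under H₀ (a z-test). The value z = 1.96 is a hypothetical observed statistic, and the function name is illustrative.

```python
import math

def normal_upper_tail(z: float) -> float:
    """Return P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 1.96  # hypothetical observed z-statistic

one_tailed = normal_upper_tail(z)           # only large positive values count as extreme
two_tailed = 2 * normal_upper_tail(abs(z))  # large values in either direction count

print(f"one-tailed p = {one_tailed:.4f}")  # prints 0.0250
print(f"two-tailed p = {two_tailed:.4f}")  # prints 0.0500
```

For a symmetric distribution the two-tailed p-value is twice the one-tailed one, which is why the two-tailed test is the more conservative choice.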

The Significance Threshold: Why 0.05?

The conventional threshold of α = 0.05 (5%) was introduced by Ronald Fisher in the 1920s as a rough guideline, not a sacred law. Fisher himself noted that researchers should use their own judgment. Yet the 0.05 threshold became entrenched in scientific publishing, creating an artificial boundary between "significant" and "not significant" results.

Many fields now use stricter thresholds: particle physics conventionally requires 5 sigma, a one-sided p-value of about 0.0000003, before claiming a discovery, and some areas of medical research use stricter levels such as p < 0.01. The American Statistical Association has emphasised that p-values alone should not determine scientific conclusions.
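The link between a "sigma" level and a p-value can be sketched as follows, assuming a standard normal test statistic and the one-sided convention common in particle physics. The helper sigma_to_p is a hypothetical name introduced for illustration.

```python
import math

def sigma_to_p(n_sigma: float, two_sided: bool = False) -> float:
    """Convert a number of standard deviations into a normal-tail p-value."""
    tail = 0.5 * math.erfc(n_sigma / math.sqrt(2))
    return 2 * tail if two_sided else tail

for n_sigma in (2, 3, 5):
    print(f"{n_sigma} sigma: one-sided p = {sigma_to_p(n_sigma):.2e}")
```

At 5 sigma the one-sided value is roughly 2.9 × 10⁻⁷, several orders of magnitude stricter than 0.05.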

What P-Values Cannot Tell You

P-values are widely misunderstood. Here is what a p-value does NOT tell you:

It does not tell you the probability that the null hypothesis is true
It does not tell you the probability that your results occurred by chance
It does not measure the size or practical importance of an effect
It does not tell you whether your study will replicate
It does not account for whether your study was well designed

A common misinterpretation is "p = 0.04 means there is a 4% chance the null hypothesis is true." This is incorrect. The p-value is calculated assuming H₀ is true; it says nothing about the probability of H₀ itself.

P-Values and Sample Size: A Critical Relationship

One of the most important but underappreciated facts about p-values is their relationship with sample size. With a very large sample, even tiny, practically meaningless differences become statistically significant. With a small sample, even large, important differences may not reach significance.

For example, an online retailer with 1 million users might find that button colour A produces a fraction of a percentage point more clicks than button colour B, with p < 0.001. The difference is statistically significant but may be economically trivial. Conversely, a clinical trial with 20 patients might fail to detect a genuine treatment effect simply because the study was underpowered.
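The dependence on sample size can be illustrated by holding a tiny effect fixed and varying only the number of observations. This is a sketch under simplifying assumptions: two groups of equal size, a known standard deviation of 1, an observed difference in means of exactly 0.01 standard deviations, and a two-sided z-test. All numbers are hypothetical.

```python
import math

def two_sample_z_p(diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided p-value for a difference in means with known, equal sd."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = diff / standard_error
    return math.erfc(abs(z) / math.sqrt(2))

tiny_effect = 0.01  # hypothetical difference in means, in sd units
p_by_n = {n: two_sample_z_p(tiny_effect, 1.0, n) for n in (100, 10_000, 1_000_000)}

for n, p in p_by_n.items():
    print(f"n per group = {n:>9,}: p = {p:.3g}")
```

The effect is identical in all three rows; only the sample size changes. The p-value moves from far above 0.05 to far below it, which is why a p-value on its own says little about how large an effect is.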

Multiple Testing Problem

When you conduct multiple hypothesis tests simultaneously, the chance of getting at least one false positive increases dramatically. If you test 20 independent hypotheses each at α = 0.05, you would expect 1 false positive on average even if all null hypotheses are true, and the chance of at least one false positive is about 64% (1 − 0.95²⁰). This is why large-scale studies (genetic association studies, brain imaging research) apply corrections such as the Bonferroni correction (divide α by the number of tests) or false discovery rate (FDR) control.
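The arithmetic behind the 20-test scenario, and the effect of a Bonferroni correction, can be written out directly. This sketch assumes the tests are independent and that every null hypothesis is true; the variable names are illustrative.

```python
alpha = 0.05
n_tests = 20

# Expected number of false positives when every H0 is true
expected_false_positives = n_tests * alpha

# Probability of at least one false positive across independent tests
familywise_error = 1 - (1 - alpha) ** n_tests

# Bonferroni: test each hypothesis at alpha / n_tests instead
bonferroni_alpha = alpha / n_tests
familywise_error_bonferroni = 1 - (1 - bonferroni_alpha) ** n_tests

print(f"Expected false positives:        {expected_false_positives:.2f}")
print(f"P(at least one), uncorrected:    {familywise_error:.3f}")
print(f"Per-test threshold (Bonferroni): {bonferroni_alpha:.4f}")
print(f"P(at least one), Bonferroni:     {familywise_error_bonferroni:.3f}")
```

Without correction the chance of at least one false positive is about 0.64. With the per-test threshold lowered to 0.0025 it falls back to just under 0.05.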

Alternatives and Complements to P-Values

The scientific community increasingly recommends reporting confidence intervals alongside p-values. A 95% confidence interval shows the range of plausible values for the true parameter, conveying both statistical significance and practical meaningfulness. An effect size measure (Cohen's d, eta-squared, odds ratio) shows the magnitude of the effect independently of sample size.
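As an illustration of an effect size measure, the sketch below computes Cohen's d with a pooled standard deviation for two independent groups. The measurements are made-up numbers chosen only to show the calculation.

```python
import math
import statistics

def cohens_d(group_a, group_b) -> float:
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_variance)

# Hypothetical measurements for a treatment and a control group
treatment = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4]
control = [4.7, 4.5, 5.0, 4.6, 4.9, 4.4]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units, collecting more data makes the estimate more precise but does not systematically inflate it, unlike a p-value.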

Bayesian methods offer an alternative framework using Bayes factors, which directly compare the evidence for and against hypotheses. Unlike p-values, Bayes factors can provide evidence for the null hypothesis, which frequentist p-values cannot.
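A Bayes factor for coin-flip data can be sketched in a few lines. The sketch compares H₀: probability of heads = 0.5 with an alternative H₁ that puts a uniform prior on the probability of heads. The uniform prior is an assumption made here for simplicity, and a different prior would give a different Bayes factor. Under a uniform prior the marginal probability of any particular number of heads in n flips is 1/(n + 1).

```python
from math import comb

def bayes_factor_10(heads: int, flips: int) -> float:
    """BF10: evidence for H1 (uniform prior on P(heads)) relative to H0 (P(heads) = 0.5)."""
    likelihood_h0 = comb(flips, heads) * 0.5**flips
    marginal_h1 = 1 / (flips + 1)  # binomial likelihood averaged over a uniform prior
    return marginal_h1 / likelihood_h0

bf_16 = bayes_factor_10(16, 20)  # lopsided result
bf_10 = bayes_factor_10(10, 20)  # perfectly balanced result

print(f"16 heads in 20 flips: BF10 = {bf_16:.2f}")
print(f"10 heads in 20 flips: BF10 = {bf_10:.2f}, BF01 = {1 / bf_10:.2f}")
```

With 16 heads the data favour H₁ by a factor of about 10. With 10 heads the factor is below 1, meaning the data favour the fair-coin hypothesis, which is the kind of support for H₀ that a p-value cannot express.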

Complete Worked Example: Calculating a P-Value by Hand

A coin is suspected to be biased. You flip it 20 times and get 15 heads. What is the p-value for testing H₀: p = 0.5 (fair coin) vs H₁: p ≠ 0.5 (two-tailed)?

Under H₀, the number of heads X ~ Binomial(20, 0.5). The p-value is the probability of a result as extreme as 15 heads, or more extreme, in either direction.

Upper tail: P(X ≥ 15) = P(X=15) + P(X=16) + ... + P(X=20)
= [C(20,15) + C(20,16) + ... + C(20,20)] × 0.5²⁰
= (15504 + 4845 + 1140 + 190 + 20 + 1)/1048576
= 21700/1048576 ≈ 0.0207

Lower tail: by symmetry, P(X ≤ 5) = P(X ≥ 15).

p-value = 2 × P(X ≥ 15) = 2 × 21700/1048576 ≈ 0.0414.
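The hand calculation can be reproduced with the standard library as a check. This is a minimal sketch; the variable names are illustrative.

```python
from math import comb

flips, observed_heads = 20, 15

# Number of equally likely sequences with 15, 16, ..., 20 heads
upper_counts = [comb(flips, k) for k in range(observed_heads, flips + 1)]
upper_tail = sum(upper_counts) / 2**flips

# The Binomial(20, 0.5) distribution is symmetric, so the lower tail is equal
p_value = 2 * upper_tail

print(upper_counts)                          # [15504, 4845, 1140, 190, 20, 1]
print(f"two-tailed p-value = {p_value:.4f}")  # prints 0.0414
```

Doubling the upper tail is valid here only because the null distribution is symmetric; for a null value other than 0.5 the two tails would have to be computed separately.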

Since p = 0.041 < 0.05, reject H₀ at the 5% significance level. There is statistically significant evidence the coin is biased. However, note the 95% CI for p: [0.509, 0.908] — the bias could be modest or large, and the sample is small. The p-value says "significant"; the CI clarifies "we are uncertain how biased."

P-Value Interpretation: Five Common Misconceptions Corrected

Statistical education research consistently finds that most people — including researchers — misinterpret p-values. Here are the five most dangerous misconceptions with corrections:

Misconception 1: "p = 0.04 means there is a 4% probability the null hypothesis is true." Correction: The p-value is calculated assuming H₀ is true; it cannot give the probability that H₀ is true (that requires Bayesian analysis with a prior).

Misconception 2: "p = 0.06 means the result is almost significant." Correction: A test with a pre-specified threshold gives a binary decision: the threshold is crossed or it is not. "Trending toward significance" is not a statistical concept.

Misconception 3: "A small p-value means the effect is large." Correction: A p-value reflects both effect size and sample size. A tiny effect can produce p < 0.001 with a large sample. Always report effect sizes.

Misconception 4: "If p > 0.05, the null hypothesis is true." Correction: Failing to reject H₀ ≠ evidence for H₀. The study may simply be underpowered.

Misconception 5: "p = 0.049 and p = 0.051 are meaningfully different." Correction: The threshold is arbitrary. Both values provide nearly identical (weak) evidence against H₀; the decision flip is an artefact of the threshold convention.

