Type I & Type II Errors in Statistics — Examples & How to Avoid

Every time you run a hypothesis test, there is a possibility of making one of two types of errors. Understanding Type I and Type II errors is critical for designing studies, interpreting results, and making good decisions with data.

The Decision Matrix

	H₀ is Actually TRUE	H₀ is Actually FALSE
Reject H₀	❌ Type I Error (α) — False Positive	✅ Correct Decision — True Positive (Power = 1−β)
Fail to Reject H₀	✅ Correct Decision — True Negative	❌ Type II Error (β) — False Negative

Type I Error (α) — The False Positive

A Type I error occurs when you reject H₀ even though it is actually true. You conclude there is an effect when there really is none. When H₀ is true, the probability of making a Type I error equals your significance level α.

Example: A drug company tests a new medicine. H₀: the drug has no effect. In reality, the drug does nothing. But due to random chance, the clinical trial produces p = 0.03 < 0.05. The company concludes the drug works — this is a Type I error. They have a false positive.

Setting α = 0.05 means you accept a 5% chance of making this mistake whenever H₀ is in fact true. In fields where false positives are catastrophic (nuclear safety, drug approval), a stricter α such as 0.01 or 0.001 is often used.
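The link between α and the false positive rate can be checked by simulation. The sketch below is only an illustration under stated assumptions: it uses a two-sided z-test with a known standard deviation of 1, a sample size of 30, and data generated with H₀ true. The function name and all parameter values are hypothetical choices, not part of the example above.

```python
import random
from statistics import NormalDist


def false_positive_rate(alpha=0.05, n=30, trials=20_000, seed=1):
    """Estimate how often a two-sided z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Data generated under H0: true mean 0, known standard deviation 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # (mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # every rejection here is a Type I error
    return rejections / trials


rate = false_positive_rate()
print(f"Estimated Type I error rate at alpha = 0.05: {rate:.3f}")
```

Because every simulated dataset has no real effect, each rejection is a false positive, and the estimated rate should land close to the chosen α.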

Type II Error (β) — The False Negative

A Type II error occurs when you fail to reject H₀ even though it is false. You miss a real effect — you conclude there is nothing there when there actually is. The probability of a Type II error is β.

Example: A new teaching method genuinely improves test scores. H₀: no improvement. But your study only had 15 students — too small to detect the effect. You get p = 0.12 and fail to reject H₀. You miss the real improvement. This is a Type II error.
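A small simulation shows why a study this small misses real effects so often. This is a sketch under assumptions that the example does not state: the true improvement is taken to be 0.5 standard deviations, scores are normal with a known standard deviation of 1, and a two-sided z-test is used. The function name is hypothetical.

```python
import random
from statistics import NormalDist


def miss_rate(effect=0.5, n=15, alpha=0.05, trials=20_000, seed=2):
    """Estimate beta: how often a two-sided z-test fails to reject a false H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    misses = 0
    for _ in range(trials):
        # Data generated with a real effect: true mean = effect, sd = 1
        sample = [rng.gauss(effect, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) <= z_crit:
            misses += 1  # failing to reject here is a Type II error
    return misses / trials


beta_small = miss_rate(n=15)
beta_large = miss_rate(n=60)
print(f"Estimated beta with n = 15: {beta_small:.2f}")
print(f"Estimated beta with n = 60: {beta_large:.2f}")
```

Under these assumed values, the 15-student study misses the effect roughly half the time, while quadrupling the sample makes a miss rare.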

Statistical Power = 1 − β

Power is the probability of correctly detecting a real effect: Power = 1 − β. If β = 0.20, power = 0.80 (an 80% chance of detecting the effect if it exists). Power ≥ 0.80 is the conventional target in research.

Power increases with:

Larger sample size (usually the factor most directly under the researcher's control)
Larger true effect size
Higher significance level α (but this increases Type I error)
Lower variability in the data
One-tailed test instead of two-tailed (if justified)
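Each factor in the list above can be checked numerically. The sketch below uses the textbook normal-approximation power formula for a one-sample z-test with known standard deviation; the function names and the baseline values (effect 0.5, sigma 1, n = 20) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(effect, sigma, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = effect * n ** 0.5 / sigma  # standardised distance from H0
    # Probability of landing in either rejection region when the effect is real
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)


def power_one_sided(effect, sigma, n, alpha=0.05):
    """Approximate power of a one-sided test in the direction of the effect."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(effect * n ** 0.5 / sigma - z_crit)


baseline = power_two_sided(effect=0.5, sigma=1.0, n=20)
scenarios = {
    "baseline": baseline,
    "larger sample (n = 40)": power_two_sided(0.5, 1.0, 40),
    "larger effect (0.8)": power_two_sided(0.8, 1.0, 20),
    "higher alpha (0.10)": power_two_sided(0.5, 1.0, 20, alpha=0.10),
    "lower variability (sigma = 0.8)": power_two_sided(0.5, 0.8, 20),
    "one-tailed test": power_one_sided(0.5, 1.0, 20),
}
for label, value in scenarios.items():
    print(f"{label:35s} power = {value:.3f}")
```

Every change in the list raises power above the baseline, but only the higher α does so by accepting more Type I errors.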

The Trade-off Between Type I and Type II Errors

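You cannot minimise both simultaneously for a fixed n and a fixed study design. Reducing α (a stricter test) reduces Type I errors but increases Type II errors. To reduce both at once you need more information, which in practice usually means a larger sample size (or, where possible, less noisy measurements).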

The relative severity of each error type should guide your choice of α:

Type I error is worse: Use smaller α (0.01). Example: approving a harmful drug is worse than rejecting a beneficial one.
Type II error is worse: Use larger α (0.10). Example: failing to detect a disease outbreak is worse than a false alarm.

The Decision Matrix

Every hypothesis test produces one of four outcomes, summarised in a 2×2 decision matrix. Two are correct decisions: correctly failing to reject a true null hypothesis (true negative), and correctly rejecting a false null hypothesis (true positive). Two are errors: Type I error (false positive) — rejecting a true null hypothesis, and Type II error (false negative) — failing to reject a false null hypothesis.
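The four outcomes can be written as a small lookup. This is a minimal sketch with a hypothetical function name; in a real study the truth of H₀ is unknown, so the function is useful only for simulations or for reasoning about scenarios.

```python
def classify_outcome(h0_is_true: bool, rejected_h0: bool) -> str:
    """Map (truth about H0, test decision) to one cell of the 2x2 matrix."""
    if h0_is_true and rejected_h0:
        return "Type I error (false positive)"
    if h0_is_true and not rejected_h0:
        return "Correct decision (true negative)"
    if not h0_is_true and rejected_h0:
        return "Correct decision (true positive)"
    return "Type II error (false negative)"


# Print all four cells of the decision matrix
for h0_is_true in (True, False):
    for rejected_h0 in (True, False):
        outcome = classify_outcome(h0_is_true, rejected_h0)
        print(f"H0 true: {h0_is_true!s:5}  rejected: {rejected_h0!s:5}  ->  {outcome}")
```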

Type I Error: False Positive

A Type I error occurs when you conclude there is an effect when in reality there is none. The probability of a Type I error is exactly α (significance level) when H₀ is true. By setting α = 0.05, you accept a 5% chance of incorrectly rejecting the null hypothesis. This is the researcher's choice before conducting the test — it represents the tolerable false positive rate.

Real consequences: approving an ineffective drug (medical trials), implementing a marketing campaign that does not actually increase sales (business), publishing a false scientific finding (research). False positives waste resources on ineffective interventions.

Type II Error: False Negative

A Type II error occurs when you fail to detect a real effect. The probability of a Type II error is β, and statistical power = 1 − β is the probability of correctly detecting a real effect. Power depends on: sample size (larger n → higher power), effect size (larger effects are easier to detect), significance level (higher α → higher power, but more Type I errors), and variability (lower variance → higher power).

Real consequences: failing to detect an effective treatment (medical trials), missing a real market opportunity (business), failing to replicate a genuine scientific finding. False negatives mean real effects go undetected.

The Error Tradeoff

Type I and Type II errors trade off: for a fixed sample size and design, decreasing one increases the other. Reducing both simultaneously requires more information, which usually means a larger sample size. The appropriate balance depends on context. In drug safety testing, Type I errors (approving harmful drugs) may be more costly than Type II errors (missing some beneficial drugs), justifying a strict α = 0.01. In exploratory research, Type II errors may be more costly, justifying a more liberal α = 0.10.
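The trade-off can be made concrete by tabulating β against α. The sketch below assumes a two-sided one-sample z-test with a hypothetical effect of 0.5 standard deviations; the function name and the sample sizes of 30 and 60 are illustrative choices only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_for_alpha(alpha, effect=0.5, sigma=1.0, n=30):
    """Type II error rate of a two-sided z-test at a given significance level."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = effect * n ** 0.5 / sigma
    power = STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)
    return 1 - power


alphas = [0.10, 0.05, 0.01, 0.001]
betas_n30 = [beta_for_alpha(a, n=30) for a in alphas]
betas_n60 = [beta_for_alpha(a, n=60) for a in alphas]

print("alpha    beta (n=30)   beta (n=60)")
for a, b30, b60 in zip(alphas, betas_n30, betas_n60):
    print(f"{a:<8} {b30:<13.3f} {b60:.3f}")
```

Reading down a column, a stricter α always costs a larger β. Reading across a row, the larger sample lowers β at the same α, which is how both error rates can be reduced together.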

Power Analysis Before the Study

Power analysis is a critical pre-study calculation. Researchers specify the desired power (typically 0.80 or 0.90), the significance level (typically 0.05), and the minimum clinically meaningful effect size. The calculation then determines the required sample size. Underpowered studies, which are regrettably common, frequently produce false negatives, and their findings often fail to replicate. The replication crisis in psychology and medicine is partly attributable to the widespread use of underpowered studies.
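A minimal sketch of such a calculation is shown below for comparing two group means with equal group sizes, using the standard normal-approximation formula n per group = 2 × ((z for α/2 + z for power) × σ / δ)². The function name and the standard deviation of 12 are assumptions for illustration; a t-based calculation gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample z-test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # controls Type I error
    z_power = std_normal.inv_cdf(power)          # controls Type II error
    return ceil(2 * ((z_alpha + z_power) * sigma / effect) ** 2)


# Hypothetical inputs: smallest meaningful difference 5 units, sd 12 units
n_80 = n_per_group(effect=5, sigma=12, power=0.80)
n_90 = n_per_group(effect=5, sigma=12, power=0.90)
print(f"Per-group n for 80% power: {n_80}")
print(f"Per-group n for 90% power: {n_90}")
```

Note how sensitive the result is to the inputs: halving the effect size to be detected quadruples the required sample.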

Multiple Testing and Error Rate Control

When conducting many tests simultaneously, the family-wise error rate (FWER), the probability of at least one Type I error, increases rapidly. With 20 independent tests at α = 0.05, all with H₀ true, the FWER is 1 − 0.95²⁰ ≈ 64%. The Bonferroni correction tests each hypothesis at α divided by the number of tests, which keeps the FWER at or below α. The Benjamini-Hochberg procedure instead controls the false discovery rate (FDR), the expected proportion of false positives among rejected hypotheses, and is less conservative than Bonferroni.
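The FWER figure and the two corrections can be sketched in a few lines. The implementation below follows the usual textbook definitions of Bonferroni and the Benjamini-Hochberg step-up rule; the function names and the list of p-values are hypothetical.

```python
def family_wise_error_rate(alpha, m):
    """P(at least one Type I error) across m independent tests with H0 true."""
    return 1 - (1 - alpha) ** m


def bonferroni_reject(p_values, alpha=0.05):
    """Reject only p-values at or below alpha / m."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg_reject(p_values, q=0.05):
    """Step-up rule: find the largest rank k with p_(k) <= (k / m) * q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:  # reject the k smallest p-values
        rejected[idx] = True
    return rejected


fwer_20 = family_wise_error_rate(0.05, 20)
p_values = [0.001, 0.008, 0.012, 0.030, 0.045, 0.200, 0.600]  # hypothetical
bonf = bonferroni_reject(p_values)
bh = benjamini_hochberg_reject(p_values)
print(f"FWER for 20 tests at alpha = 0.05: {fwer_20:.3f}")
print(f"Bonferroni rejections: {sum(bonf)}")
print(f"Benjamini-Hochberg rejections: {sum(bh)}")
```

On this made-up list, Benjamini-Hochberg rejects more hypotheses than Bonferroni, which is the sense in which it is less conservative.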

Detailed Worked Example: Clinical Drug Trial

A pharmaceutical company is testing a new blood pressure medication. Their trial has 200 participants randomised to drug vs placebo. They set α = 0.05 and aim for 80% power to detect a 5 mmHg reduction (the minimum clinically meaningful effect).

Scenario A — Type I Error: The drug actually has no effect on blood pressure. But by chance, the treatment group happened to have lower readings (random sampling variation). The t-test gives p = 0.03. The company rejects H₀ and concludes the drug works. This is a Type I error — a false positive. Consequence: an ineffective drug gets marketed, patients take it believing it helps, and money is wasted.

Scenario B (Type II Error): The drug genuinely reduces blood pressure by 5 mmHg. But this particular sample happened to be unusually variable, and the observed difference was smaller than the true one. The t-test gives p = 0.12. The company fails to reject H₀ and concludes there is no evidence the drug works. This is a Type II error, a false negative. Consequence: an effective drug never reaches patients who need it.

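The trial was designed with 80% power, meaning that if the true effect is 5 mmHg, there is a 20% probability of a Type II error. Holding the effect size and variability fixed, a normal approximation suggests that increasing n to 300 would raise power to roughly 93%, cutting the Type II error rate to about 7%; around 270 participants would be enough for 90% power.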
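How power scales with sample size in a design like this can be checked with a rough sketch. It assumes a two-sided test, a normal approximation, equal allocation between arms, and an unchanged effect size and variance, so exact figures from a t-based calculation will differ slightly. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_after_resizing(power_now, n_now, n_new, alpha=0.05):
    """Approximate power when only the total sample size changes."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Standardised effect implied by the current design
    ncp_now = z_crit + STD_NORMAL.inv_cdf(power_now)
    # The standardised effect grows with the square root of n
    ncp_new = ncp_now * (n_new / n_now) ** 0.5
    return STD_NORMAL.cdf(ncp_new - z_crit)


def n_for_target_power(power_now, n_now, power_target, alpha=0.05):
    """Approximate total n needed to reach a target power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    ratio = (z_crit + STD_NORMAL.inv_cdf(power_target)) / (
        z_crit + STD_NORMAL.inv_cdf(power_now)
    )
    return ceil(n_now * ratio ** 2)


power_300 = power_after_resizing(power_now=0.80, n_now=200, n_new=300)
n_for_90 = n_for_target_power(power_now=0.80, n_now=200, power_target=0.90)
print(f"Approximate power with n = 300: {power_300:.3f}")
print(f"Approximate total n for 90% power: {n_for_90}")
```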

Real Example: COVID-19 Rapid Tests

Rapid antigen tests for COVID-19 illustrate the practical consequences of both error types, treating "not infected" as the null hypothesis. A test with 85% sensitivity misses 15% of truly positive cases (Type II error rate = 15%). People who receive these false negatives go home believing they are not infectious and may spread the disease. A test with 99% specificity returns a positive result for 1% of truly negative cases (Type I error rate = 1%). People who receive these false positives self-isolate unnecessarily.

During a high-prevalence outbreak, the priority is minimising false negatives (maximising sensitivity). During low-prevalence screening of healthcare workers, false positives become more costly as many healthy workers would be incorrectly excluded. The optimal balance between Type I and Type II errors changes with context — a fundamental insight that applies across medicine, manufacturing quality control, cybersecurity (intrusion detection), and judicial systems.
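The shift in which error dominates can be seen by counting expected errors at different prevalence levels. The sketch below uses the 85% sensitivity and 99% specificity quoted above; the prevalence values (20% and 1%), the 10,000 people tested, and the function name are hypothetical.

```python
def expected_errors(prevalence, sensitivity=0.85, specificity=0.99, tested=10_000):
    """Expected false negatives and false positives among people tested."""
    infected = tested * prevalence
    healthy = tested - infected
    false_negatives = infected * (1 - sensitivity)  # Type II errors
    false_positives = healthy * (1 - specificity)   # Type I errors
    return false_negatives, false_positives


results = {}
for prevalence in (0.20, 0.01):  # hypothetical outbreak vs routine screening
    results[prevalence] = expected_errors(prevalence)
    fn, fp = results[prevalence]
    print(
        f"prevalence {prevalence:.0%}: "
        f"about {fn:.0f} false negatives, about {fp:.0f} false positives"
    )
```

With these assumed numbers, false negatives far outnumber false positives during the outbreak, while in low-prevalence screening false positives become the larger group.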
