Compute p-values for Z, T, Chi-square, or F test statistics.
Free online p-value calculator for Z-test, T-test, Chi-square, and F-test. One-tailed and two-tailed tests with significance classification, interactive distribution curve, and Python scipy export.
A p-value is the probability of observing a test statistic at least as extreme as the one computed from sample data, assuming the null hypothesis (H₀) is true. It is a fundamental concept in hypothesis testing and helps researchers decide whether to reject H₀.
The p-value quantifies how likely the observed data (or more extreme data) would be if H₀ were true. It ranges from 0 to 1.
A smaller p-value provides stronger evidence against the null hypothesis. It does NOT measure the probability that H₀ is true.
Compare the p-value to the significance level α. If p ≤ α, reject H₀; otherwise, fail to reject H₀. Common thresholds: 0.05, 0.01, 0.10.
The significance of a p-value depends on the chosen α level. Below are commonly used thresholds and their interpretations.
Highly Significant: Very strong evidence against the null hypothesis. Often denoted with *** in research papers.
Significant: Sufficient evidence to reject H₀ at the standard 5% level. The most common threshold in science.
Marginal: Weak evidence against H₀. Sometimes called “trending toward significance.” Interpret with caution.
Not Significant: Insufficient evidence to reject H₀. This does not prove H₀ is true, only that the data are consistent with it.
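The four labels above can be sketched as a small classifier. This is only an illustration: the function name `classify_p_value` and the cutoffs 0.01, 0.05, and 0.10 are assumptions (the text ties only "Significant" to the 5% level), so the cutoffs are exposed as a parameter rather than fixed.

```python
def classify_p_value(p, cutoffs=(0.01, 0.05, 0.10)):
    """Map a p-value to one of the four labels used above.

    The default cutoffs are assumed for illustration; pass your own
    tuple (highly, significant, marginal) to match another convention.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    highly, significant, marginal = cutoffs
    # Use p <= cutoff, matching the "if p <= alpha, reject H0" rule.
    if p <= highly:
        return "Highly Significant"
    if p <= significant:
        return "Significant"
    if p <= marginal:
        return "Marginal"
    return "Not Significant"


if __name__ == "__main__":
    for p in (0.003, 0.03, 0.07, 0.5):
        print(p, classify_p_value(p))
```

The label describes only how p compares to the chosen cutoffs; it says nothing about the size or practical importance of an effect.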
| Distribution | Parameters | Common Use |
|---|---|---|
| Z (Normal) | None (standard) | Large samples, known σ, proportions |
| t (Student’s) | df = n − 1 | Small samples, unknown σ |
| χ² (Chi-square) | df = k − 1 | Categorical data, goodness-of-fit |
| F (Fisher) | df₁, df₂ | ANOVA, variance comparison |
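As a rough sketch of how the four distributions in the table map onto scipy, the upper-tail p-value for each can be read from the survival function `sf` of `scipy.stats.norm`, `t`, `chi2`, and `f`. The helper name `upper_tail_p` and its argument names are hypothetical, and this is not necessarily the code the calculator exports.

```python
from scipy import stats


def upper_tail_p(dist, statistic, df1=None, df2=None):
    """Return P(X >= statistic) under the named reference distribution.

    dist is one of "z", "t", "chi2", "f" (labels assumed for this sketch).
    df1 is the degrees of freedom for t and chi-square; F needs df1 and df2.
    """
    if dist == "z":
        return stats.norm.sf(statistic)
    if dist == "t":
        return stats.t.sf(statistic, df1)
    if dist == "chi2":
        return stats.chi2.sf(statistic, df1)
    if dist == "f":
        return stats.f.sf(statistic, df1, df2)
    raise ValueError(f"unknown distribution: {dist!r}")


if __name__ == "__main__":
    print(upper_tail_p("z", 1.96))
    print(upper_tail_p("t", 2.228, df1=10))
    print(upper_tail_p("chi2", 3.841, df1=1))
    print(upper_tail_p("f", 4.965, df1=1, df2=10))
```

Chi-square and F tests are usually evaluated in the upper tail only, so the value returned here is typically the p-value as is; for Z and t statistics the tail still has to be chosen.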
Tests for an effect in a specific direction (e.g., the mean is greater than, or less than, a given value). All of α is placed in one tail, which gives more power to detect an effect in that direction.
Tests for any difference in either direction. The α is split between both tails (α/2 each). This is more conservative, but it is the appropriate choice when the direction is not pre-specified.
Important: The choice between one-tailed and two-tailed tests must be made before looking at the data. Choosing after seeing results inflates the false positive rate.
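The difference between the two kinds of test can be illustrated with a Z statistic, using only the standard library. This is a minimal sketch: the function name `z_p_value` and the tail labels `"left"`, `"right"`, and `"two"` are assumptions made for the example.

```python
import math


def z_p_value(z, tail="two"):
    """p-value for a Z statistic under the standard normal distribution.

    tail: "right" for P(Z >= z), "left" for P(Z <= z),
          "two" for P(|Z| >= |z|).
    """
    root2 = math.sqrt(2.0)
    if tail == "right":
        return 0.5 * math.erfc(z / root2)
    if tail == "left":
        return 0.5 * math.erfc(-z / root2)
    if tail == "two":
        # Both tails together: twice the area beyond |z|.
        return math.erfc(abs(z) / root2)
    raise ValueError(f"unknown tail: {tail!r}")


if __name__ == "__main__":
    z = 1.8
    print("right-tailed:", z_p_value(z, "right"))
    print("two-tailed:  ", z_p_value(z, "two"))
```

For z = 1.8 the right-tailed p-value is about 0.036 and the two-tailed p-value is about 0.072, so the same statistic is significant at α = 0.05 in one case and not in the other. This is why the tail has to be fixed before the data are seen.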
Incorrect. The p-value is the probability of data at least as extreme as those observed, given that H₀ is true. It is P(data | H₀), not P(H₀ | data). Bayesian methods are needed for the latter.
Incorrect. A p-value of 0.03 does not mean there is a 97% chance the alternative is true. The p-value tells you about the data under H₀, not the probability of any hypothesis.
Incorrect. Failure to reject H₀ does not prove H₀. It may simply mean the sample size was too small to detect the effect. “Absence of evidence is not evidence of absence.”
Correct: “Assuming H₀ is true, the probability of observing a test statistic as extreme as or more extreme than the one observed is p.” Always interpret in context.
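The point about sample size can be made concrete with a one-sample Z-test. The sketch below holds the observed difference fixed and changes only n; the function name `two_tailed_z_p` and the numbers used (observed mean 0.2, null mean 0, σ = 1) are hypothetical values chosen for illustration.

```python
import math


def two_tailed_z_p(sample_mean, null_mean, sigma, n):
    """Two-tailed p-value for a one-sample Z-test with known sigma."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # P(|Z| >= |z|) under the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Same observed difference, different sample sizes (made-up values).
    for n in (25, 400):
        print(n, two_tailed_z_p(0.2, 0.0, 1.0, n))
```

With n = 25 the statistic is z = 1 and the p-value is roughly 0.32; with n = 400 the same difference gives z = 4 and a p-value below 0.0001. A non-significant result at the smaller n therefore says little about whether an effect exists.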