Chi-Square Test Calculator
Enter observed and expected frequencies for up to 6 categories to compute the chi-square statistic and an approximate p-value. The result indicates whether the observed distribution differs significantly from the expected distribution.
The chi-square (χ²) goodness-of-fit test compares observed frequencies in categorical data to what you'd expect under a hypothesized distribution. It answers questions like: Are the dice fair? Do customers prefer products equally? Is genetic inheritance following predicted ratios? Are survey respondents distributed as expected across age groups? The test sums the squared differences between observed and expected counts, each divided by its expected count.
This calculator returns the chi-square statistic, degrees of freedom, and an approximate p-value. Larger χ² values indicate larger discrepancies between observed and expected counts, and therefore stronger evidence against the hypothesized distribution. The test is widely used in quality control, market research, genetics, and contingency table analysis.
Chi-square assumes: independent observations, categorical data (counts, not proportions), expected frequencies of at least 5 in each cell (for accuracy), and random sampling. When expected frequencies are very small, alternatives include Fisher's exact test or pooling categories.
The test has two main forms: goodness-of-fit (this calculator: comparing one set of observed counts to expected) and contingency table (testing if two categorical variables are independent). Both use the same fundamental χ² distribution but with different setups and degrees of freedom.
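For the contingency-table form, the difference in setup can be written out as follows. This is the standard textbook formulation rather than anything specific to this calculator; the symbols r, c, R_i, C_j, and N (number of rows, number of columns, row totals, column totals, grand total) are notation introduced here.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
df = (r - 1)(c - 1)
```

By contrast, a goodness-of-fit test on k categories with a fully specified expected distribution has df = k − 1.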
Results
Chi-Square Statistic: 5.0000
Degrees of Freedom: 2
P-Value: 0.082085
Decision: Not significant (p >= 0.05)
Categories: 3
Formula
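The statistic used throughout this page, written in standard notation (the symbols O_i, E_i, and k are labels chosen here for the observed count, expected count, and number of categories):

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
df = k - 1,
\qquad
p = P\left(\chi^2_{df} \ge \chi^2\right)
```

For example, with k = 2 + 1 = 3 categories, df = 2, and the upper-tail probability has the closed form p = e^{-\chi^2/2}.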
How to use this calculator
- Enter the observed count for each category.
- Enter the expected count for each category.
- The calculator returns the chi-square statistic, degrees of freedom, and p-value.
- Compare the p-value to your significance level (α = 0.05 is typical).
- Check that the expected frequency is at least 5 in each cell.
- For 2×2 tables, consider the Yates correction.
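The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's actual implementation; the function names `chi_square_sf` and `chi_square_gof` are hypothetical. The p-value relies on a closed-form recurrence for the chi-square upper tail, which is valid for integer degrees of freedom.

```python
import math


def chi_square_sf(stat: float, df: int) -> float:
    """Upper-tail probability P(X >= stat) for chi-square with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0
    x = stat / 2.0
    # Base cases of the regularized upper incomplete gamma function Q(a, x).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)               # Q(1, x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))    # Q(1/2, x)
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


def chi_square_gof(observed, expected):
    """Return (statistic, df, p_value) for a goodness-of-fit test on counts."""
    if len(observed) != len(expected) or len(observed) < 2:
        raise ValueError("need matching lists with at least two categories")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi_square_sf(stat, df)


# Dice example from this page: 60 rolls, 10 expected per face.
stat, df, p = chi_square_gof([8, 12, 9, 11, 10, 10], [10] * 6)
print(f"chi-square = {stat:.4f}, df = {df}, p = {p:.4f}")
# chi-square = 1.0000, df = 5, p = 0.9626
```

The sketch does not check the "expected frequency of at least 5" guideline; a real tool would likely warn when that condition fails.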
Worked examples
Dice fairness test
**Scenario:** Roll a die 60 times. Observed counts for faces 1 to 6: 8, 12, 9, 11, 10, 10. Expected: 10 each. **Calculation:** χ² = 0.4 + 0.4 + 0.1 + 0.1 + 0 + 0 = 1.0. df = 5. p ≈ 0.96. **Result:** Not significant. The counts are consistent with a fair die; the deviations from 10 per face are well within random variation.
Customer preference
**Scenario:** Survey 200 customers. Expected (under the null hypothesis of equal preference): 50 each for 4 products. Observed: 75, 60, 35, 30. **Calculation:** χ² = (25)²/50 + (10)²/50 + (−15)²/50 + (−20)²/50 = 12.5 + 2 + 4.5 + 8 = 27. df = 3. p < 0.001. **Result:** Highly significant. Customer preferences are not equal across products: products 1 and 2 are favored over products 3 and 4. Consider this for marketing strategy.
Genetic inheritance
**Scenario:** Dihybrid cross of heterozygous plants. Mendel's 9:3:3:1 ratio is expected. Sample of 320 offspring. Observed: 180, 60, 65, 15. **Calculation:** Expected: 180, 60, 60, 20. χ² = (0)²/180 + (0)²/60 + (5)²/60 + (−5)²/20 = 0 + 0 + 0.42 + 1.25 = 1.67. df = 3. p ≈ 0.64. **Result:** Consistent with the 9:3:3:1 ratio (Mendel's prediction). There is no evidence against the genetic theory; the lab results support the theoretical expectation.
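When the null hypothesis is stated as a ratio, as in the genetics example, the expected counts come from scaling the ratio to the sample size. A small sketch, with `expected_from_ratio` and `chi_square_stat` as illustrative names rather than anything the calculator exposes:

```python
def expected_from_ratio(ratio, n):
    """Scale a hypothesized ratio (e.g. 9:3:3:1) to expected counts for n observations."""
    total = sum(ratio)
    return [n * part / total for part in ratio]


def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [180, 60, 65, 15]
expected = expected_from_ratio([9, 3, 3, 1], n=sum(observed))
print(expected)                                        # [180.0, 60.0, 60.0, 20.0]
print(round(chi_square_stat(observed, expected), 2))   # 1.67
```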
When to use this calculator
**Use chi-square goodness-of-fit for:**
- **Categorical data**: counts in categories.
- **Testing hypothesized distributions**: equal probabilities, specific ratios.
- **Genetics**: testing Mendelian ratios.
- **Quality control**: defect distribution.
- **Survey analysis**: response distribution vs expected.
- **Polling**: voting patterns.
**Use contingency chi-square for:**
- **Testing independence** of two categorical variables.
- **Cross-tabulation** analysis.
- **2-way classification** of data.
**Don't use for:**
- Continuous data (use t-test, ANOVA, regression).
- Very small expected frequencies (< 5).
- Paired data (use McNemar test).
- Ordinal data (consider rank tests).
**Effect size interpretation:**
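Cramér's V for contingency tables:
- 0.1: small effect
- 0.3: medium effect
- 0.5: large effect
These benchmarks apply when the smaller table dimension is 2; larger tables use lower thresholds.
For goodness-of-fit, effect size is less standardized; Cohen's w is one common choice.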
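As a rough sketch of how these effect sizes follow from a chi-square statistic: Cohen's w is the square root of χ²/N, and Cramér's V additionally divides by min(r − 1, c − 1). The function names are illustrative, and the 2×3 table values in the second usage line are hypothetical.

```python
import math


def cohens_w(chi2, n):
    """Cohen's w: sqrt(chi-square / total sample size)."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V for an n_rows x n_cols contingency table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Customer preference example from this page: chi-square = 27 with N = 200.
print(round(cohens_w(27, 200), 3))          # 0.367

# Hypothetical 2x3 table with chi-square = 12.0 and N = 150.
print(round(cramers_v(12.0, 150, 2, 3), 3))  # 0.283
```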
**Common applications:**
- **Mendel's experiments**: testing genetic theories.
- **Marketing**: testing if customer preferences match expected.
- **Polling**: testing demographic representation.
- **Quality control**: testing if defect rates match specs.
- **Healthcare**: testing if disease rates differ across groups.
- **Education**: testing if grade distribution matches expectation.
**Sample size considerations:**
- Larger samples → more power to detect small deviations.
- Expected frequencies need to be ≥5 for accuracy.
- For very small samples: Fisher exact (2×2) or simulation.
- For very large samples: even tiny differences become statistically significant.
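The simulation option mentioned above can be sketched as a Monte Carlo goodness-of-fit test: draw many samples of the same size from the hypothesized distribution and count how often the simulated statistic is at least as large as the observed one. This is an illustrative sketch using only the standard library; `monte_carlo_gof`, the number of simulations, the seed, and the small-sample counts are all assumptions made for the example.

```python
import random


def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def monte_carlo_gof(observed, probs, n_sims=5000, seed=0):
    """Estimate the goodness-of-fit p-value by simulating from the null."""
    rng = random.Random(seed)
    n = sum(observed)
    k = len(observed)
    expected = [n * p for p in probs]
    observed_stat = chi_square_stat(observed, expected)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        counts = [0] * k
        for category in rng.choices(range(k), weights=probs, k=n):
            counts[category] += 1
        # Small tolerance guards against floating-point ties.
        if chi_square_stat(counts, expected) >= observed_stat - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_sims + 1)


# Hypothetical small sample: 12 observations over 4 equally likely categories,
# so every expected count is 3 (below the usual threshold of 5).
p_value = monte_carlo_gof([6, 3, 2, 1], [0.25] * 4)
print(f"Monte Carlo p-value: {p_value:.3f}")
```

The estimate varies with the seed and the number of simulations; more simulations give a more stable p-value at the cost of run time.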
**Yates correction:**
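For 2×2 contingency tables with small samples, some recommend the Yates continuity correction (subtract 0.5 from |O − E| before squaring). This reduces the uncorrected test's tendency to overstate significance, but it may be overly conservative. Some software applies it by default to 2×2 tables regardless of sample size (for example, R's chisq.test and SciPy's chi2_contingency), so check the settings before reporting.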
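To make the correction concrete, here is a sketch for a 2×2 table that computes expected counts from the row and column totals and optionally subtracts 0.5 from each |O − E| before squaring. The function name `chi_square_2x2` and the table values are hypothetical.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a 2x2 table, optionally with the Yates correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            diff = abs(obs - exp)
            if yates:
                # Shrink each deviation by 0.5, but never below zero.
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / exp
    return stat


table = [[12, 5], [6, 14]]  # hypothetical counts
uncorrected = chi_square_2x2(table)
corrected = chi_square_2x2(table, yates=True)
print(f"uncorrected = {uncorrected:.3f}, Yates-corrected = {corrected:.3f}")
```

The corrected statistic is always less than or equal to the uncorrected one, which is why the correction makes the test more conservative.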
**Reporting:**
Standard format: "A chi-square test of [association/goodness-of-fit] revealed [significance], χ²(df) = X, p = Y."
Then describe the nature of the differences.
**Software tips:**
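- Excel: CHISQ.TEST(observed_range, expected_range) returns the p-value for goodness-of-fit.
- R: chisq.test(x = observed, p = expected_probabilities).
- Python: scipy.stats.chisquare(observed, expected), where the expected counts must sum to the same total as the observed counts.
- SPSS: Analyze → Nonparametric Tests → Legacy Dialogs → Chi-square.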
**Power analysis:**
To plan a sample size, specify:
- Effect size (Cohen's w or Cramér's V).
- df.
- Significance level (α).
- Target power (80% typical).
Software (G*Power, pwr in R) calculates required n.
**Common errors:**
- Using chi-square on percentages (need counts).
- Not specifying expected distribution.
- Small expected frequencies.
- Multiple testing without correction.
- Confusing observed and expected.
**Alternative tests:**
- **Fisher's exact**: 2×2 tables, small samples.
- **Likelihood ratio (G-test)**: alternative formulation of the same comparison.
- **McNemar**: paired binary outcomes.
- **Cochran's Q**: three or more related binary outcomes.
**Beyond basic chi-square:**
- **Three-way contingency tables**: log-linear models.
- **Continuous predictors with categorical outcomes**: logistic regression.
- **Multinomial outcomes**: extended chi-square.
- **Repeated measures**: McNemar, Cochran.
Common mistakes to avoid
- Using percentages instead of counts. Chi-square needs raw counts.
- Expected frequencies < 5. This reduces accuracy; pool categories or use Fisher's exact test.
- Forgetting to specify the expected distribution.
- Treating chi-square as a correlation measure. It tests association or fit, not strength.
- Reporting a significant test without an effect size or context.
- Using it on continuous data. That is the wrong test entirely.
- Multiple testing without correction.