What is Bayes' theorem?

Bayes' theorem calculates the probability of an event from prior knowledge of conditions related to that event. The formula is P(A|B) = P(B|A) × P(A) / P(B). It lets you update beliefs as new evidence arrives, which makes it fundamental to Bayesian statistics and many machine learning algorithms.

What is a real-world example?

Medical testing: suppose a disease affects 1% of people (P(A) = 0.01), the test detects 90% of true cases (P(B|A) = 0.9), and 5% of all tests come back positive (P(B) = 0.05). Then P(disease | positive test) = 0.9 × 0.01 / 0.05 = 0.18, or 18%. The result is surprisingly low because the condition is rare; expecting something close to 90% is the "base rate fallacy."
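
As a rough sketch, the calculation above fits in a few lines of Python. The function name `bayes_posterior` and its argument names are illustrative choices, not part of any library.

```python
def bayes_posterior(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Medical-testing numbers from the example above.
posterior = bayes_posterior(p_b_given_a=0.9, p_a=0.01, p_b=0.05)
print(f"P(disease | positive) = {posterior:.2%}")  # prints 18.00%
```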

What's the base rate fallacy?

The base rate fallacy is focusing on the accuracy of the evidence (P(B|A)) while ignoring how common the condition is (P(A)). With rare conditions, even highly accurate tests have surprisingly low positive predictive value. Example: for a 1-in-10,000 disease, a test with 99% sensitivity and 99% specificity still produces positives of which about 99% are false.
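
The 1-in-10,000 figure can be checked with a short sketch. It takes a "99% accurate" test to mean 99% sensitivity and a 1% false positive rate, which is an assumption about the wording; the function name is illustrative.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(condition | positive), using the law of total probability for P(positive)."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed reading of "99% accurate": sensitivity 0.99, false positive rate 0.01.
ppv = positive_predictive_value(
    prevalence=1 / 10_000, sensitivity=0.99, false_positive_rate=0.01
)
print(f"PPV = {ppv:.4f}; share of positives that are false = {1 - ppv:.4f}")
```

Under these assumptions the positive predictive value is just under 1%, so roughly 99 of every 100 positive results are false.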

How is Bayes used in machine learning?

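In several ways. A Naive Bayes classifier computes P(class | features) with Bayes' theorem, assuming the features are independent given the class. Bayesian neural networks place probability distributions over the weights instead of using point estimates. Bayesian optimization fits a probabilistic surrogate model to an expensive-to-evaluate function and uses it to choose the next point to try. Bayesian deep learning combines Bayesian reasoning with neural networks for uncertainty quantification.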

How do I choose a prior?

Several approaches: (1) Uninformative/flat priors: equal probability for all values. (2) Weakly informative priors: encode only minimal information, such as a plausible range. (3) Informative priors: based on previous studies or domain knowledge. (4) Empirical Bayes: the prior is estimated from the data. (5) Sensitivity analysis: rerun the analysis under different priors to check that the conclusions are robust. For most applications, weakly informative priors with sensitivity analysis are recommended.

What's the difference between Bayesian and frequentist?

Bayesian: parameters are random variables with distributions; uses prior knowledge; produces credible intervals (a direct probability statement that the parameter lies in a range). Frequentist: parameters are fixed but unknown; uses only the data; produces confidence intervals (interpreted through long-run coverage over repeated samples). Both are legitimate; the choice depends on context, available knowledge, and goals.

What is a Bayes factor?

A Bayes factor is the ratio of the probability of the data under one hypothesis to its probability under another: BF = P(data|H₁)/P(data|H₀). It converts prior odds into posterior odds. BF > 1 favors H₁. Rule-of-thumb strength labels: 1-3 weak, 3-10 moderate, 10-30 strong, 30-100 very strong, >100 decisive. It serves as a Bayesian alternative to the p-value.

Bayes Theorem Calculator

Enter the probability of B given A, the prior probability of A, and the probability of B to compute the posterior probability P(A|B) using Bayes' theorem.

Bayes' theorem is the mathematical foundation for updating beliefs with new evidence. Given a prior belief about something (P(A)), and the probability of seeing evidence (B) given that belief is true (P(B|A)), Bayes' theorem calculates the updated belief after observing the evidence (P(A|B)). This sounds abstract but it's the basis of medical test interpretation, spam filtering, machine learning, scientific reasoning, and legal evidence evaluation.

This calculator returns P(A|B) given P(B|A), P(A), and P(B). The famous medical testing example: if a disease affects 1% of the population (prior), the test detects 90% of true cases (likelihood), and 5% of all tests are positive (evidence), then P(disease | positive test) = 0.9 × 0.01 / 0.05 = 18%, far lower than the 90% figure might suggest, because of the low prior. Reading the 90% as the chance of disease is the "base rate fallacy" in action.

Bayes' theorem is the bridge between probability and statistical inference. Frequentist statistics uses fixed parameters and random data; Bayesian statistics treats parameters as random variables with prior distributions, updated by data through Bayes' theorem. This is why Bayesian methods are popular in machine learning, decision science, and contemporary statistical practice.

Inputs

P(B|A) — Probability of B given A

P(A) — Prior probability of A

P(B) — Probability of B

Results

- P(A|B): 0.180000
- P(A|B) as %: 18.00%
- P(not A|B): 0.820000
- Likelihood Ratio: 21.7317
- Prior Odds: 0.010101
- Posterior Odds: 0.219512
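
The result values shown are consistent with the inputs P(B|A) = 0.9, P(A) = 0.01, and P(B) = 0.05. The page does not say how the calculator derives the secondary figures; the sketch below is one way that reproduces them, recovering P(B|not A) from the law of total probability. The function and key names are illustrative.

```python
def derived_quantities(p_b_given_a, p_a, p_b):
    """Posterior plus odds-form quantities from the three calculator inputs."""
    posterior = p_b_given_a * p_a / p_b
    # From P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)), solve for P(B|not A).
    p_b_given_not_a = (p_b - p_b_given_a * p_a) / (1 - p_a)
    return {
        "posterior": posterior,
        "not_posterior": 1 - posterior,
        "likelihood_ratio": p_b_given_a / p_b_given_not_a,
        "prior_odds": p_a / (1 - p_a),
        "posterior_odds": posterior / (1 - posterior),
    }


results = derived_quantities(0.9, 0.01, 0.05)
for name, value in results.items():
    print(f"{name}: {value:.6f}")
```

Note that the posterior odds equal the prior odds multiplied by the likelihood ratio, which is a quick consistency check on the output.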

Last updated: May 29, 2026

Formula

**Bayes' theorem:**

P(A|B) = (P(B|A) × P(A)) / P(B)

Where:

- **P(A|B)**: posterior probability of A given B (what we want)
- **P(B|A)**: likelihood, probability of evidence B given A is true
- **P(A)**: prior probability of A
- **P(B)**: probability of evidence B

**Extended form (using law of total probability):**

P(B) = P(B|A) × P(A) + P(B|not A) × P(not A)

So Bayes can be written:

P(A|B) = (P(B|A) × P(A)) / (P(B|A) × P(A) + P(B|not A) × P(not A))

**Medical testing example:**

- P(disease) = 0.01 (1% of the population)
- P(positive | disease) = 0.90 (90% sensitivity)
- P(positive | no disease) = 0.05 (5% false positive rate)

P(positive) = 0.90 × 0.01 + 0.05 × 0.99 = 0.009 + 0.0495 = 0.0585

P(disease | positive) = (0.90 × 0.01) / 0.0585 = 0.009 / 0.0585 = 0.154 = **15.4%**

Even with a positive test, there is only a 15.4% chance of actually having the disease. This is the **base rate fallacy**: people focus on test accuracy and forget the rarity of the disease.

**Worked example: spam filter**

- P(email is spam) = 0.30 (30% of emails are spam)
- P(contains "free" | spam) = 0.80
- P(contains "free" | not spam) = 0.10

P(contains "free") = 0.80 × 0.30 + 0.10 × 0.70 = 0.24 + 0.07 = 0.31

P(spam | contains "free") = (0.80 × 0.30) / 0.31 = 0.774 = 77.4%

An email containing "free" is 77.4% likely to be spam.

**Key Bayesian concepts:**

| Term | Symbol | Definition |
|---|---|---|
| Prior | P(A) | Belief before evidence |
| Likelihood | P(B\|A) | Probability of evidence given A |
| Evidence | P(B) | Probability of evidence |
| Posterior | P(A\|B) | Updated belief |

**Common applications:**

| Field | Application |
|---|---|
| Medicine | Test interpretation, diagnosis |
| Email | Spam filtering |
| Search | Query relevance |
| Forensics | DNA matching, evidence |
| Machine learning | Naive Bayes classifier |
| Insurance | Risk assessment |
| Marketing | Customer behavior prediction |
| Legal | Evidence evaluation |
| Scientific | Hypothesis updating |

**Base rate fallacy:**

The medical test example demonstrates how people often ignore prior probabilities. Even highly accurate tests can have surprisingly low positive predictive value when the underlying condition is rare.

**Bayesian updating:**

Sequential Bayes updating:

1. Start with prior.
2. Observe evidence.
3. Calculate posterior.
4. New evidence: posterior becomes new prior.
5. Repeat.

This is how learning happens: each piece of new evidence updates beliefs.

**Bayesian vs frequentist:**

| Bayesian | Frequentist |
|---|---|
| Parameters are random with distributions | Parameters are fixed |
| Uses prior knowledge | Uses only data |
| Credible intervals (probability about parameter) | Confidence intervals (frequentist meaning) |
| Posterior probability | P-values |
| Updates with each observation | Test on accumulated data |

**Posterior odds form:**

Posterior odds = Prior odds × Bayes factor

Bayes factor = P(B|A) / P(B|not A)

A Bayes factor > 1 favors A; < 1 favors not A.

Strength of evidence:

- 1-3: weak evidence
- 3-10: moderate evidence
- 10-30: strong evidence
- 30-100: very strong
- 100+: decisive

**Prior selection:**

Choosing priors:

- **Uninformative (flat)**: equal probability for all values.
- **Weakly informative**: slight tilt based on minimal information.
- **Informative**: based on previous studies or domain knowledge.
- **Empirical Bayes**: data-driven.
- **Hierarchical**: combines data and prior.

**Bayes in machine learning:**

- **Naive Bayes classifier**: assumes features independent given class.
- **Bayesian neural networks**: weights have distributions.
- **Bayesian optimization**: model expensive functions.
- **Topic models** (LDA): Bayesian models of text.

**Common misuses:**

- **Confusing P(A|B) with P(B|A)**: prosecutor's fallacy.
- **Ignoring base rate**: focusing on accuracy of evidence.
- **Choosing biased priors**: misleading results.
- **Calculating without all components**: requires complete distribution.

**Bayesian decision theory:**

Bayes' theorem updates beliefs; decisions then minimize expected loss:

Optimal action = arg min Σ Loss(action, state) × P(state | data)

This combines probabilities with a utility (loss) function.
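
The expected-loss rule can be sketched as follows. The posterior uses the 15.4% figure from the medical testing example; the loss values and action names are made up purely for illustration.

```python
def best_action(posterior, loss):
    """Return the action with the lowest posterior expected loss.

    posterior: dict mapping state -> P(state | data)
    loss: dict mapping action -> dict of state -> loss
    """
    expected = {
        action: sum(loss[action][state] * p for state, p in posterior.items())
        for action in loss
    }
    return min(expected, key=expected.get), expected


posterior = {"disease": 0.154, "healthy": 0.846}
# Hypothetical losses: treating a healthy person costs 10, missing a disease costs 100.
loss = {
    "treat": {"disease": 0, "healthy": 10},
    "wait": {"disease": 100, "healthy": 0},
}
action, expected = best_action(posterior, loss)
print(action, expected)
```

With these made-up losses, treating has the lower expected loss (8.46 versus 15.4) even though disease is the less likely state, which is the point of combining probabilities with losses.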

How to use this calculator

Enter P(B|A): probability of B given A.
Enter P(A): prior probability of A.
Enter P(B): probability of B occurring.
The calculator returns P(A|B).
For test interpretation: P(A|B) is the positive predictive value.
Check that each input lies between 0 and 1 and that P(B|A) × P(A) does not exceed P(B); otherwise the result would be greater than 1.
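
A minimal sketch of consistency checks one might run on the three inputs before trusting the result. The function name and messages are hypothetical; the constraints themselves follow from P(A and B) = P(B|A) × P(A) and the law of total probability.

```python
def validate_inputs(p_b_given_a, p_a, p_b):
    """Return a list of problems with the three inputs (empty if none found)."""
    problems = []
    for name, p in (("P(B|A)", p_b_given_a), ("P(A)", p_a), ("P(B)", p_b)):
        if not 0 <= p <= 1:
            problems.append(f"{name} must lie between 0 and 1")
    joint = p_b_given_a * p_a  # P(A and B)
    if p_b == 0:
        problems.append("P(B) must be positive, otherwise P(A|B) is undefined")
    elif joint > p_b:
        # P(A and B) can never exceed P(B).
        problems.append("P(B|A) x P(A) exceeds P(B), so P(A|B) would be above 1")
    elif p_b > joint + (1 - p_a):
        # Would require P(B|not A) > 1.
        problems.append("P(B) is too large for the given P(B|A) and P(A)")
    return problems


print(validate_inputs(0.9, 0.01, 0.05))  # consistent inputs
print(validate_inputs(0.9, 0.5, 0.05))   # inconsistent: 0.45 > 0.05
```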

Worked examples

Medical test interpretation

**Scenario:** A disease affects 1% of the population. The test has 90% sensitivity and a 5% false positive rate. If you test positive, what is the probability of disease?

**Calculation:** P(disease) = 0.01. P(positive | disease) = 0.90. P(positive | no disease) = 0.05. P(positive) = 0.90 × 0.01 + 0.05 × 0.99 = 0.0585. P(disease | positive) = (0.90 × 0.01) / 0.0585 = 0.154.

**Result:** Only a 15.4% chance of actually having the disease, despite the positive test. This is the classic base rate fallacy: test accuracy alone (90%) is misleading when the condition is rare.

Spam filter

**Scenario:** 30% of emails are spam. P(contains "discount" | spam) = 0.70. P(contains "discount" | not spam) = 0.05.

**Calculation:** P(contains "discount") = 0.70 × 0.30 + 0.05 × 0.70 = 0.245. P(spam | contains "discount") = (0.70 × 0.30) / 0.245 = 0.857.

**Result:** An email with "discount" is 85.7% likely to be spam. A single keyword is already a useful signal for a spam classifier, and combining many such keywords improves accuracy dramatically.
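
To illustrate how combining keywords sharpens the estimate, here is a small Naive Bayes sketch. It assumes the words occur independently given the class (the "naive" assumption), and it borrows the "free" likelihoods (0.80 and 0.10) from the spam example in the Formula section. The function name is illustrative.

```python
def spam_posterior(prior_spam, word_likelihoods):
    """P(spam | words present), assuming words are independent given the class.

    word_likelihoods: list of (P(word | spam), P(word | not spam)) pairs.
    """
    spam_score = prior_spam
    ham_score = 1 - prior_spam
    for p_word_spam, p_word_ham in word_likelihoods:
        spam_score *= p_word_spam
        ham_score *= p_word_ham
    return spam_score / (spam_score + ham_score)


one_word = spam_posterior(0.30, [(0.70, 0.05)])                 # "discount" only
two_words = spam_posterior(0.30, [(0.70, 0.05), (0.80, 0.10)])  # "discount" and "free"
print(f"{one_word:.3f} -> {two_words:.3f}")
```

Under the independence assumption, seeing both words raises the spam probability from about 0.857 to about 0.980.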

Quality control

**Scenario:** 2% of parts are defective. Inspection has a 95% detection rate and a 1% false positive rate. What is P(actually defective | flagged by inspection)?

**Calculation:** P(flagged | defective) = 0.95. P(defective) = 0.02. P(flagged) = 0.95 × 0.02 + 0.01 × 0.98 = 0.019 + 0.0098 = 0.0288. P(defective | flagged) = 0.019 / 0.0288 = 0.66.

**Result:** Flagged parts are 66% likely to be defective. With a low base rate, even good inspection produces 34% false flags, so flagged parts should be investigated before disposal.
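
The same result can be reached by counting parts rather than multiplying probabilities, which some readers find easier to follow. The batch size of 10,000 is an arbitrary choice for illustration, and the function name is hypothetical.

```python
def natural_frequencies(population, base_rate, detection_rate, false_positive_rate):
    """Expected counts of correctly and wrongly flagged parts in a batch."""
    defective = population * base_rate
    good = population - defective
    true_flags = defective * detection_rate
    false_flags = good * false_positive_rate
    return true_flags, false_flags


true_flags, false_flags = natural_frequencies(10_000, 0.02, 0.95, 0.01)
share_defective = true_flags / (true_flags + false_flags)
print(f"{true_flags:.0f} true flags, {false_flags:.0f} false flags, "
      f"{share_defective:.1%} of flagged parts defective")
```

Out of 10,000 parts, about 190 defective parts and 98 good parts are flagged, so roughly 190 of 288 flags (66%) are real defects.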

When to use this calculator

**Use Bayes' theorem for:**

- **Medical test interpretation**: positive predictive value.
- **Spam filtering**: classification by content.
- **Diagnostic testing**: evaluating positive results.
- **Forensic analysis**: DNA matching.
- **Marketing**: customer probability scoring.
- **Insurance**: risk assessment with prior.
- **Machine learning**: Naive Bayes classifiers.
- **Decision making**: combining prior knowledge with evidence.
- **Hypothesis testing**: Bayesian alternatives to p-values.

**Core insight:**

P(A|B) depends on:

1. P(A): how common A is overall (prior).
2. P(B|A): how often B occurs when A is true (sensitivity).
3. P(B): how often B occurs in general (which depends on rate of false positives).

**Base rate fallacy:**

When P(A) (prior) is very low, even highly accurate evidence (P(B|A) close to 1) doesn't make P(A|B) close to 1. The denominator includes substantial P(B|not A) × P(not A), keeping posterior low.

The maxim "Extraordinary claims require extraordinary evidence" is an implication of Bayes: a very low prior needs a very large likelihood ratio to overcome it.
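
To see how strongly the prior drives the posterior, the sketch below holds the test fixed and varies only P(A). The test characteristics (99% sensitivity, 1% false positive rate) are assumed values chosen for illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(A|B) from the extended form of Bayes' theorem."""
    numerator = sensitivity * prior
    return numerator / (numerator + false_positive_rate * (1 - prior))


# Same (assumed) test throughout; only the prior changes.
for prior in (0.0001, 0.001, 0.01, 0.1, 0.5):
    print(f"prior = {prior:<7} posterior = {posterior(prior, 0.99, 0.01):.3f}")
```

With these numbers the posterior climbs from about 1% at a prior of 1 in 10,000 to 50% at a prior of 1%, and only reaches 99% when the prior is already 50%.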

**Updating beliefs:**

When new evidence arrives:

1. Old posterior becomes new prior.
2. New evidence applied via Bayes.
3. New posterior reflects updated belief.

This is the essence of Bayesian learning.
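
The updating loop can be sketched with the medical test numbers used elsewhere on this page (1% prior, 90% sensitivity, 5% false positive rate). The sketch assumes two positive results that are independent given the true disease status, which is a simplification; repeat tests on the same person are often correlated.

```python
def update(prior, p_evidence_given_a, p_evidence_given_not_a):
    """One Bayes update; returns P(A | evidence)."""
    numerator = p_evidence_given_a * prior
    return numerator / (numerator + p_evidence_given_not_a * (1 - prior))


belief = 0.01  # starting prior: 1% prevalence
history = []
for test_number in (1, 2):
    # Each positive result uses the previous posterior as its prior.
    belief = update(belief, 0.90, 0.05)
    history.append(belief)
    print(f"after positive test {test_number}: {belief:.3f}")
```

Under that independence assumption, one positive result gives about 0.154 and a second raises it to about 0.766.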

**Bayes factor:**

The "weight of evidence" provided by data:

BF = P(B|A) / P(B|not A)

Update odds: post-odds = prior-odds × BF.

Strength:

- BF > 100: decisive
- 10-100: strong
- 3-10: moderate
- 1-3: weak
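
A short sketch of the odds update, together with a helper that applies the strength labels listed above. Both function names are illustrative, and how to treat values exactly on a boundary (for example BF = 10) is a choice made here, not something the scale specifies.

```python
def posterior_from_bayes_factor(prior_prob, bayes_factor):
    """Convert a prior probability and a Bayes factor into a posterior probability."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)


def evidence_strength(bayes_factor):
    """Label a Bayes factor using the coarse scale quoted above."""
    if bayes_factor > 100:
        return "decisive"
    if bayes_factor >= 10:
        return "strong"
    if bayes_factor >= 3:
        return "moderate"
    if bayes_factor >= 1:
        return "weak"
    return "favors not A"


bf = 0.90 / 0.05  # P(B|A) / P(B|not A) from the medical test example
post = posterior_from_bayes_factor(0.01, bf)
print(f"BF = {bf:.0f} ({evidence_strength(bf)}), posterior = {post:.3f}")
```

A Bayes factor of 18 counts as strong evidence on this scale, yet the posterior is still only about 0.154 because the prior odds are 1 to 99.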

**Selecting priors:**

Issue: priors are subjective.

Approaches:

- **Reference priors**: minimally informative.
- **Empirical Bayes**: data-driven priors.
- **Hierarchical models**: combine priors with data.
- **Sensitivity analysis**: check robustness to prior choice.

**Common errors:**

- **Confusing P(A|B) with P(B|A)** (prosecutor's fallacy).
- **Ignoring base rate**.
- **Choosing biased priors**.
- **Forgetting to include all alternatives**.
- **Confusing prior with frequency**.

**Real-world applications:**

1. **Cancer screening**: because the base rate in a screened population is low, a positive result on its own does not mean the person has cancer.

2. **Drug testing**: positive drug screen needs Bayesian interpretation given base rate.

3. **Court evidence**: DNA evidence requires Bayesian reasoning to translate match probabilities to probability of identity.

4. **Spam filtering**: Naive Bayes classifies emails by word frequencies.

5. **Spelling correction**: probability of intended word given typed word.

6. **Recommendations**: Netflix, Amazon use Bayesian reasoning.

**Software:**

- **Excel**: Formula calculation.
- **R**: Bayesian packages (rstan, brms, rjags as an interface to JAGS).
- **Python**: PyMC, scipy.stats, scikit-learn Naive Bayes.
- **Statistical software**: dedicated Bayesian tools.

**Bayesian vs frequentist:**

Both legitimate; choose based on context:

| Choose Bayesian when... | Choose frequentist when... |
|---|---|
| Prior knowledge available | No prior or want pure data analysis |
| Need probability about parameters | Standard tests sufficient |
| Sequential updating needed | Single analysis OK |
| Decision under uncertainty | Standard hypothesis testing |
| Complex models | Simple models, standard data |

**Key Bayesian distributions:**

- **Beta**: prior for proportions.
- **Gamma**: prior for rates.
- **Normal**: prior for means.
- **Dirichlet**: prior for multinomial.

These are conjugate priors for common likelihoods: the posterior stays in the same family as the prior, so the update has a closed form.
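
As a small illustration of conjugacy, a Beta prior on a proportion combined with binomial data gives a Beta posterior whose parameters are simple sums. The data (7 successes in 10 trials) and the flat Beta(1, 1) prior are hypothetical; the function names are illustrative.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior plus binomial data -> Beta posterior parameters."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: flat Beta(1, 1) prior, then 7 successes and 3 failures.
alpha_post, beta_post = beta_binomial_update(1, 1, successes=7, failures=3)
print(alpha_post, beta_post, round(beta_mean(alpha_post, beta_post), 4))  # 8 4 0.6667
```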

**Modern Bayesian methods:**

- **MCMC (Markov Chain Monte Carlo)**: drawing samples from posteriors.
- **Variational inference**: approximating posteriors.
- **Hamiltonian Monte Carlo**: more efficient sampling.
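
To make the idea of MCMC concrete, here is a deliberately tiny random-walk Metropolis sampler, the simplest member of that family. It targets the posterior of a proportion under a flat prior with hypothetical data (7 successes, 3 failures), where the exact posterior mean is 8/12, so the sample mean can be checked. The step size, sample count, and burn-in length are arbitrary choices; real work would use a library such as PyMC.

```python
import math
import random


def log_posterior(theta, successes=7, failures=3):
    """Unnormalised log posterior: flat prior on (0, 1) times binomial likelihood."""
    if not 0 < theta < 1:
        return -math.inf
    return successes * math.log(theta) + failures * math.log(1 - theta)


def metropolis(n_samples, step=0.2, start=0.5, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta, samples = start, []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0, step)
        log_ratio = log_posterior(proposal) - log_posterior(theta)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        samples.append(theta)
    return samples


samples = metropolis(20_000)
kept = samples[2_000:]  # discard burn-in
sample_mean = sum(kept) / len(kept)
print(f"sample mean = {sample_mean:.3f} (exact posterior mean is 8/12)")
```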

**Practical Bayes:**

For most users:

- Use software with built-in Bayesian methods.
- Sensitivity analysis on prior choice.
- Report Bayes factors or posterior probabilities.
- Compare with frequentist results.
- Be transparent about assumptions.

Common mistakes to avoid

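Confusing P(A|B) with P(B|A). Always check which conditional probability you have.
Ignoring the base rate, P(A). This is common in medical test interpretation.
Computing without all components. You need the prior, the likelihood, and the evidence.
Committing the prosecutor's fallacy. The probability of the evidence given guilt is not the probability of guilt given the evidence.
Forgetting that complementary probabilities must sum to 1: P(A) + P(not A) = 1 and P(A|B) + P(not A|B) = 1.
Using inappropriate priors. The prior can change the result significantly.
Applying the calculator to the wrong kind of data. It expects probabilities of discrete events; continuous quantities need the density form of Bayes' theorem.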
