# Naive Bayes Classifier
**Naive Bayes** is a classification algorithm based on **Bayes' Theorem**.
## Why "Naive"?
It is called "Naive" because it makes a simple assumption:
- **Assumption**: All features (predictors) are **independent** of each other.
- **Reality**: This is rarely true in real life, but the model still works surprisingly well.
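A minimal sketch of what the independence assumption buys us: the joint likelihood of several features factorizes into a simple product of per-feature likelihoods, so each one can be estimated separately. The feature names and probabilities below are hypothetical:

```python
from math import prod

# Hypothetical per-feature likelihoods P(feature_i | class) for one class.
per_feature = {"feature_1": 0.5, "feature_2": 0.3, "feature_3": 0.8}

# Under the naive (independence) assumption, the joint likelihood
# P(feature_1, feature_2, feature_3 | class) is just the product:
joint_likelihood = prod(per_feature.values())  # 0.5 * 0.3 * 0.8 ≈ 0.12
```

Without this assumption we would need to estimate the full joint distribution over all feature combinations, which grows exponentially with the number of features.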
## Bayes' Theorem
It calculates the probability of an event based on prior knowledge of conditions related to that event.

**Formula**:

`P(A|B) = (P(B|A) * P(A)) / P(B)`
- **P(A|B)**: **Posterior Probability** (Probability of class A given predictor B).
- **P(B|A)**: **Likelihood** (Probability of predictor B given class A).
- **P(A)**: **Prior Probability** (Probability of class A being true overall).
- **P(B)**: **Evidence** (Probability of predictor B occurring).
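Plugging the four terms together is a single multiplication and division. The numbers below are illustrative assumptions, not measured values:

```python
# Evaluate P(A|B) = (P(B|A) * P(A)) / P(B) with hypothetical numbers.
p_a = 0.15          # prior P(A) (hypothetical)
p_b_given_a = 0.8   # likelihood P(B|A) (hypothetical)
p_b = 0.2           # evidence P(B) (hypothetical)

p_a_given_b = (p_b_given_a * p_a) / p_b  # posterior P(A|B), here 0.6
```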
## Example: Spam Filtering
We want to label an email as **Spam** or **Ham** (Not Spam).
1. **Prior**: How common is spam overall? (e.g., 15% of emails are spam).
2. **Likelihood**: If an email is spam, how likely is it to contain the word "Money"?
3. **Evidence**: How common is the word "Money" in all emails?
4. **Posterior**: Given that the email contains "Money", what is the probability it is Spam?

We calculate this for every word in the email and pick the class with the highest probability.
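The steps above can be sketched end to end. Everything here is hypothetical: the per-word likelihoods, the toy emails, and the vocabulary; only the 15% spam prior comes from the example. Log-probabilities are used so that products of many small likelihoods do not underflow:

```python
import math

# Hypothetical per-class word likelihoods P(word | class).
likelihoods = {
    "spam": {"money": 0.30, "free": 0.20, "meeting": 0.01},
    "ham":  {"money": 0.02, "free": 0.05, "meeting": 0.25},
}
priors = {"spam": 0.15, "ham": 0.85}  # 15% of emails are spam (from the example)

def classify(words):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)  # start from the prior
        for w in words:
            # Ignore unseen words in this toy sketch; a real model would
            # apply smoothing (e.g. Laplace) instead of skipping them.
            if w in likelihoods[cls]:
                score += math.log(likelihoods[cls][w])
        scores[cls] = score
    # The evidence P(B) is identical for every class, so it can be
    # dropped when we only compare classes against each other.
    return max(scores, key=scores.get)

print(classify(["money", "free"]))  # -> spam
print(classify(["meeting"]))        # -> ham
```

Note that the evidence term never has to be computed: since it is the same for every class, the argmax over classes is unchanged when it is omitted.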