# Naive Bayes Classifier
**Naive Bayes** is a classification algorithm based on **Bayes' Theorem**.
## Why "Naive"?
It is called "Naive" because it makes a simple assumption:
- **Assumption**: All features (predictors) are **independent** of each other.
- **Reality**: This is rarely true in real life, but the model still works surprisingly well.
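A minimal sketch of what the independence assumption buys us: the joint likelihood of several features factorizes into a simple product of per-feature likelihoods, so each one can be estimated separately. The feature names and probabilities below are hypothetical:

```python
from math import prod

# Hypothetical per-feature likelihoods P(feature_i | class) for one class.
per_feature = {"feature_1": 0.5, "feature_2": 0.3, "feature_3": 0.8}

# Under the naive (independence) assumption, the joint likelihood
# P(feature_1, feature_2, feature_3 | class) is just the product:
joint_likelihood = prod(per_feature.values())  # 0.5 * 0.3 * 0.8 ≈ 0.12
```

Without this assumption we would need to estimate the full joint distribution over all feature combinations, which grows exponentially with the number of features.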
## Bayes' Theorem
It calculates the probability of an event based on prior knowledge of conditions related to that event.

**Formula**:

`P(A|B) = (P(B|A) * P(A)) / P(B)`
- **P(A|B)**: **Posterior Probability** (Probability of class A given predictor B).
- **P(B|A)**: **Likelihood** (Probability of predictor B given class A).
- **P(A)**: **Prior Probability** (Probability of class A being true overall).
- **P(B)**: **Evidence** (Probability of predictor B occurring).
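Plugging the four terms together is a single multiplication and division. The numbers below are illustrative assumptions, not measured values:

```python
# Evaluate P(A|B) = (P(B|A) * P(A)) / P(B) with hypothetical numbers.
p_a = 0.15          # prior P(A) (hypothetical)
p_b_given_a = 0.8   # likelihood P(B|A) (hypothetical)
p_b = 0.2           # evidence P(B) (hypothetical)

p_a_given_b = (p_b_given_a * p_a) / p_b  # posterior P(A|B), here 0.6
```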
## Example: Spam Filtering
We want to label an email as **Spam** or **Ham** (Not Spam).
1. **Prior**: How common is spam overall? (e.g., 15% of emails are spam).
2. **Likelihood**: If an email is spam, how likely is it to contain the word "Money"?
3. **Evidence**: How common is the word "Money" in all emails?
4. **Posterior**: Given that the email contains "Money", what is the probability it is Spam?

We calculate this for every word in the email and pick the class with the highest probability.
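The steps above can be sketched end to end. Everything here is hypothetical: the per-word likelihoods, the toy emails, and the vocabulary; only the 15% spam prior comes from the example. Log-probabilities are used so that products of many small likelihoods do not underflow:

```python
import math

# Hypothetical per-class word likelihoods P(word | class).
likelihoods = {
    "spam": {"money": 0.30, "free": 0.20, "meeting": 0.01},
    "ham":  {"money": 0.02, "free": 0.05, "meeting": 0.25},
}
priors = {"spam": 0.15, "ham": 0.85}  # 15% of emails are spam (from the example)

def classify(words):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)  # start from the prior
        for w in words:
            # Ignore unseen words in this toy sketch; a real model would
            # apply smoothing (e.g. Laplace) instead of skipping them.
            if w in likelihoods[cls]:
                score += math.log(likelihoods[cls][w])
        scores[cls] = score
    # The evidence P(B) is identical for every class, so it can be
    # dropped when we only compare classes against each other.
    return max(scores, key=scores.get)

print(classify(["money", "free"]))  # -> spam
print(classify(["meeting"]))        # -> ham
```

Note that the evidence term never has to be computed: since it is the same for every class, the argmax over classes is unchanged when it is omitted.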