# Naive Bayes Classifier

**Naive Bayes** is a classification algorithm based on **Bayes' Theorem**.

## Why "Naive"?

It is called "Naive" because it makes one simplifying assumption:

- **Assumption**: All features (predictors) are **independent** of each other.
- **Reality**: This is rarely true in practice, but the model still works surprisingly well.
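Concretely, the independence assumption lets the model replace one hard-to-estimate joint likelihood with a product of simple per-feature likelihoods:

`P(x1, x2, ..., xn | Class) = P(x1 | Class) * P(x2 | Class) * ... * P(xn | Class)`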
## Bayes' Theorem

Bayes' Theorem calculates the probability of an event based on prior knowledge of conditions related to that event.

**Formula**:

`P(A|B) = (P(B|A) * P(A)) / P(B)`

- **P(A|B)**: **Posterior Probability** (probability of class A given predictor B).
- **P(B|A)**: **Likelihood** (probability of predictor B given class A).
- **P(A)**: **Prior Probability** (overall probability of class A).
- **P(B)**: **Evidence** (probability of predictor B occurring).
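To make the formula concrete, here is a minimal Python sketch that plugs in made-up numbers; every probability below is an assumption chosen purely for illustration.

```python
# Minimal sketch of Bayes' Theorem with made-up numbers (all values below
# are assumptions for illustration, not real statistics).

p_a = 0.15           # P(A): prior probability of class A
p_b_given_a = 0.40   # P(B|A): likelihood of predictor B given class A
p_b = 0.10           # P(B): evidence, overall probability of predictor B

# Posterior: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = (p_b_given_a * p_a) / p_b
print(p_a_given_b)   # ~0.6 -> about a 60% chance of class A once B is observed
```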
## Example: Spam Filtering

We want to label an email as **Spam** or **Ham** (Not Spam).

1. **Prior**: How common is spam overall? (e.g., 15% of emails are spam).
2. **Likelihood**: If an email is spam, how likely is it to contain the word "Money"?
3. **Evidence**: How common is the word "Money" in all emails?
4. **Posterior**: Given that the email contains "Money", what is the probability that it is Spam?

We calculate these probabilities for every word in the email and pick the class with the highest overall probability.
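The same calculation can be sketched in code. The toy training set, whitespace tokenization, and add-one smoothing below are simplifying assumptions for illustration, not a production spam filter:

```python
# Minimal word-based Naive Bayes spam filter, written from scratch.
# The toy training data and whitespace tokenization are assumptions for
# illustration only; a real filter would be trained on a large corpus.
import math
from collections import Counter

train = [
    ("win money now", "spam"),
    ("money money prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch plans for the team", "ham"),
]

# Priors: P(class) estimated from class frequencies.
class_counts = Counter(label for _, label in train)
total_docs = sum(class_counts.values())

# Likelihoods: per-class word counts.
word_counts = {c: Counter() for c in class_counts}
for text, label in train:
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def posterior_scores(text):
    """Return log P(class) + sum of log P(word | class) for each class."""
    scores = {}
    for c in class_counts:
        score = math.log(class_counts[c] / total_docs)  # log prior
        total_words = sum(word_counts[c].values())
        for w in text.split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the product.
            likelihood = (word_counts[c][w] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[c] = score
    return scores

scores = posterior_scores("claim your money")
print(max(scores, key=scores.get))  # picks the class with the highest score ('spam' here)
```

Working with log-probabilities turns the product of per-word likelihoods into a sum, which avoids numerical underflow on long emails.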