# Model Evaluation Metrics

How do we know if our classification model is good? We use several metrics.

## Confusion Matrix

A table that compares **Predicted** values with **Actual** values.

| | Predicted Negative | Predicted Positive |
|---|---|---|
| **Actual Negative** | True Negative (TN) | False Positive (FP) |
| **Actual Positive** | False Negative (FN) | True Positive (TP) |

- **TP**: Correctly predicted positive.
- **TN**: Correctly predicted negative.
- **FP**: Incorrectly predicted positive (Type I Error).
- **FN**: Incorrectly predicted negative (Type II Error).
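
As a quick illustration, here is a minimal sketch that builds this matrix with scikit-learn's `confusion_matrix` (assuming scikit-learn is installed; the labels below are made up for the example):

```python
# Toy example: compute the confusion matrix for made-up binary labels.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 0]  # actual labels (0 = negative, 1 = positive)
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]  # model's predictions

# With labels=[0, 1] the layout matches the table above: rows = actual, columns = predicted.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=3, FP=1, FN=1, TP=3
```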
## Key Metrics
### 1. Accuracy

- Fraction of **all** correct predictions.
- **Formula**: `(TP + TN) / Total`
- **Problem**: Not reliable when the classes are imbalanced (the Accuracy Paradox); see the sketch below.
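
A small sketch of the accuracy paradox, using made-up imbalanced data: a model that always predicts the majority class looks very accurate while finding no positives at all.

```python
# 95 negatives, 5 positives: always predicting "negative" scores 95% accuracy.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # degenerate model: always predicts the majority class

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)  # (TP + TN) / Total
print(accuracy)  # 0.95, even though every actual positive was missed
```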
### 2. Precision

- Out of all predicted positives, how many were actually positive?
- **Formula**: `TP / (TP + FP)`
- Higher is better: high precision means few false alarms (few FP).
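
A minimal sketch on the same made-up labels as in the confusion-matrix example above (scikit-learn assumed available as a cross-check):

```python
# Precision: of the 4 predicted positives, 3 were actually positive.
from sklearn.metrics import precision_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tp, fp = 3, 1                           # counts from the confusion matrix above
print(tp / (tp + fp))                   # 0.75, i.e. TP / (TP + FP)
print(precision_score(y_true, y_pred))  # 0.75, same value via scikit-learn
```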
### 3. Recall (Sensitivity / TPR)

- Out of all **actual** positives, how many did we find?
- **Formula**: `TP / (TP + FN)`
- Higher is better: high recall means few missed positives (few FN).
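
The same toy labels, now scored for recall (again just a sketch, with scikit-learn as a cross-check):

```python
# Recall: of the 4 actual positives, 3 were found.
from sklearn.metrics import recall_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tp, fn = 3, 1                        # counts from the confusion matrix above
print(tp / (tp + fn))                # 0.75, i.e. TP / (TP + FN)
print(recall_score(y_true, y_pred))  # 0.75 via scikit-learn
```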
### 4. Specificity

- Out of all **actual** negatives, how many did we correctly identify?
- **Formula**: `TN / (TN + FP)`
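
A sketch for specificity on the same toy labels. As far as I know scikit-learn has no dedicated specificity function, but it equals recall computed on the negative class:

```python
# Specificity: of the 4 actual negatives, 3 were correctly identified.
from sklearn.metrics import recall_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tn, fp = 3, 1                                     # counts from the confusion matrix above
print(tn / (tn + fp))                             # 0.75, i.e. TN / (TN + FP)
print(recall_score(y_true, y_pred, pos_label=0))  # 0.75, recall of the negative class
```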
### 5. F1 Score

- The harmonic mean of Precision and Recall.
- Good for balancing precision and recall, especially with uneven classes.
- **Formula**: `2 * (Precision * Recall) / (Precision + Recall)`
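
A sketch of the F1 score on the same toy labels; with precision and recall both at 0.75, the harmonic mean is also 0.75:

```python
# F1: harmonic mean of precision and recall.
from sklearn.metrics import f1_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

precision, recall = 0.75, 0.75  # values from the sketches above
print(2 * (precision * recall) / (precision + recall))  # 0.75
print(f1_score(y_true, y_pred))                         # 0.75 via scikit-learn
```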
## ROC and AUC
### ROC Curve (Receiver Operating Characteristic)

- A plot of **TPR (Recall)** against **FPR (False Positive Rate)**, where `FPR = FP / (FP + TN)` (equivalently, 1 - Specificity).
- Shows how the model trades true positives against false positives as the decision threshold varies; see the sketch below.
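
A sketch of the points on an ROC curve. Note that `roc_curve` needs predicted scores (e.g. probabilities), not hard 0/1 labels; the scores below are invented for illustration.

```python
# Sweep thresholds over made-up scores and print the resulting (FPR, TPR) points.
from sklearn.metrics import roc_curve

y_true  = [0, 0, 0, 1, 1, 1, 1, 0]
y_score = [0.1, 0.6, 0.3, 0.8, 0.4, 0.9, 0.7, 0.2]  # illustrative predicted probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```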
### AUC (Area Under the Curve)

- Measures the entire area underneath the ROC curve.
- **Range**: 0 to 1, where 0.5 is no better than random guessing.
- **Interpretation**: Higher AUC means the model is better at distinguishing between classes, across all thresholds.
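
And the corresponding AUC on the same invented scores (a sketch; `roc_auc_score` is the usual scikit-learn helper):

```python
# AUC: probability that a random positive gets a higher score than a random negative.
from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 0, 1, 1, 1, 1, 0]
y_score = [0.1, 0.6, 0.3, 0.8, 0.4, 0.9, 0.7, 0.2]

print(roc_auc_score(y_true, y_score))  # 0.9375 for these toy scores
```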