# Model Evaluation Metrics

How do we know if our classification model is good? We use several metrics.

## Confusion Matrix

A table that compares **Predicted** values with **Actual** values.

| | Predicted Negative | Predicted Positive |
|---|---|---|
| **Actual Negative** | True Negative (TN) | False Positive (FP) |
| **Actual Positive** | False Negative (FN) | True Positive (TP) |

- **TP**: Correctly predicted positive.
- **TN**: Correctly predicted negative.
- **FP**: Incorrectly predicted positive (Type I Error).
- **FN**: Incorrectly predicted negative (Type II Error).
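
As a quick illustration, here is a minimal sketch that builds this matrix with scikit-learn's `confusion_matrix` (assuming scikit-learn is installed; the labels below are made up for the example):

```python
# Toy example: compute the confusion matrix for made-up binary labels.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 0]  # actual labels (0 = negative, 1 = positive)
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]  # model's predictions

# With labels=[0, 1] the layout matches the table above: rows = actual, columns = predicted.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=3, FP=1, FN=1, TP=3
```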
## Key Metrics
### 1. Accuracy

- Fraction of **all** correct predictions.
- **Formula**: `(TP + TN) / Total`
- **Problem**: Not reliable when the classes are imbalanced (the Accuracy Paradox); see the sketch below.
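
A small sketch of the accuracy paradox, using made-up imbalanced data: a model that always predicts the majority class looks very accurate while finding no positives at all.

```python
# 95 negatives, 5 positives: always predicting "negative" scores 95% accuracy.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # degenerate model: always predicts the majority class

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)  # (TP + TN) / Total
print(accuracy)  # 0.95, even though every actual positive was missed
```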
### 2. Precision

- Out of all predicted positives, how many were actually positive?
- **Formula**: `TP / (TP + FP)`
- Higher is better: high precision means few false alarms (few FP).
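
A minimal sketch on the same made-up labels as in the confusion-matrix example above (scikit-learn assumed available as a cross-check):

```python
# Precision: of the 4 predicted positives, 3 were actually positive.
from sklearn.metrics import precision_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tp, fp = 3, 1                           # counts from the confusion matrix above
print(tp / (tp + fp))                   # 0.75, i.e. TP / (TP + FP)
print(precision_score(y_true, y_pred))  # 0.75, same value via scikit-learn
```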
### 3. Recall (Sensitivity / TPR)

- Out of all **actual** positives, how many did we find?
- **Formula**: `TP / (TP + FN)`
- Higher is better: high recall means few missed positives (few FN).
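
The same toy labels, now scored for recall (again just a sketch, with scikit-learn as a cross-check):

```python
# Recall: of the 4 actual positives, 3 were found.
from sklearn.metrics import recall_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tp, fn = 3, 1                        # counts from the confusion matrix above
print(tp / (tp + fn))                # 0.75, i.e. TP / (TP + FN)
print(recall_score(y_true, y_pred))  # 0.75 via scikit-learn
```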
### 4. Specificity

- Out of all **actual** negatives, how many did we correctly identify?
- **Formula**: `TN / (TN + FP)`
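
A sketch for specificity on the same toy labels. As far as I know scikit-learn has no dedicated specificity function, but it equals recall computed on the negative class:

```python
# Specificity: of the 4 actual negatives, 3 were correctly identified.
from sklearn.metrics import recall_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tn, fp = 3, 1                                     # counts from the confusion matrix above
print(tn / (tn + fp))                             # 0.75, i.e. TN / (TN + FP)
print(recall_score(y_true, y_pred, pos_label=0))  # 0.75, recall of the negative class
```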
### 5. F1 Score

- The harmonic mean of Precision and Recall.
- Good for balancing precision and recall, especially with uneven classes.
- **Formula**: `2 * (Precision * Recall) / (Precision + Recall)`
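
A sketch of the F1 score on the same toy labels; with precision and recall both at 0.75, the harmonic mean is also 0.75:

```python
# F1: harmonic mean of precision and recall.
from sklearn.metrics import f1_score

y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

precision, recall = 0.75, 0.75  # values from the sketches above
print(2 * (precision * recall) / (precision + recall))  # 0.75
print(f1_score(y_true, y_pred))                         # 0.75 via scikit-learn
```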
## ROC and AUC
### ROC Curve (Receiver Operating Characteristic)

- A plot of **TPR (Recall)** against **FPR (False Positive Rate)**, where `FPR = FP / (FP + TN)` (equivalently, 1 - Specificity).
- Shows how the model trades true positives against false positives as the decision threshold varies; see the sketch below.
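
A sketch of the points on an ROC curve. Note that `roc_curve` needs predicted scores (e.g. probabilities), not hard 0/1 labels; the scores below are invented for illustration.

```python
# Sweep thresholds over made-up scores and print the resulting (FPR, TPR) points.
from sklearn.metrics import roc_curve

y_true  = [0, 0, 0, 1, 1, 1, 1, 0]
y_score = [0.1, 0.6, 0.3, 0.8, 0.4, 0.9, 0.7, 0.2]  # illustrative predicted probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```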
### AUC (Area Under the Curve)

- Measures the entire area underneath the ROC curve.
- **Range**: 0 to 1, where 0.5 is no better than random guessing.
- **Interpretation**: Higher AUC means the model is better at distinguishing between classes, across all thresholds.
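
And the corresponding AUC on the same invented scores (a sketch; `roc_auc_score` is the usual scikit-learn helper):

```python
# AUC: probability that a random positive gets a higher score than a random negative.
from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 0, 1, 1, 1, 1, 0]
y_score = [0.1, 0.6, 0.3, 0.8, 0.4, 0.9, 0.7, 0.2]

print(roc_auc_score(y_true, y_score))  # 0.9375 for these toy scores
```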