# Model Evaluation Metrics

How do we know if our classification model is good? We use several metrics.

## Confusion Matrix

A table that compares **Predicted** values with **Actual** values.

| | Predicted Negative | Predicted Positive |
|---|---|---|
| **Actual Negative** | True Negative (TN) | False Positive (FP) |
| **Actual Positive** | False Negative (FN) | True Positive (TP) |

- **TP**: Correctly predicted positive.
- **TN**: Correctly predicted negative.
- **FP**: Incorrectly predicted positive (Type I Error).
- **FN**: Incorrectly predicted negative (Type II Error).

## Key Metrics

### 1. Accuracy
- Fraction of **all** correct predictions.
- **Formula**: `(TP + TN) / (TP + TN + FP + FN)`
- **Problem**: Not reliable if the data is imbalanced (Accuracy Paradox).

### 2. Precision
- Out of all predicted positives, how many were actually positive?
- **Formula**: `TP / (TP + FP)`
- Higher is better.

### 3. Recall (Sensitivity / TPR)
- Out of all **actual** positives, how many did we find?
- **Formula**: `TP / (TP + FN)`
- Higher is better.

### 4. Specificity
- Out of all **actual** negatives, how many did we correctly identify?
- **Formula**: `TN / (TN + FP)`

### 5. F1 Score
- The harmonic mean of Precision and Recall.
- Good for balancing precision and recall, especially with uneven classes.
- **Formula**: `2 * (Precision * Recall) / (Precision + Recall)`

A worked sketch computing all of these metrics appears at the end of this section.

## ROC and AUC

### ROC Curve (Receiver Operating Characteristic)
- A plot of **TPR (Recall)** vs. **FPR (False Positive Rate)**, where FPR = `FP / (FP + TN)` = 1 − Specificity.
- Shows how the model performs at different decision thresholds.

### AUC (Area Under the Curve)
- Measures the entire area underneath the ROC curve.
- **Range**: 0 to 1, where 0.5 is no better than random guessing.
- **Interpretation**: Higher AUC means the model is better at distinguishing between classes.
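
To make the formulas concrete, here is a minimal sketch in plain Python that tallies the four confusion-matrix counts from two lists of binary labels and then computes the metrics above. The names `y_true` and `y_pred` and the example labels are purely illustrative, not from any particular dataset.

```python
# Tally confusion-matrix counts from binary labels (1 = positive, 0 = negative).
# `y_true` holds the actual labels, `y_pred` the model's predictions (illustrative data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives (Type I)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives (Type II)

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)          # sensitivity / TPR
specificity = tn / (tn + fp)
f1          = 2 * precision * recall / (precision + recall)

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"Accuracy={accuracy:.2f}  Precision={precision:.2f}  "
      f"Recall={recall:.2f}  Specificity={specificity:.2f}  F1={f1:.2f}")
```

For these made-up labels the counts are TP=3, TN=4, FP=1, FN=2, giving accuracy 0.70, precision 0.75, recall 0.60, specificity 0.80, and F1 ≈ 0.67.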
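
For ROC and AUC, the points on the curve come from sweeping a decision threshold over the model's scores. Below is a short sketch assuming scikit-learn is installed; `y_score` is a made-up set of predicted probabilities for the positive class (the kind of output `predict_proba` would give).

```python
# Sketch: ROC points and AUC from actual labels and predicted probabilities.
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]                      # actual labels (illustrative)
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3, 0.2, 0.45]  # predicted P(positive) (illustrative)

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
auc = roc_auc_score(y_true, y_score)               # area under that curve

for f, t, thr in zip(fpr, tpr, thresholds):
    print(f"threshold={thr:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print(f"AUC = {auc:.3f}")
```

Each printed row corresponds to one threshold: lowering the threshold moves up and to the right along the curve, trading more false positives for more true positives, while AUC summarizes the whole trade-off in a single number.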