# Decision Tree Algorithm

A **Decision Tree** is a flowchart-like model for making decisions: it repeatedly splits the data into smaller groups based on simple rules.

## Structure

- **Root Node**: The starting point. It represents the entire dataset.
- **Decision Nodes**: Points where the data is split based on a question (e.g., "Is Petal Length < 2.45?").
- **Leaf Nodes (Terminal Nodes)**: The final output (class label) where no more splits happen (see the sketch after this list).
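
A minimal sketch of this structure on the Iris data. The petal-length question is from the example above; the second split and its threshold are illustrative, and the exact values depend on the fitted tree:

```
            [Root] Is Petal Length < 2.45?
             /                        \
           yes                         no
            |                           |
     (Leaf) setosa      [Decision] Is Petal Width < 1.75?
                              /                 \
                            yes                  no
                             |                    |
                    (Leaf) versicolor      (Leaf) virginica
```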

## How it Splits Data

At each split, the tree tries to make the resulting groups as "pure" as possible (ideally containing only one class).

### Splitting Criteria

1. **Gini Impurity** (Default):
   - Measures how mixed the classes are.
   - **0** = Pure (all same class).
   - **0.5** = Maximally impure for two classes (a 50/50 mix).
   - The tree tries to **minimize** Gini.

2. **Entropy**:
   - Measures disorder or randomness.
   - **0** = Pure.
   - **1** = Maximally disordered for two classes (a 50/50 mix).
   - The tree tries to **reduce** Entropy (maximize Information Gain).

3. **Information Gain**:
   - The difference in Entropy before and after a split.
   - We choose the split that gives the **highest** Information Gain (see the formulas and the worked sketch after this list).
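
For reference, the standard formulas, where $p_k$ is the proportion of class $k$ in a node, $n$ is the number of samples in the parent, and $n_c$ is the number of samples reaching child $c$:

$$
\text{Gini} = 1 - \sum_k p_k^2, \qquad \text{Entropy} = -\sum_k p_k \log_2 p_k
$$

$$
\text{Information Gain} = \text{Entropy}(\text{parent}) - \sum_{c \,\in\, \text{children}} \frac{n_c}{n}\,\text{Entropy}(c)
$$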
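A minimal sketch of these criteria in plain Python; the toy labels and the perfect split are illustrative, not from the notes:

```python
from collections import Counter
from math import log2

def gini(labels):
    # Gini impurity: 1 - sum(p_k^2) over the class proportions p_k.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    # Entropy: -sum(p_k * log2(p_k)) over the class proportions p_k.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    # Entropy before the split minus the size-weighted entropy after it.
    n = len(parent)
    after = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - after

# A 50/50 mix of two classes, separated perfectly by some rule.
parent = ["A"] * 4 + ["B"] * 4
left, right = ["A"] * 4, ["B"] * 4

print(gini(parent))                           # 0.5 -> maximally impure (2 classes)
print(entropy(parent))                        # 1.0 -> maximally disordered
print(information_gain(parent, left, right))  # 1.0 -> a perfect split
```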

## Parameters to Control the Tree

- **max_depth**: The maximum depth the tree can grow to (too deep risks overfitting).
- **min_samples_split**: Minimum samples needed to split a node.
- **max_features**: Number of features to consider for each split (a usage sketch follows this list).
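
A minimal usage sketch with scikit-learn's `DecisionTreeClassifier`, assuming scikit-learn is installed; the Iris data and the parameter values are illustrative, not tuned:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

tree = DecisionTreeClassifier(
    criterion="gini",      # the default; "entropy" is the other common choice
    max_depth=3,           # cap the depth to limit overfitting
    min_samples_split=4,   # a node needs at least 4 samples before it can split
    max_features=None,     # None = consider all features at each split
    random_state=42,
)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # mean accuracy on the held-out data
```

Lowering `max_depth` or raising `min_samples_split` are the usual first knobs to turn when the tree overfits.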