DMCT-NOTES/unit 3/03_FP_Growth_Algorithm.md

# FP-Growth Algorithm

**FP-Growth (Frequent Pattern Growth)** is a faster and more efficient algorithm than Apriori.

## Why is it better?
- **Apriori** scans the database many times (once for size 1, once for size 2, etc.). This is slow.
- **FP-Growth** scans the database **only twice**.
  1. First scan: Count frequencies.
  2. Second scan: Build the **FP-Tree**.

## The FP-Tree
An **FP-Tree** (Frequent Pattern Tree) is a compressed tree structure that stores all the transaction information.
- More frequent items are near the root (top).
- Less frequent items are leaves (bottom).

## Steps
1. **Count Frequencies**: Find support for all items. Drop infrequent ones.
2. **Sort**: Sort items in each transaction by frequency (Highest to Lowest).
3. **Build Tree**: Insert transactions into the tree. Shared items share the same path.
4. **Mine Tree**:
   - Start from the bottom (least frequent item).
   - Build a "Conditional Pattern Base" (all paths leading to that item).
   - Construct a "Conditional FP-Tree".
   - Recursively find frequent patterns.

## Divide and Conquer
FP-Growth uses a **Divide and Conquer** strategy. It breaks the problem into smaller sub-problems (conditional trees) rather than generating millions of candidate sets like Apriori.