Files
DMCT-NOTES/unit 3/03_FP_Growth_Algorithm.md
2025-11-24 16:55:19 +05:30

1.2 KiB

FP-Growth Algorithm

FP-Growth (Frequent Pattern Growth) is a faster and more efficient algorithm than Apriori.

Why is it better?

  • Apriori scans the database many times (once for size 1, once for size 2, etc.). This is slow.
  • FP-Growth scans the database only twice.
    1. First scan: Count frequencies.
    2. Second scan: Build the FP-Tree.

The FP-Tree

An FP-Tree (Frequent Pattern Tree) is a compressed tree structure that stores all the transaction information.

  • More frequent items are near the root (top).
  • Less frequent items are leaves (bottom).

Steps

  1. Count Frequencies: Find support for all items. Drop infrequent ones.
  2. Sort: Sort items in each transaction by frequency (Highest to Lowest).
  3. Build Tree: Insert transactions into the tree. Shared items share the same path.
  4. Mine Tree:
    • Start from the bottom (least frequent item).
    • Build a "Conditional Pattern Base" (all paths leading to that item).
    • Construct a "Conditional FP-Tree".
    • Recursively find frequent patterns.

Divide and Conquer

FP-Growth uses a Divide and Conquer strategy. It breaks the problem into smaller sub-problems (conditional trees) rather than generating millions of candidate sets like Apriori.