1.2 KiB
1.2 KiB
FP-Growth Algorithm
FP-Growth (Frequent Pattern Growth) is a faster and more efficient algorithm than Apriori.
Why is it better?
- Apriori scans the database many times (once for size 1, once for size 2, etc.). This is slow.
- FP-Growth scans the database only twice.
- First scan: Count frequencies.
- Second scan: Build the FP-Tree.
The FP-Tree
An FP-Tree (Frequent Pattern Tree) is a compressed tree structure that stores all the transaction information.
- More frequent items are near the root (top).
- Less frequent items are leaves (bottom).
Steps
- Count Frequencies: Find support for all items. Drop infrequent ones.
- Sort: Sort items in each transaction by frequency (Highest to Lowest).
- Build Tree: Insert transactions into the tree. Shared items share the same path.
- Mine Tree:
- Start from the bottom (least frequent item).
- Build a "Conditional Pattern Base" (all paths leading to that item).
- Construct a "Conditional FP-Tree".
- Recursively find frequent patterns.
Divide and Conquer
FP-Growth uses a Divide and Conquer strategy. It breaks the problem into smaller sub-problems (conditional trees) rather than generating millions of candidate sets like Apriori.