# Advanced Pattern Mining Beyond basic frequent itemsets, we have more efficient ways to represent patterns. ## 1. Closed and Maximal Patterns Finding *all* frequent itemsets can produce too many results. We can summarize them. ### Closed Frequent Itemset - An itemset is **Closed** if none of its supersets have the **same support**. - *Example*: - {Milk}: Support 4 - {Milk, Bread}: Support 4 - Here, {Milk} is NOT closed because adding Bread didn't change the support count. {Milk, Bread} captures the same information. - **Benefit**: Lossless compression (we don't lose any support info). ### Maximal Frequent Itemset - An itemset is **Maximal** if none of its supersets are **frequent**. - *Example*: - {Milk, Bread} is frequent. - {Milk, Bread, Diapers} is NOT frequent. - Then {Milk, Bread} is a Maximal Frequent Itemset. - **Benefit**: Smallest set of patterns, but we lose the exact support counts of subsets. ## 2. Vertical Data Format (Eclat Algorithm) - **Horizontal Format** (Standard): - T1: {A, B, C} - T2: {A, B} - **Vertical Format**: - Item A: {T1, T2} - Item B: {T1, T2} - Item C: {T1} - **How it works**: Instead of counting, we just intersect the Transaction ID lists. - Support(A, B) = Intersection({T1, T2}, {T1, T2}) = Size is 2. - **Benefit**: Very fast for calculating support.