# Data Discretization

**Data Discretization** is the process of converting a large number of continuous values into a small, finite number of intervals (bins).

## Why use it?

- Makes data easier to understand and summarize.
- Some algorithms, such as association rule mining, expect categorical attributes rather than continuous values.

## Techniques

### 1. Binning

- Sort the data and divide it into "bins".
- **Example**: Grouping ages into [0-10], [11-20], etc.
- Helps smooth out noise, for example by replacing each value with the mean or median of its bin.

### 2. Histogram Analysis

- Use a histogram (a bar chart of value counts per range) to see the distribution and decide where to split the data.

### 3. Cluster Analysis

- Use a clustering algorithm (like K-Means) to group similar values, then use those groups as the intervals.

## Concept Hierarchy

- Organizes data from **low-level** (specific) concepts to **high-level** (general) concepts.
- **Example (Location)**:
  - Street -> City -> State -> Country.
- **Top-down Mapping**: General to Specific.
- **Bottom-up Mapping**: Specific to General.
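
The binning and concept-hierarchy ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`equal_width_bins`, `smooth_by_bin_means`, `roll_up`), the sample ages, and the place names in `PARENT` are all hypothetical and chosen only for the example. It assumes equal-width bins, which is one of several ways to choose bin boundaries.

```python
from statistics import mean


def equal_width_bins(values, n_bins):
    """Return a bin index (0 .. n_bins - 1) for each value, using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into a single bin.
        return [0] * len(values)
    # The maximum value would land at index n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def smooth_by_bin_means(values, bin_ids):
    """Replace each value with the mean of its bin (one way to smooth out noise)."""
    groups = {}
    for v, b in zip(values, bin_ids):
        groups.setdefault(b, []).append(v)
    means = {b: mean(vs) for b, vs in groups.items()}
    return [means[b] for b in bin_ids]


# Hypothetical location hierarchy: each key maps to its parent one level up
# (Street -> City -> State -> Country).
PARENT = {
    "Main St": "Springfield",
    "Springfield": "Illinois",
    "Illinois": "USA",
}


def roll_up(value, levels=1):
    """Bottom-up mapping: climb `levels` steps from specific toward general."""
    for _ in range(levels):
        value = PARENT.get(value, value)  # stay put once the top level is reached
    return value


ages = [3, 8, 15, 19, 27, 40]          # made-up sample data
bin_ids = equal_width_bins(ages, 4)    # [0, 0, 1, 1, 2, 3]
smoothed = smooth_by_bin_means(ages, bin_ids)

print(bin_ids)
print(smoothed)
print(roll_up("Main St", levels=3))    # USA
```

Here the range 3 to 40 is cut into four intervals of width 9.25, so ages 3 and 8 share a bin and are both replaced by their mean when smoothing. Equal-width bins are simple but sensitive to outliers; equal-frequency bins (the same number of values per bin) are a common alternative.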