The Data Mining Process
How is data mining actually done? It usually follows a standard process, often similar to CRISP-DM (Cross-Industry Standard Process for Data Mining).
Steps in the Process
- Define the Goal: What do you want to achieve? (e.g., increase sales, detect fraud)
- Gather Data: Collect data from databases, logs, etc.
- Cleanse Data: Fix errors, remove duplicates, and handle missing values.
- Interrogate Data: Explore the data with charts, graphs, and summary statistics to find initial patterns.
- Build a Model: Apply algorithms (like decision trees or regression) to learn the patterns that address the goal.
- Validate Results: Check how accurate the model is on data it was not trained on.
- Implement: Put the model and its insights to use in real-world decisions.
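The steps above can be sketched end to end in a few lines of Python. This is only a minimal illustration under stated assumptions: the `(ad_spend, sales)` records are made up, missing values are handled by simply dropping the row, the model is a one-variable least-squares line, and the hold-out split takes the last two rows rather than a random sample. The function names are hypothetical.

```python
from statistics import mean

# Gather: hypothetical (ad_spend, sales) records; None marks a missing value.
raw = [
    (1.0, 3.1), (2.0, 4.9), (2.0, 4.9),   # the third row is an exact duplicate
    (3.0, None),                           # missing sales figure
    (4.0, 9.2), (5.0, 10.8), (6.0, 13.1), (7.0, 15.0),
]

def cleanse(rows):
    """Drop exact duplicates, then drop rows with a missing value."""
    unique = list(dict.fromkeys(rows))     # keeps the first occurrence, in order
    return [(x, y) for x, y in unique if x is not None and y is not None]

def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def mean_abs_error(model, rows):
    """Average absolute gap between predicted and actual values."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in rows)

# Cleanse
clean = cleanse(raw)

# Interrogate: a quick summary before modelling
print("rows:", len(clean), "mean sales:", round(mean(y for _, y in clean), 2))

# Build a model on the first rows, validate on the held-out rest
train, held_out = clean[:4], clean[4:]
model = fit_line(train)
error = mean_abs_error(model, held_out)
print("slope, intercept:", model, "hold-out error:", round(error, 3))
```

In practice the split would normally be random, and the right way to handle missing values depends on the data; dropping rows is just the simplest option.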
Data Mining Functionalities
Data mining tasks are generally divided into two types:
1. Descriptive Mining
- Describes what is in the data.
- Finds patterns and relationships.
- Examples: Clustering, Association Rules.
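As a small illustration of descriptive mining, the sketch below measures one association rule, "bread -> milk", over a handful of shopping baskets. The transactions are invented for the example and `rule_stats` is a hypothetical helper name. Support is the share of all baskets that contain both items; confidence is the share of baskets with the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

# Count how often each item and each pair of items appears.
item_counts = Counter(item for t in transactions for item in t)
pair_counts = Counter(
    pair for t in transactions for pair in combinations(sorted(t), 2)
)

def rule_stats(antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    pair = tuple(sorted((antecedent, consequent)))
    both = pair_counts[pair]
    support = both / len(transactions)
    confidence = both / item_counts[antecedent]
    return support, confidence

support, confidence = rule_stats("bread", "milk")
print(f"bread -> milk: support={support:.2f}, confidence={confidence:.2f}")
```

Nothing is predicted here; the numbers only describe relationships already present in the data.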
2. Predictive Mining
- Predicts future or unknown values.
- Examples: Classification (predicting a category), Regression (predicting a numeric value).
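To make predictive mining concrete, here is a minimal classification sketch using a 1-nearest-neighbour rule: a new record gets the label of the most similar known record. The labelled transactions, the two features (amount and hour of day), and the `predict` function are all assumptions made for illustration, and the features are left unscaled to keep the example short.

```python
import math

# Hypothetical labelled examples: (amount, hour_of_day) -> label
training = [
    ((20.0, 14.0), "normal"),
    ((35.0, 10.0), "normal"),
    ((15.0, 18.0), "normal"),
    ((900.0, 3.0), "fraud"),
    ((750.0, 2.0), "fraud"),
]

def predict(point):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, label = min(training, key=lambda example: math.dist(example[0], point))
    return label

print(predict((800.0, 4.0)))   # a large payment in the middle of the night
print(predict((25.0, 12.0)))   # a small payment at midday
```

Because the amount spans a much larger range than the hour, it dominates the distance; a real classifier would usually rescale the features first.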