DMCT-NOTES/unit 1/02_Data_Mining_Process.md
2025-11-24 16:55:19 +05:30


The Data Mining Process

How do we actually do data mining? It usually follows a standard, iterative process. The steps below closely resemble CRISP-DM (Cross-Industry Standard Process for Data Mining).

Steps in the Process

  1. Define the Goal: Decide what you want to achieve (e.g., increase sales, detect fraud).
  2. Gather Data: Collect data from databases, logs, and other sources.
  3. Cleanse Data: Fix errors, remove duplicates, and handle missing values.
  4. Interrogate Data: Explore the data with summary statistics, charts, and graphs to find initial patterns.
  5. Build a Model: Apply algorithms (such as decision trees or regression) to learn patterns that address the goal.
  6. Validate Results: Check that the model is accurate on data it was not trained on.
  7. Implement: Put the model or its insights to use in the real world.
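
Steps 2 to 6 can be sketched in a few lines of plain Python. This is only a toy illustration, not a standard implementation. The transaction records, the function names (cleanse, explore, build_model, validate), and the one-rule "model" are all assumptions made up for this example.

```python
# Toy sketch of steps 2-6. All data and names here are hypothetical.
from statistics import mean

# Step 2: gather data (hard-coded here; normally a database or log file).
raw = [
    {"amount": 20.0, "fraud": 0},
    {"amount": 25.0, "fraud": 0},
    {"amount": 25.0, "fraud": 0},    # exact duplicate
    {"amount": None, "fraud": 0},    # missing value
    {"amount": 30.0, "fraud": 0},
    {"amount": 900.0, "fraud": 1},
    {"amount": 950.0, "fraud": 1},
    {"amount": 40.0, "fraud": 0},
    {"amount": 1000.0, "fraud": 1},
]

def cleanse(rows):
    """Step 3: drop rows with missing values and exact duplicates."""
    seen, clean = set(), []
    for row in rows:
        key = (row["amount"], row["fraud"])
        if row["amount"] is None or key in seen:
            continue
        seen.add(key)
        clean.append(row)
    return clean

def explore(rows):
    """Step 4: compare the average amount for each class."""
    return {
        label: mean(r["amount"] for r in rows if r["fraud"] == label)
        for label in (0, 1)
    }

def build_model(rows):
    """Step 5: one-rule model, a threshold halfway between the class means."""
    means = explore(rows)
    return (means[0] + means[1]) / 2

def validate(threshold, rows):
    """Step 6: fraction of rows the threshold rule classifies correctly."""
    hits = sum((r["amount"] > threshold) == bool(r["fraud"]) for r in rows)
    return hits / len(rows)

clean = cleanse(raw)
train, test = clean[:5], clean[5:]   # hold out the last rows for validation
threshold = build_model(train)
accuracy = validate(threshold, test)
print(len(clean), threshold, accuracy)
```

Keeping the validation rows separate from the training rows is the key design choice: the accuracy is measured on records the rule never saw.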

Data Mining Functionalities

Data mining tasks are generally divided into two types:

1. Descriptive Mining

  • Describes what is in the data.
  • Finds patterns and relationships.
  • Examples: Clustering, Association Rules.
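
As a small illustration of descriptive mining, the sketch below computes support and confidence, the two usual measures behind association rules, for pairs of items. The shopping baskets and the helper names (support, confidence) are invented for this example.

```python
# Toy association-rule measures. The baskets are made-up example data.
from collections import Counter
from itertools import combinations

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

item_counts = Counter()   # number of baskets containing each item
pair_counts = Counter()   # number of baskets containing each item pair
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(a, b):
    """For the rule a -> b: fraction of baskets with a that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]

print(support("bread", "milk"))       # both appear together in 2 of 4 baskets
print(confidence("butter", "bread"))  # every basket with butter also has bread
```

Nothing is predicted here. The code only summarises relationships already present in the data, which is what makes the task descriptive.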

2. Predictive Mining

  • Predicts future or unknown values from known data.
  • Examples: Classification (predicting a category), Regression (predicting a numeric value), Prediction.
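
The two main predictive tasks can be contrasted with a minimal sketch: a least-squares line for regression and a 1-nearest-neighbour rule for classification. Both are simple stand-ins chosen for brevity. The data points and the function names (fit_line, nearest_label) are assumptions for this example.

```python
# Toy regression and classification. All data points are hypothetical.
from statistics import mean

def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar

def nearest_label(x, labelled):
    """Classification: return the label of the closest known point (1-NN)."""
    return min(labelled, key=lambda pair: abs(pair[0] - x))[1]

# Regression predicts a number for an unseen input.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept

# Classification predicts a category for an unseen input.
label = nearest_label(8.5, [(1, "low"), (2, "low"), (9, "high"), (10, "high")])
print(predicted, label)
```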