Standard Process for Data Science (CRISP-DM)
CRISP-DM stands for Cross-Industry Standard Process for Data Mining. It is a standard, industry-independent way to structure data mining projects.
It has six phases:
1. Business Understanding
Goal: Define what problem we are trying to solve.
- Example: An online retailer wants to classify items as "High Demand" or "Low Demand".
- Questions: Is item type related to demand? Can we predict demand accurately?
2. Data Understanding
Goal: Get to know the data.
- Example: Looking at the inventory data (orders, item type).
- Insight: Knowing whether items are perishable (like milk) or non-perishable helps us understand stock needs.
3. Data Preparation
Goal: Clean and format the data for the model.
- Steps:
- Handle missing values.
- Convert categories to numbers (dummy encoding).
- Check for connections (correlation) between variables.
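The three preparation steps above can be sketched in plain Python. This is a minimal illustration, not a prescribed method: the inventory rows, the column names (`item_type`, `orders`), and the choice of median filling are all assumptions made for the example.

```python
from statistics import median

# Hypothetical inventory rows; None marks a missing value.
rows = [
    {"item_type": "perishable", "orders": 120},
    {"item_type": "non_perishable", "orders": None},
    {"item_type": "perishable", "orders": 80},
    {"item_type": "non_perishable", "orders": 40},
]

# 1. Handle missing values: fill missing order counts with the median
#    of the known values (one common choice among several).
known = [r["orders"] for r in rows if r["orders"] is not None]
fill_value = median(known)
for r in rows:
    if r["orders"] is None:
        r["orders"] = fill_value

# 2. Dummy encoding: one 0/1 column per category, dropping the first
#    category so the columns are not redundant.
categories = sorted({r["item_type"] for r in rows})
for r in rows:
    for c in categories[1:]:
        r[f"item_type_{c}"] = 1 if r["item_type"] == c else 0

# 3. Correlation: Pearson coefficient between two numeric columns.
def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

orders = [r["orders"] for r in rows]
is_perishable = [r["item_type_perishable"] for r in rows]
r_value = pearson(orders, is_perishable)
print(f"correlation(orders, perishable) = {r_value:.3f}")
```

A value near +1 or -1 suggests a strong linear relationship between the two columns; a value near 0 suggests little linear relationship.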
4. Modeling
Goal: Build the machine learning model.
- We try to find a function that connects inputs (like number of orders) to the output (demand).
- We might try different models to find the best one.
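As a rough sketch of "trying different models", the example below fits two very simple candidates and keeps the one that scores better. The training data, the function names (`fit_majority`, `fit_threshold`), and the two model types are hypothetical choices for illustration. For brevity the comparison uses the training data itself; a real project would compare on separate validation data.

```python
# Hypothetical training data: (number of orders, demand label).
train = [
    (5, "Low"), (10, "Low"), (25, "Low"), (40, "Low"),
    (60, "High"), (85, "High"), (120, "High"),
]

def accuracy(model, data):
    """Fraction of examples where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in data) / len(data)

def fit_majority(data):
    """Baseline model: always predict the most frequent label."""
    labels = [y for _, y in data]
    most_common = max(sorted(set(labels)), key=labels.count)
    return lambda orders: most_common

def fit_threshold(data):
    """Pick the order-count cutoff that best separates High from Low."""
    values = sorted(x for x, _ in data)
    # Candidate cutoffs: midpoints between consecutive order counts.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_t = max(
        candidates,
        key=lambda t: accuracy(lambda x: "High" if x > t else "Low", data),
    )
    return lambda orders: "High" if orders > best_t else "Low"

# Fit each candidate model, score it, and keep the best one.
models = {"majority": fit_majority(train), "threshold": fit_threshold(train)}
scores = {name: accuracy(model, train) for name, model in models.items()}
best_name = max(scores, key=scores.get)
print(scores, "-> best:", best_name)
```

Each fitted model is just a function from an input (number of orders) to an output (demand label), which is the "function that connects inputs to the output" described above.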
5. Evaluation
Goal: Check how good the model is.
- We test the model on unseen data (data that was not used to train it).
- We compare the predicted values with the actual values.
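Comparing predicted and actual values usually comes down to a few counts. The sketch below assumes a small, made-up set of held-out labels and predictions and computes accuracy, a confusion table, and precision and recall for the "High" class.

```python
from collections import Counter

# Hypothetical held-out labels and the model's predictions for them.
actual    = ["High", "Low", "High", "Low", "Low",  "High"]
predicted = ["High", "Low", "Low",  "Low", "High", "High"]

# Accuracy: share of predictions that match the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
acc = correct / len(actual)

# Confusion counts keyed by (actual, predicted).
confusion = Counter(zip(actual, predicted))

# Precision and recall for the "High" class.
true_high = confusion[("High", "High")]
precision_high = true_high / sum(p == "High" for p in predicted)
recall_high = true_high / sum(a == "High" for a in actual)

print(f"accuracy  = {acc:.2f}")
print(f"precision = {precision_high:.2f}, recall = {recall_high:.2f}")
for (a, p), count in sorted(confusion.items()):
    print(f"actual={a:<4} predicted={p:<4} count={count}")
```

Accuracy alone can be misleading when one class is much more common than the other, which is why the per-class counts are worth inspecting as well.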
6. Deployment
Goal: Use the model in the real world.
- If the model performs well enough in evaluation, we put it to work.
- Example: Create an app where the retailer enters item details and gets a demand prediction.
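The core of such an app can be a single function that validates the entered item details and returns a prediction. The sketch below is only illustrative: `predict_demand`, the `name` and `orders` fields, and the fixed cutoff of 50 orders stand in for whatever trained model and input schema a real project would use.

```python
def predict_demand(item_details, threshold=50):
    """Return a demand prediction for one item.

    `threshold` is a hypothetical stand-in for a trained model:
    items with more orders than the cutoff are labeled "High Demand".
    """
    orders = item_details.get("orders")
    if not isinstance(orders, (int, float)) or orders < 0:
        raise ValueError("'orders' must be a non-negative number")
    label = "High Demand" if orders > threshold else "Low Demand"
    return {"item": item_details.get("name", "unknown"), "prediction": label}

# Example of what the retailer-facing app might pass in.
result = predict_demand({"name": "milk", "item_type": "perishable", "orders": 90})
print(result)
```

Validating the input before predicting keeps bad entries (a missing or negative order count) from silently producing a misleading answer.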