DMCT-NOTES/unit 2/02_Data_Science_Process.md
Akshat Mehta 8f8e35ae95 unit 2 added
2025-11-24 15:26:41 +05:30


# Standard Process for Data Science (CRISP-DM)
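**CRISP-DM** stands for **Cr**oss **I**ndustry **S**tandard **P**rocess for **D**ata **M**ining. It is a widely used, industry-neutral framework for organizing data mining projects.
It has **6 phases**. In practice, a project often loops back to an earlier phase as more is learned: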
## 1. Business Understanding
**Goal**: Define what problem we are trying to solve.
- **Example**: An online retailer wants to classify items as "High Demand" or "Low Demand".
- **Questions**: Is item type related to demand? Can we predict demand accurately?
## 2. Data Understanding
**Goal**: Explore the available data to learn what it contains and how reliable it is.
- **Example**: Looking at the inventory data (orders, item type).
- **Insight**: Knowing if items are perishable (like milk) or non-perishable helps understand stock needs.
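A first look at the data can be as simple as counting and averaging. The sketch below is only an illustration: the `inventory` records and their field names (`item_type`, `orders`) are made-up assumptions, not a real dataset from the retailer.

```python
from collections import Counter, defaultdict

# Hypothetical inventory records; the field names and values are assumptions.
inventory = [
    {"item": "milk", "item_type": "perishable", "orders": 120},
    {"item": "bread", "item_type": "perishable", "orders": 95},
    {"item": "yogurt", "item_type": "perishable", "orders": 80},
    {"item": "rice", "item_type": "non-perishable", "orders": 40},
    {"item": "soap", "item_type": "non-perishable", "orders": 30},
]

# How many items of each type do we have?
type_counts = Counter(row["item_type"] for row in inventory)

# Average number of orders for each item type.
orders_by_type = defaultdict(list)
for row in inventory:
    orders_by_type[row["item_type"]].append(row["orders"])
avg_orders = {t: sum(v) / len(v) for t, v in orders_by_type.items()}

print(type_counts)
print(avg_orders)
```

In this toy data, perishable items are ordered more often on average, which is the kind of pattern worth checking on the real data before modeling.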
## 3. Data Preparation
**Goal**: Clean and format the data for the model.
- **Steps**:
  - Handle missing values.
  - Convert categories to numbers (dummy encoding).
  - Check for connections (correlation) between variables.
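The three steps above can be sketched in plain Python. This is a minimal illustration under assumptions: the `rows` data is invented, missing values are filled with the mean (one common choice among several), and the `pearson` helper is written here for the example rather than taken from a library.

```python
import math

# Hypothetical raw rows; None marks a missing value.
rows = [
    {"item_type": "perishable", "orders": 120, "high_demand": 1},
    {"item_type": "perishable", "orders": None, "high_demand": 1},
    {"item_type": "non-perishable", "orders": 40, "high_demand": 0},
    {"item_type": "non-perishable", "orders": 20, "high_demand": 0},
]

# 1. Handle missing values: fill missing orders with the mean of the known ones.
known = [r["orders"] for r in rows if r["orders"] is not None]
mean_orders = sum(known) / len(known)
for r in rows:
    if r["orders"] is None:
        r["orders"] = mean_orders

# 2. Dummy encoding: a category with two values needs a single 0/1 column.
for r in rows:
    r["is_perishable"] = 1 if r["item_type"] == "perishable" else 0

# 3. Correlation: Pearson coefficient between two numeric columns.
def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

corr = pearson([r["orders"] for r in rows], [r["high_demand"] for r in rows])
print(round(corr, 3))
```

A coefficient close to 1 or -1 suggests a strong linear relationship; a value near 0 suggests little linear relationship.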
## 4. Modeling
**Goal**: Build the machine learning model.
- The model learns a function that maps inputs (such as number of orders) to the output (demand).
- We might try different models to find the best one.
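As a rough sketch of "trying different models", the example below compares two very simple candidates: a baseline that always predicts the most common label, and a threshold rule on the number of orders. The training data, the two model types, and all names are assumptions chosen for illustration; a real project would likely use a machine learning library.

```python
# Hypothetical training data: (number of orders, label), where 1 = "High Demand".
train = [(120, 1), (95, 1), (80, 1), (55, 0), (40, 0), (30, 0), (20, 0)]

def accuracy(model, data):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in data) / len(data)

# Candidate 1: a baseline that always predicts the most common label.
labels = [y for _, y in train]
majority_label = max(set(labels), key=labels.count)

def baseline_model(orders):
    return majority_label

# Candidate 2: a threshold rule, "High Demand" if orders >= threshold.
def make_threshold_model(threshold):
    return lambda orders: 1 if orders >= threshold else 0

# "Training" here means trying each observed order count as the threshold
# and keeping the one that fits the training data best.
thresholds = sorted({x for x, _ in train})
best_threshold = max(
    thresholds, key=lambda t: accuracy(make_threshold_model(t), train)
)
threshold_model = make_threshold_model(best_threshold)

# Compare the candidates and pick the better one.
scores = {
    "baseline": accuracy(baseline_model, train),
    "threshold": accuracy(threshold_model, train),
}
best_name = max(scores, key=scores.get)
print(best_threshold, scores, best_name)
```

The scores here are measured on the training data only to keep the sketch short; a fair comparison uses data the models have not seen, which is the job of the Evaluation phase.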
## 5. Evaluation
**Goal**: Check how good the model is.
- We test the model on **unseen data**, that is, data that was held out and not used for training.
- We compare the **predicted** values with the **actual** values.
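Comparing predicted and actual values usually comes down to counting matches and mismatches. The sketch below assumes an already-trained rule (`model`, with a made-up threshold of 75 orders) and a small invented held-out set; both are placeholders for illustration.

```python
# Hypothetical trained model: "High Demand" (1) if orders >= 75.
def model(orders):
    return 1 if orders >= 75 else 0

# Hypothetical held-out rows the model never saw: (orders, actual label).
test_set = [(130, 1), (90, 1), (70, 1), (85, 0), (50, 0), (25, 0)]

actual = [y for _, y in test_set]
predicted = [model(x) for x, _ in test_set]
pairs = list(zip(predicted, actual))

# Confusion-matrix counts for the "High Demand" class.
tp = sum(p == 1 and a == 1 for p, a in pairs)  # correctly predicted high
tn = sum(p == 0 and a == 0 for p, a in pairs)  # correctly predicted low
fp = sum(p == 1 and a == 0 for p, a in pairs)  # predicted high, actually low
fn = sum(p == 0 and a == 1 for p, a in pairs)  # predicted low, actually high

accuracy = (tp + tn) / len(pairs)
print(predicted, actual)
print(tp, tn, fp, fn, round(accuracy, 3))
```

Accuracy is the simplest summary. The separate counts show which kind of mistake the model makes, which matters when one kind of error is more costly for the retailer than the other.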
## 6. Deployment
**Goal**: Use the model in the real world.
- If the model performs well enough in evaluation, we put it to work.
- **Example**: Create an app where the retailer enters item details and gets a demand prediction.
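The core of such an app is a function that takes item details and returns a prediction. The sketch below is an assumption-heavy illustration: the `Item` fields, the `ORDER_THRESHOLD` value, and the `predict_demand` name are all hypothetical, and the threshold rule stands in for whatever model was actually trained.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """Item details the retailer would enter (hypothetical fields)."""
    name: str
    item_type: str
    orders: int

# Hypothetical threshold learned during the Modeling phase.
ORDER_THRESHOLD = 75

def predict_demand(item: Item) -> str:
    """Return "High Demand" or "Low Demand" for one item."""
    # Validate input before predicting, since real users make typing mistakes.
    if item.orders < 0:
        raise ValueError("orders must be non-negative")
    return "High Demand" if item.orders >= ORDER_THRESHOLD else "Low Demand"

print(predict_demand(Item("milk", "perishable", 120)))
print(predict_demand(Item("soap", "non-perishable", 30)))
```

A web form or command-line tool would wrap this function: it collects the item details, calls `predict_demand`, and shows the result.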