addition of unit 1 3 4 5

Akshat Mehta
2025-11-24 16:55:19 +05:30
parent 8f8e35ae95
commit f8aea15aaa
24 changed files with 596 additions and 0 deletions


@@ -0,0 +1,27 @@
# Introduction to Data Mining
## What is Data Mining?
**Data Mining** is the process of digging through large amounts of raw data to find useful patterns, trends, and knowledge.
- **Analogy**: Like mining gold from rocks. The rocks are the "raw data," and the gold is the "knowledge."
### Key Definitions
- **Data**: Raw facts and figures (e.g., sales logs, sensor readings).
- **Mining**: Extracting something valuable.
## The DIKW Pyramid
The **DIKW** model shows how we move from raw data to wisdom.
1. **Data (D)**: Raw, unprocessed facts.
   - *Example*: Numbers like 42, 35, 50.
2. **Information (I)**: Data that is organized and has meaning.
   - *Example*: "These are the ages of employees."
3. **Knowledge (K)**: Understanding gained from analysis.
   - *Example*: "The team has a mix of young and experienced people."
4. **Wisdom (W)**: Applying knowledge to make good decisions.
   - *Example*: "Let's create a mentorship program to share skills."
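
The four levels above can be traced in a few lines of Python. This is only a rough sketch of the idea, not a standard algorithm: the function names `summarize` and `recommend` and the age cut-off of 40 are assumptions made for illustration, while the numbers 42, 35, 50 come from the example above.

```python
from statistics import mean

# Data: raw numbers with no context (values from the example above).
raw = [42, 35, 50]

# Information: the same numbers, now labelled with what they mean.
information = {"employee_ages": raw}


def summarize(ages, threshold=40):
    """Knowledge: analyse the ages. The threshold of 40 is an assumed cut-off."""
    younger = [a for a in ages if a < threshold]
    experienced = [a for a in ages if a >= threshold]
    return {
        "mean_age": mean(ages),
        "younger": len(younger),
        "experienced": len(experienced),
    }


def recommend(summary):
    """Wisdom: turn the finding into a decision (hypothetical rule)."""
    if summary["younger"] > 0 and summary["experienced"] > 0:
        return "Create a mentorship program to share skills."
    return "No mixed-experience action suggested."


summary = summarize(information["employee_ages"])
print(summary)
print(recommend(summary))
```

Each step adds context to the previous one: the list alone says nothing, the label says what it measures, the summary says what is true of the team, and the recommendation says what to do about it.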
## Major Issues in Data Mining
1. **Privacy and Security**: Mining can reveal sensitive personal information, so that data must be protected.
2. **Scalability**: The system must be able to handle very large volumes of data (Big Data).
3. **Data Quality**: If the data is dirty or has missing values, the results will be unreliable ("Garbage In, Garbage Out").
4. **Ethical Use**: Ensuring that data and mining results are not used to discriminate or to reinforce bias.
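
The "Garbage In, Garbage Out" point under Data Quality can be illustrated with a small sketch. The dirty values (`None`, `-1`, `420`), the helper name `clean`, and the plausible age range of 0 to 120 are all assumptions made up for this illustration.

```python
from statistics import mean


def clean(values, low=0, high=120):
    """Drop missing entries and values outside an assumed plausible age range."""
    return [v for v in values if v is not None and low < v < high]


# Hypothetical age column with a missing value and two entry errors.
dirty_ages = [42, 35, None, 50, -1, 420]

# Naive analysis: only skips the missing value, keeps the bad entries.
naive_mean = mean([v for v in dirty_ages if v is not None])

# Analysis after cleaning: bad entries are removed first.
clean_mean = mean(clean(dirty_ages))

print("naive mean:", naive_mean)
print("clean mean:", clean_mean)
```

The naive average is pulled far from any real employee's age by just two bad entries, which is why cleaning normally comes before mining.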