addition of unit 1 3 4 5
unit 1/01_Introduction_to_Data_Mining.md
@@ -0,0 +1,27 @@
# Introduction to Data Mining

## What is Data Mining?
**Data Mining** is the process of digging through large amounts of raw data to find useful patterns, trends, and knowledge.
- **Analogy**: Like mining gold from rocks. The rocks are the "raw data," and the gold is the "knowledge."

### Key Definitions
- **Data**: Raw facts and figures (e.g., sales logs, sensor readings).
- **Mining**: Extracting something valuable from a much larger mass of material (here, patterns from data).
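
To make the two definitions concrete, here is a minimal sketch of "mining" a pattern out of a sales log. The log contents and the helper name `count_item_pairs` are made up for illustration. Counting item pairs is only one simple way to look for a pattern, not a full data mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales log (the raw data): each entry is the set of items in one purchase.
sales_log = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]


def count_item_pairs(transactions):
    """Count how often each pair of items is bought together."""
    pair_counts = Counter()
    for items in transactions:
        # Sort so that (bread, milk) and (milk, bread) count as the same pair.
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    return pair_counts


# The "gold": the pair of items that appears together most often.
pair_counts = count_item_pairs(sales_log)
most_common_pair, times_seen = pair_counts.most_common(1)[0]
print(most_common_pair, times_seen)
```

In this toy log, bread and milk are bought together more often than any other pair. That is the kind of pattern a shop might act on.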

## The DIKW Pyramid
The **DIKW** (Data, Information, Knowledge, Wisdom) model shows how we move from raw data to wisdom.

1. **Data (D)**: Raw, unprocessed facts.
   - *Example*: Numbers like 42, 35, 50.
2. **Information (I)**: Data that is organized and has meaning.
   - *Example*: "These are the ages of employees."
3. **Knowledge (K)**: Understanding gained from analysis.
   - *Example*: "The team has a mix of young and experienced people."
4. **Wisdom (W)**: Applying knowledge to make good decisions.
   - *Example*: "Let's create a mentorship program to share skills."
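
The four levels can be walked through in a few lines of code, using the ages from the example above. This is only a sketch: the dictionary keys and the 10-year `AGE_GAP_THRESHOLD` are assumptions chosen for illustration, not part of the DIKW model itself.

```python
from statistics import mean

# Data: raw, unlabeled numbers.
data = [42, 35, 50]

# Information: the same numbers with meaning attached.
information = {"employee_ages": data}

# Knowledge: a summary obtained by analyzing the information.
ages = information["employee_ages"]
knowledge = {
    "youngest": min(ages),
    "oldest": max(ages),
    "average_age": round(mean(ages), 1),
}

# Wisdom: a decision based on that knowledge.
# The 10-year threshold is an arbitrary assumption for this sketch.
AGE_GAP_THRESHOLD = 10
age_gap = knowledge["oldest"] - knowledge["youngest"]
decision = (
    "start a mentorship program"
    if age_gap >= AGE_GAP_THRESHOLD
    else "no action needed"
)
print(knowledge, decision)
```

Each step adds something the previous one lacked: a label, then a summary, then a choice. In practice the last step is usually made by people rather than by code.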

## Major Issues in Data Mining
1. **Privacy and Security**: Mining can reveal sensitive personal information, so that data must be protected.
2. **Scalability**: Can the system handle huge amounts of data (Big Data)?
3. **Data Quality**: If the data is dirty or incomplete, the results will be unreliable ("Garbage In, Garbage Out").
4. **Ethical Use**: Ensuring that data and the patterns mined from it are not used to discriminate or to reinforce bias.
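
The "Garbage In, Garbage Out" point can be seen with a small sketch. The sensor readings and the `-999.0` error code are hypothetical values chosen for illustration. Real datasets mark bad or missing values in many different ways.

```python
from statistics import mean

# Hypothetical sensor readings in degrees Celsius.
# None marks a missing reading; -999.0 is an assumed error code from the sensor.
ERROR_CODE = -999.0
raw_readings = [21.5, 22.0, None, ERROR_CODE, 21.0]

# Naive approach: drop only the missing value and average the rest.
naive_average = mean(r for r in raw_readings if r is not None)

# Cleaned approach: drop missing values and known error codes before averaging.
clean_readings = [r for r in raw_readings if r is not None and r != ERROR_CODE]
clean_average = mean(clean_readings)

print(f"naive: {naive_average:.1f}, clean: {clean_average:.1f}")
```

A single dirty value pulls the naive average far below any temperature the sensor actually measured, while the cleaned average stays close to the valid readings.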