SlideShare a Scribd company logo
1 of 52
Download to read offline
UNIT-2 Data Preprocessing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Lecture-13 Why preprocess the data?
Lecture-13 Why Data Preprocessing? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-13 Why Data Preprocessing?
Multi-Dimensional Measure of Data Quality ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-13 Why Data Preprocessing?
Major Tasks in Data Preprocessing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-13 Why Data Preprocessing?
Forms of data preprocessing   Lecture-13 Why Data Preprocessing?
[object Object],[object Object]
Data Cleaning ,[object Object],[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
Missing Data ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
How to Handle Missing Data? ,[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
How to Handle Missing Data? ,[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
Noisy Data ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
How to Handle Noisy Data? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
Simple Discretization Methods: Binning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
Binning Methods for Data Smoothing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-14 - Data cleaning
Cluster Analysis Lecture-14 - Data cleaning
Regression x y y = x + 1 X1 Y1 Y1’ Lecture-14 - Data cleaning
[object Object],[object Object]
Data Integration ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-15 - Data integration and transformation
Handling Redundant Data in Data Integration ,[object Object],[object Object],[object Object],Lecture-15 - Data integration and transformation
Handling Redundant Data in Data Integration ,[object Object],[object Object],Lecture-15 - Data integration and transformation
Data Transformation ,[object Object],[object Object],[object Object],Lecture-15 - Data integration and transformation
Data Transformation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-15 - Data integration and transformation
Data Transformation: Normalization ,[object Object],[object Object],[object Object],Where  j  is the smallest integer such that Max(|  |)<1 Lecture-15 - Data integration and transformation
[object Object],[object Object]
Data Reduction  ,[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Data Reduction Strategies ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Data Cube Aggregation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Dimensionality Reduction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Wavelet Transforms  ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction Haar2 Daubechie4
[object Object],[object Object],[object Object],[object Object],[object Object],Principal Component Analysis   Lecture-16 - Data reduction
X1 X2 Y1 Y2 Principal Component Analysis Lecture-16 - Data reduction
Attribute subset selection ,[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Heuristic  Selection Methods ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Example of Decision Tree Induction Initial attribute set: {A1, A2, A3, A4, A5, A6} A4 ? A1? A6? Class 1 Class 2 Class 1 Class 2 Reduced attribute set:  {A1, A4, A6} Lecture-16 - Data reduction >
Numerosity Reduction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Regression and Log-Linear Models ,[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Regress Analysis and Log-Linear Models Lecture-16 - Data reduction
Histograms ,[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Clustering ,[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Sampling ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-16 - Data reduction
Sampling SRSWOR (simple random sample without  replacement) SRSWR Lecture-16 - Data reduction Raw Data
Sampling Raw Data  Cluster/Stratified Sample Lecture-16 - Data reduction
[object Object],[object Object]
Discretization ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-17 - Discretization and concept hierarchy generation
Discretization and Concept hierachy ,[object Object],[object Object],[object Object],[object Object],Lecture-17 - Discretization and concept hierarchy generation
Discretization and concept hierarchy generation for numeric data ,[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-17 - Discretization and concept hierarchy generation
Entropy-Based Discretization ,[object Object],[object Object],[object Object],[object Object],Lecture-17 - Discretization and concept hierarchy generation
Discretization by intuitive partitioning ,[object Object],[object Object],[object Object],[object Object],[object Object],Lecture-17 - Discretization and concept hierarchy generation
Example of 3-4-5 rule (-$4000 -$5,000) Step 4: Lecture-17 - Discretization and concept hierarchy generation (-$400 - 0) (-$400 - -$300) (-$300 -  -$200) (-$200 - -$100) (-$100 - 0) (0 - $1,000) (0 -  $200) ($200 - $400) ($400 - $600) ($600 - $800) ($800 - $1,000) ($2,000 - $5, 000) ($2,000 - $3,000) ($3,000 - $4,000) ($4,000 - $5,000) ($1,000 - $2, 000) ($1,000 - $1,200) ($1,200 - $1,400) ($1,400 - $1,600) ($1,600 - $1,800) ($1,800 - $2,000) msd=1,000 Low=-$1,000 High=$2,000 Step 2: Step 1: -$351 -$159 profit   $1,838   $4,700 Min  Low (i.e, 5%-tile)   High(i.e, 95%-0 tile)  Max count (-$1,000  - $2,000) (-$1,000 - 0) (0 -$ 1,000) Step 3: ($1,000 - $2,000)
Concept hierarchy generation for categorical data ,[object Object],[object Object],[object Object],[object Object],Lecture-17 - Discretization and concept hierarchy generation
Specification of a set of attributes ,[object Object],country province_or_ state city street 15 distinct values 65 distinct values 3567 distinct values 674,339 distinct values Lecture-17 - Discretization and concept hierarchy generation

More Related Content

What's hot

Data preprocessing
Data preprocessingData preprocessing
Data preprocessingksamyMCA
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingHarry Potter
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingHarry Potter
 
03 preprocessing
03 preprocessing03 preprocessing
03 preprocessingpurnimatm
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data MiningR A Akerkar
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data MiningDHIVYADEVAKI
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning Gopal Sakarkar
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingSlideshare
 
Data pre processing
Data pre processingData pre processing
Data pre processingjunnubabu
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingAmuthamca
 
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Data Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessingData Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessing
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessingSalah Amean
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
1.6.data preprocessing
1.6.data preprocessing1.6.data preprocessing
1.6.data preprocessingKrish_ver2
 

What's hot (17)

Data Mining: Data Preprocessing
Data Mining: Data PreprocessingData Mining: Data Preprocessing
Data Mining: Data Preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
03 preprocessing
03 preprocessing03 preprocessing
03 preprocessing
 
Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data Mining
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data pre processing
Data pre processingData pre processing
Data pre processing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Data Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessingData Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessing
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
1.6.data preprocessing
1.6.data preprocessing1.6.data preprocessing
1.6.data preprocessing
 

Viewers also liked

Theory & Practice of Data Cleaning: Introduction to OpenRefine
Theory & Practice of Data Cleaning: Introduction to OpenRefineTheory & Practice of Data Cleaning: Introduction to OpenRefine
Theory & Practice of Data Cleaning: Introduction to OpenRefineBertram Ludäscher
 
Comprehensive Validation with Laravel 4
Comprehensive Validation with Laravel 4Comprehensive Validation with Laravel 4
Comprehensive Validation with Laravel 4Kirk Bushell
 
HOW TO PROCESS DATA IN VARIOUS GEO'S A COMPARATIVE ANALYSIS BY SANJEEV SINGH...
HOW TO PROCESS DATA IN VARIOUS GEO'S A  COMPARATIVE ANALYSIS BY SANJEEV SINGH...HOW TO PROCESS DATA IN VARIOUS GEO'S A  COMPARATIVE ANALYSIS BY SANJEEV SINGH...
HOW TO PROCESS DATA IN VARIOUS GEO'S A COMPARATIVE ANALYSIS BY SANJEEV SINGH...Sanjeev Bharwan
 
HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)
HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)
HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)Sanjeev Bharwan
 
Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 abhagathk
 
Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning Matteo Manca
 
DataMeet 4: Data cleaning & census data
DataMeet 4: Data cleaning & census dataDataMeet 4: Data cleaning & census data
DataMeet 4: Data cleaning & census dataRitvvij Parrikh
 
Fatimmatuz zahro eng3a_03_presentperfectcontinous
Fatimmatuz zahro eng3a_03_presentperfectcontinousFatimmatuz zahro eng3a_03_presentperfectcontinous
Fatimmatuz zahro eng3a_03_presentperfectcontinousfatimmatuzzahro
 
Bba203 unit 2data processing concepts
Bba203   unit 2data processing conceptsBba203   unit 2data processing concepts
Bba203 unit 2data processing conceptskinjal patel
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingvenkadesh236
 
Overview of DATA PREPROCESS..
Overview of DATA PREPROCESS..Overview of DATA PREPROCESS..
Overview of DATA PREPROCESS..killerkarthic
 
Spss basic1
Spss basic1Spss basic1
Spss basic1UPM
 
Introduction to spss – part 1
Introduction to spss – part 1Introduction to spss – part 1
Introduction to spss – part 1Dr. Vignes Gopal
 
Descriptives & Graphing
Descriptives & GraphingDescriptives & Graphing
Descriptives & GraphingJames Neill
 
BID CE workshop 1 session 08 - Biodiversity Data Cleaning
BID CE workshop 1   session 08 - Biodiversity Data CleaningBID CE workshop 1   session 08 - Biodiversity Data Cleaning
BID CE workshop 1 session 08 - Biodiversity Data CleaningAlberto González-Talaván
 

Viewers also liked (20)

Theory & Practice of Data Cleaning: Introduction to OpenRefine
Theory & Practice of Data Cleaning: Introduction to OpenRefineTheory & Practice of Data Cleaning: Introduction to OpenRefine
Theory & Practice of Data Cleaning: Introduction to OpenRefine
 
Comprehensive Validation with Laravel 4
Comprehensive Validation with Laravel 4Comprehensive Validation with Laravel 4
Comprehensive Validation with Laravel 4
 
HOW TO PROCESS DATA IN VARIOUS GEO'S A COMPARATIVE ANALYSIS BY SANJEEV SINGH...
HOW TO PROCESS DATA IN VARIOUS GEO'S A  COMPARATIVE ANALYSIS BY SANJEEV SINGH...HOW TO PROCESS DATA IN VARIOUS GEO'S A  COMPARATIVE ANALYSIS BY SANJEEV SINGH...
HOW TO PROCESS DATA IN VARIOUS GEO'S A COMPARATIVE ANALYSIS BY SANJEEV SINGH...
 
HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)
HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)
HIPPA COMPLIANCE (SANJEEV.S.BHARWAN)
 
Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
 
Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning Introduction to data pre-processing and cleaning
Introduction to data pre-processing and cleaning
 
Data Cleaning
Data CleaningData Cleaning
Data Cleaning
 
DataMeet 4: Data cleaning & census data
DataMeet 4: Data cleaning & census dataDataMeet 4: Data cleaning & census data
DataMeet 4: Data cleaning & census data
 
Fatimmatuz zahro eng3a_03_presentperfectcontinous
Fatimmatuz zahro eng3a_03_presentperfectcontinousFatimmatuz zahro eng3a_03_presentperfectcontinous
Fatimmatuz zahro eng3a_03_presentperfectcontinous
 
Bba203 unit 2data processing concepts
Bba203   unit 2data processing conceptsBba203   unit 2data processing concepts
Bba203 unit 2data processing concepts
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Overview of DATA PREPROCESS..
Overview of DATA PREPROCESS..Overview of DATA PREPROCESS..
Overview of DATA PREPROCESS..
 
Spss basic1
Spss basic1Spss basic1
Spss basic1
 
Preprocess
PreprocessPreprocess
Preprocess
 
Introduction to spss – part 1
Introduction to spss – part 1Introduction to spss – part 1
Introduction to spss – part 1
 
4 preprocess
4 preprocess4 preprocess
4 preprocess
 
Descriptive statistics ii
Descriptive statistics iiDescriptive statistics ii
Descriptive statistics ii
 
Descriptives & Graphing
Descriptives & GraphingDescriptives & Graphing
Descriptives & Graphing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
BID CE workshop 1 session 08 - Biodiversity Data Cleaning
BID CE workshop 1   session 08 - Biodiversity Data CleaningBID CE workshop 1   session 08 - Biodiversity Data Cleaning
BID CE workshop 1 session 08 - Biodiversity Data Cleaning
 

Similar to data warehousing & minining 1st unit

Similar to data warehousing & minining 1st unit (20)

Data preperation
Data preperationData preperation
Data preperation
 
Data preperation
Data preperationData preperation
Data preperation
 
Data preperation
Data preperationData preperation
Data preperation
 
Data preparation
Data preparationData preparation
Data preparation
 
Data preparation
Data preparationData preparation
Data preparation
 
prvg4sczsginx3ynyqlc-signature-b84f0cf1da1e7d0fde4ecfab2a28f243cfa561f9aa2c9b...
prvg4sczsginx3ynyqlc-signature-b84f0cf1da1e7d0fde4ecfab2a28f243cfa561f9aa2c9b...prvg4sczsginx3ynyqlc-signature-b84f0cf1da1e7d0fde4ecfab2a28f243cfa561f9aa2c9b...
prvg4sczsginx3ynyqlc-signature-b84f0cf1da1e7d0fde4ecfab2a28f243cfa561f9aa2c9b...
 
Data preparation
Data preparationData preparation
Data preparation
 
Data preparation
Data preparationData preparation
Data preparation
 
Data Mining
Data MiningData Mining
Data Mining
 
Datapreprocessingppt
DatapreprocessingpptDatapreprocessingppt
Datapreprocessingppt
 
Data processing
Data processingData processing
Data processing
 
Data preprocessing ng
Data preprocessing   ngData preprocessing   ng
Data preprocessing ng
 
Datapreprocessing
DatapreprocessingDatapreprocessing
Datapreprocessing
 
Datapreprocess
DatapreprocessDatapreprocess
Datapreprocess
 
Datapreprocessing
DatapreprocessingDatapreprocessing
Datapreprocessing
 
3prep
3prep3prep
3prep
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data mining
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 

Recently uploaded

Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 

Recently uploaded (20)

Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 

data warehousing & minining 1st unit

  • 1.
  • 3.
  • 4.
  • 5.
  • 6. Forms of data preprocessing Lecture-13 Why Data Preprocessing?
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16. Cluster Analysis Lecture-14 - Data cleaning
  • 17. Regression x y y = x + 1 X1 Y1 Y1’ Lecture-14 - Data cleaning
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32. X1 X2 Y1 Y2 Principal Component Analysis Lecture-16 - Data reduction
  • 33.
  • 34.
  • 35. Example of Decision Tree Induction Initial attribute set: {A1, A2, A3, A4, A5, A6} A4 ? A1? A6? Class 1 Class 2 Class 1 Class 2 Reduced attribute set: {A1, A4, A6} Lecture-16 - Data reduction >
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42. Sampling SRSWOR (simple random sample without replacement) SRSWR Lecture-16 - Data reduction Raw Data
  • 43. Sampling Raw Data Cluster/Stratified Sample Lecture-16 - Data reduction
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50. Example of 3-4-5 rule (-$4000 -$5,000) Step 4: Lecture-17 - Discretization and concept hierarchy generation (-$400 - 0) (-$400 - -$300) (-$300 - -$200) (-$200 - -$100) (-$100 - 0) (0 - $1,000) (0 - $200) ($200 - $400) ($400 - $600) ($600 - $800) ($800 - $1,000) ($2,000 - $5, 000) ($2,000 - $3,000) ($3,000 - $4,000) ($4,000 - $5,000) ($1,000 - $2, 000) ($1,000 - $1,200) ($1,200 - $1,400) ($1,400 - $1,600) ($1,600 - $1,800) ($1,800 - $2,000) msd=1,000 Low=-$1,000 High=$2,000 Step 2: Step 1: -$351 -$159 profit $1,838 $4,700 Min Low (i.e, 5%-tile) High(i.e, 95%-0 tile) Max count (-$1,000 - $2,000) (-$1,000 - 0) (0 -$ 1,000) Step 3: ($1,000 - $2,000)
  • 51.
  • 52.