SlideShare ist ein Scribd-Unternehmen logo
1 von 14
Data Processing,[object Object]
What is the need for Data Processing?,[object Object],To get the required information from huge, incomplete, noisy and inconsistent set of data it is necessary to use data processing.,[object Object]
Steps in Data Processing,[object Object],Data Cleaning,[object Object],Data Integration,[object Object],Data Transformation,[object Object],Data reduction,[object Object],Data Summarization,[object Object]
What is Data Cleaning?,[object Object],Data cleaning is a procedure to “clean” the data by filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies,[object Object]
What is Data Integration?,[object Object],Integrating multiple databases, data cubes, or files, this is called data integration.,[object Object]
What is Data Transformation?,[object Object],Data transformation operations, such as normalization and aggregation, are additional data preprocessing procedures that would contribute toward the success of the mining process.,[object Object]
What is Data Reduction?,[object Object],Data reduction obtains a reduced representation of the data set that is much smaller in volume, yet produces the same (or almost the same) analytical results.,[object Object]
What is Data Summarization?,[object Object],It is the processes of representing the collected data in an accurate and compact way without losing any information, it also involves getting a information from collected data.,[object Object],Ex: Display the data as a graph and get the mean, median, mode etc.,[object Object]
How to Clean Data?,[object Object],Handling Missing values,[object Object],Ignore the tuple,[object Object],Fill in the missing value manually,[object Object],Use a global constant to fill in the missing value,[object Object],Use the attribute mean to fill in the missing value,[object Object],Use the attribute mean for all samples belonging to the same class as the given tuple,[object Object],Use the most probable value to fill in the missing value,[object Object]
How to Clean Data?,[object Object],Handle Noisy Data,[object Object],Binning: Binning methods smooth a sorted data value by consulting its “neighborhood”.,[object Object],Regression: Data can be smoothed by fitting the data to a function, such as with regression. ,[object Object],Clustering: Outliers may be detected by clustering, where similar values are organized into groups, or “clusters.”,[object Object]
Data Integration,[object Object],Data Integration combines data from multiple sources into a coherent data store, as in data warehousing. These sources may include multiple databases, data cubes, or flat files. Issues that arises during data integration like Schema integration and object matching Redundancy is another important issue.,[object Object]
Data Transformation,[object Object],Data transformation can be achieved in following ways,[object Object],Smoothing: which works to remove noise from the data,[object Object],Aggregation: where summary or aggregation operations are applied to the data. For example, the daily sales data may be aggregated so as to compute weekly and annuual total scores.,[object Object],Generalization of the data: where low-level or “primitive” (raw) data are replaced by higher-level concepts through the use of concept hierarchies. For example, categorical attributes, like street, can be generalized to higher-level concepts, like city or country.,[object Object],Normalization: where the attribute data are scaled so as to fall within a small specified range, such as −1.0 to 1.0, or 0.0 to 1.0.,[object Object],Attribute construction : this is where new attributes are constructed and added from the given set of attributes to help the mining process.,[object Object]
Data Reduction techniques,[object Object],These are the techniques that can be applied to obtain a reduced representation of the data set that is much smaller in volume, yet closely maintains the integrity of the original data.,[object Object],Data cube aggregation,[object Object],Attribute subset selection,[object Object],Dimensionality reduction,[object Object],Numerosity reduction,[object Object],Discretization and concept hierarchy generation,[object Object]
Visit more self help tutorials,[object Object],Pick a tutorial of your choice and browse through it at your own pace.,[object Object],The tutorials section is free, self-guiding and will not involve any additional support.,[object Object],Visit us at www.dataminingtools.net,[object Object]

Weitere ähnliche Inhalte

Was ist angesagt?

03. Data Exploration.pptx
03. Data Exploration.pptx03. Data Exploration.pptx
03. Data Exploration.pptxSarojkumari55
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data miningSlideshare
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining Sushil Kulkarni
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
data generalization and summarization
data generalization and summarization data generalization and summarization
data generalization and summarization janani thirupathi
 
Data Integration and Transformation in Data mining
Data Integration and Transformation in Data miningData Integration and Transformation in Data mining
Data Integration and Transformation in Data miningkavitha muneeshwaran
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEZalpa Rathod
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data MiningIffat Firozy
 

Was ist angesagt? (20)

Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
03. Data Exploration.pptx
03. Data Exploration.pptx03. Data Exploration.pptx
03. Data Exploration.pptx
 
data mining
data miningdata mining
data mining
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
 
Data Mining
Data MiningData Mining
Data Mining
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data mining
Data miningData mining
Data mining
 
Data Mining
Data MiningData Mining
Data Mining
 
Data mining primitives
Data mining primitivesData mining primitives
Data mining primitives
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Preparation.pptx
Data Preparation.pptxData Preparation.pptx
Data Preparation.pptx
 
data generalization and summarization
data generalization and summarization data generalization and summarization
data generalization and summarization
 
Kdd process
Kdd processKdd process
Kdd process
 
Data Integration and Transformation in Data mining
Data Integration and Transformation in Data miningData Integration and Transformation in Data mining
Data Integration and Transformation in Data mining
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data Mining
 

Ähnlich wie Data Mining: Data processing

Ähnlich wie Data Mining: Data processing (20)

Data preprocessing ng
Data preprocessing   ngData preprocessing   ng
Data preprocessing ng
 
Data preprocessing ng
Data preprocessing   ngData preprocessing   ng
Data preprocessing ng
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
1234
12341234
1234
 
Preprocessing.ppt
Preprocessing.pptPreprocessing.ppt
Preprocessing.ppt
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data1
Data1Data1
Data1
 
Data1
Data1Data1
Data1
 
Preprocess
PreprocessPreprocess
Preprocess
 
Intro to Data warehousing lecture 17
Intro to Data warehousing   lecture 17Intro to Data warehousing   lecture 17
Intro to Data warehousing lecture 17
 
Datapreprocessingppt
DatapreprocessingpptDatapreprocessingppt
Datapreprocessingppt
 

Mehr von DataminingTools Inc

AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceDataminingTools Inc
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web miningDataminingTools Inc
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataDataminingTools Inc
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDataminingTools Inc
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDataminingTools Inc
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technologyDataminingTools Inc
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysisDataminingTools Inc
 

Mehr von DataminingTools Inc (20)

Terminology Machine Learning
Terminology Machine LearningTerminology Machine Learning
Terminology Machine Learning
 
Techniques Machine Learning
Techniques Machine LearningTechniques Machine Learning
Techniques Machine Learning
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
 
Areas of machine leanring
Areas of machine leanringAreas of machine leanring
Areas of machine leanring
 
AI: Planning and AI
AI: Planning and AIAI: Planning and AI
AI: Planning and AI
 
AI: Logic in AI 2
AI: Logic in AI 2AI: Logic in AI 2
AI: Logic in AI 2
 
AI: Logic in AI
AI: Logic in AIAI: Logic in AI
AI: Logic in AI
 
AI: Learning in AI 2
AI: Learning in AI 2AI: Learning in AI 2
AI: Learning in AI 2
 
AI: Learning in AI
AI: Learning in AI AI: Learning in AI
AI: Learning in AI
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
 
AI: Belief Networks
AI: Belief NetworksAI: Belief Networks
AI: Belief Networks
 
AI: AI & Searching
AI: AI & SearchingAI: AI & Searching
AI: AI & Searching
 
AI: AI & Problem Solving
AI: AI & Problem SolvingAI: AI & Problem Solving
AI: AI & Problem Solving
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web mining
 
Data Mining: Outlier analysis
Data Mining: Outlier analysisData Mining: Outlier analysis
Data Mining: Outlier analysis
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technology
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysis
 

Kürzlich hochgeladen

Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
Introducing Milvus and new features in 2.4 release
Introducing Milvus and new features in 2.4 releaseIntroducing Milvus and new features in 2.4 release
Introducing Milvus and new features in 2.4 releaseZilliz
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
Elevate Your Business with TECUNIQUE's Tailored Solutions
Elevate Your Business with TECUNIQUE's Tailored SolutionsElevate Your Business with TECUNIQUE's Tailored Solutions
Elevate Your Business with TECUNIQUE's Tailored SolutionsJaydeep Chhasatia
 
Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)Anchore
 
GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...
GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...
GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...James Anderson
 
AI-based audio transcription solutions (IDP)
AI-based audio transcription solutions (IDP)AI-based audio transcription solutions (IDP)
AI-based audio transcription solutions (IDP)KapilVaidya4
 
Dev Dives: Master advanced authentication and performance in Productivity Act...
Dev Dives: Master advanced authentication and performance in Productivity Act...Dev Dives: Master advanced authentication and performance in Productivity Act...
Dev Dives: Master advanced authentication and performance in Productivity Act...UiPathCommunity
 
Checklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxChecklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxNoman khan
 
5 Considerations For Choosing The Best Gutter Guards
5 Considerations For Choosing The Best Gutter Guards5 Considerations For Choosing The Best Gutter Guards
5 Considerations For Choosing The Best Gutter GuardsCPR Gutter Protection
 
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptxHHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptxHampshireHUG
 
LinkedIn optimization Gunjan Dhir .pptx
LinkedIn optimization Gunjan Dhir .pptxLinkedIn optimization Gunjan Dhir .pptx
LinkedIn optimization Gunjan Dhir .pptxGunjan Dhir
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
Retrofitting for the Built Environment - IES
Retrofitting for the Built Environment - IESRetrofitting for the Built Environment - IES
Retrofitting for the Built Environment - IESIES VE
 
Reference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxReference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxChimezie Ogbuji
 
Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??N.K KooZN
 
ServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptxServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptxshyamraj55
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 

Kürzlich hochgeladen (20)

Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
Introducing Milvus and new features in 2.4 release
Introducing Milvus and new features in 2.4 releaseIntroducing Milvus and new features in 2.4 release
Introducing Milvus and new features in 2.4 release
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
Elevate Your Business with TECUNIQUE's Tailored Solutions
Elevate Your Business with TECUNIQUE's Tailored SolutionsElevate Your Business with TECUNIQUE's Tailored Solutions
Elevate Your Business with TECUNIQUE's Tailored Solutions
 
Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)Tracking license compliance made easy - intro to Grant (OSS)
Tracking license compliance made easy - intro to Grant (OSS)
 
GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...
GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...
GDG Cloud Southlake 31: Santosh Chennuri and Festus Yeboah: Empowering Develo...
 
AI-based audio transcription solutions (IDP)
AI-based audio transcription solutions (IDP)AI-based audio transcription solutions (IDP)
AI-based audio transcription solutions (IDP)
 
Dev Dives: Master advanced authentication and performance in Productivity Act...
Dev Dives: Master advanced authentication and performance in Productivity Act...Dev Dives: Master advanced authentication and performance in Productivity Act...
Dev Dives: Master advanced authentication and performance in Productivity Act...
 
Checklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docxChecklist to troubleshoot CD moisture profiles.docx
Checklist to troubleshoot CD moisture profiles.docx
 
5 Considerations For Choosing The Best Gutter Guards
5 Considerations For Choosing The Best Gutter Guards5 Considerations For Choosing The Best Gutter Guards
5 Considerations For Choosing The Best Gutter Guards
 
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptxHHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
HHUG-03-2024-Impactful-Reporting-in-HubSpot.pptx
 
LinkedIn optimization Gunjan Dhir .pptx
LinkedIn optimization Gunjan Dhir .pptxLinkedIn optimization Gunjan Dhir .pptx
LinkedIn optimization Gunjan Dhir .pptx
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
Retrofitting for the Built Environment - IES
Retrofitting for the Built Environment - IESRetrofitting for the Built Environment - IES
Retrofitting for the Built Environment - IES
 
Reference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxReference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptx
 
Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??Does AI(Artificial intelligence) need a Working Memory??
Does AI(Artificial intelligence) need a Working Memory??
 
ServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptxServiceNow Integration with MuleSoft.pptx
ServiceNow Integration with MuleSoft.pptx
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 

Data Mining: Data processing

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.