SlideShare ist ein Scribd-Unternehmen logo
1 von 22
Introduction to Data Mining
Mahmoud Rafeek Alfarra
http://mfarra.cst.ps
University College of Science & Technology- Khan yonis
Development of computer systems
2016
Chapter 1 – Lecture 3
Outline
 Definition of Data Mining
 Data Mining as an Interdisciplinary field
 Process of Data Mining
 Data Mining Tasks
 Challenges of Data Mining
 Data mining application examples
 Introduction to RapidMiner
Data Mining Tasks
 Data mining tasks are the kind of data
patterns that can be mined.
 Data Mining functionalities are used to
specify the kind of patterns to be found in the
data mining tasks.
 In general data mining tasks can be classified into
two categories:
Descriptive mining tasks characterize the general
properties of the data.
Predictive mining tasks perform inferences on the current
data in order to make predictions.
Data Mining Tasks
 Most famous data mining tasks:
 Classification [Predictive]
Prediction [Predictive]
Association Rules [Descriptive]
Clustering [Descriptive]
Outlier Analysis [Descriptive]
Data Mining Tasks
Classification
 Classification is used for predictive mining tasks.
 The input data for predictive modeling consists of
two types of variables:
Explanatory variables, which define the essential properties of
the data.
 Target variables , whose values are to be predicted.
 Classification is used to predicate the value of
discrete target variable.
Classification
Prediction
 Similar to classification, except we are trying to predict
the value of a variable (e.g. amount of purchase),
rather than a class (e.g. purchaser or non-purchaser).
Association
 Association Rules aims to find out the relationship
among valuables in database, resulting in deferent types
of rules.
 Seek to produce a set of rules describing the set of
features that are strongly related to each others.
Association
Gender Age Smoker LAD% RCA%
F 52 Y 85 100
M 62 N 80 0
M 75 Y 70 80
M 73 Y 40 99
M 66 N 50 45
… … … … …
 LAD%- The percentage of heat disease caused by left anterior descending coronary artery.
 RCA%- The percentage of heat disease caused by right coronary artery.
Original data from a research on heart disease
Association
Medical Association Rules
NO. Rule
1 Gender=M∩Age≥70∩Smoker=YRCA%≥50(40%,100%)
2 Gender=F∩Age<70∩Smoker=YLAD%≥70(20%,100%)
 Rule 1 indicates:40% of the cases are male, over 70 years old and have the habit of
smoking, the possibility of RCA%≥50% is 100%
 Rule 2 indicates:20% of the cases are female, under 70 years old and have the habit
of smoking, the possibility of LAD%≥70% is 100%
Clustering
 Finds groups of data pointes (clusters) so that data
points that belong to one cluster are more similar to
each other than to data points belonging to different
cluster.
Clustering
Document Clustering:
 Goal: To find groups of documents that are similar to each
other based on the important terms appearing in them.
 Approach: To identify frequently occurring terms in each
document. Form a similarity measure based on the frequencies
of different terms. Use it to cluster.
 Gain: Information Retrieval can utilize the clusters to relate a
new document or search term to clustered documents.
Outlier Analysis
 Discovers data points that are significantly different
than the rest of the data. Such points are known as
anomalies or outliers.
Outline
Definition of Data Mining
Data Mining as an Interdisciplinary field
Process of Data Mining
Data Mining Tasks
Challenges of Data Mining
Data mining application examples
Introduction to RapidMiner
Challenges of Data Mining
Scalability: Scalable techniques are needed
to handle the massive scale of data.
Dimensionality: Many applications may
involves a large number of dimensions (e.g.
features or attributes of data)
Challenges of Data Mining
Heterogeneous and Complex Data: In recent years
complicated data types such as graph-based, text-free
and structured data types are introduced. Techniques
developed for data mining must be able to handle the
heterogeneity of the data.
Challenges of Data Mining
Data Quality: Many data sets are imperfect due to
present of missing values and noise un the data. To
handle the imperfection, robust data mining algorithms
must be developed.
Challenges of Data Mining
Data Distribution: As the volume of data increases , it
is no longer possible or safe to keep all the data in the
same place. As a result, the need for distributed data
mining techniques has increased over the years.
Challenges of Data Mining
Privacy Preservation: While privacy intends to prevent
the disclosure of information, data mining attempts to
revel interesting knowledge about data. As a result,
there is growing interest in developing privacy-
preserving data mining algorithms.
Outline
Definition of Data Mining
Data Mining as an Interdisciplinary field
Process of Data Mining
Data Mining Tasks
Challenges of Data Mining
Data mining application examples
Introduction to RapidMine
Data mining application
Science
astronomy, bioinformatics, drug discovery, …
Business
advertising, CRM (Customer Relationship management),
investments, manufacturing, sports/entertainment, telecom, e-
Commerce, targeted marketing, health care, …
Web
search engines, web mining,…
Government
law enforcement, profiling tax cheaters,

Weitere ähnliche Inhalte

Was ist angesagt?

Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining Sulman Ahmed
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning Gopal Sakarkar
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDataminingTools Inc
 
3.7 outlier analysis
3.7 outlier analysis3.7 outlier analysis
3.7 outlier analysisKrish_ver2
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Mustafa Sherazi
 
Security auditing architecture
Security auditing architectureSecurity auditing architecture
Security auditing architectureVishnupriya T H
 
Dm from databases perspective u 1
Dm from databases perspective u 1Dm from databases perspective u 1
Dm from databases perspective u 1sakthyvel3
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data MiningValerii Klymchuk
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataSalah Amean
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methodsKrish_ver2
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining Phi Jack
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretizationKrish_ver2
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysisDataminingTools Inc
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesKrish_ver2
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olapSalah Amean
 
similarity measure
similarity measure similarity measure
similarity measure ZHAO Sam
 

Was ist angesagt? (20)

Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
3.7 outlier analysis
3.7 outlier analysis3.7 outlier analysis
3.7 outlier analysis
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)
 
Security auditing architecture
Security auditing architectureSecurity auditing architecture
Security auditing architecture
 
Dm from databases perspective u 1
Dm from databases perspective u 1Dm from databases perspective u 1
Dm from databases perspective u 1
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, data
 
Data Mining: Outlier analysis
Data Mining: Outlier analysisData Mining: Outlier analysis
Data Mining: Outlier analysis
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methods
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretization
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
 
Fp growth
Fp growthFp growth
Fp growth
 
similarity measure
similarity measure similarity measure
similarity measure
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Analytics Life Cycle
Data Analytics Life CycleData Analytics Life Cycle
Data Analytics Life Cycle
 

Ähnlich wie 3 Data Mining Tasks

Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptPadmajaLaksh
 
01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.ppt01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.pptadmsoyadm4
 
Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1DanWooster1
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introductionbutest
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1Mahmoud Alfarra
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysisPoonam Kshirsagar
 
Introduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .pptIntroduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .pptSangrangBargayary3
 
Introduction to dm and dw
Introduction to dm and dwIntroduction to dm and dw
Introduction to dm and dwANUSUYA T K
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...Editor IJCATR
 

Ähnlich wie 3 Data Mining Tasks (20)

Chapter 1. Introduction.ppt
Chapter 1. Introduction.pptChapter 1. Introduction.ppt
Chapter 1. Introduction.ppt
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.ppt
 
Data Mining Intro
Data Mining IntroData Mining Intro
Data Mining Intro
 
data mining
data miningdata mining
data mining
 
01Intro.ppt
01Intro.ppt01Intro.ppt
01Intro.ppt
 
01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.ppt01Introduction to data mining chapter 1.ppt
01Introduction to data mining chapter 1.ppt
 
01Intro.ppt
01Intro.ppt01Intro.ppt
01Intro.ppt
 
Introduction to data warehouse
Introduction to data warehouseIntroduction to data warehouse
Introduction to data warehouse
 
Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1Upstate CSCI 525 Data Mining Chapter 1
Upstate CSCI 525 Data Mining Chapter 1
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introduction
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
 
data mining
data miningdata mining
data mining
 
G045033841
G045033841G045033841
G045033841
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysis
 
Introduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .pptIntroduction to Data Mining and technologies .ppt
Introduction to Data Mining and technologies .ppt
 
NCCT.pptx
NCCT.pptxNCCT.pptx
NCCT.pptx
 
Introduction to dm and dw
Introduction to dm and dwIntroduction to dm and dw
Introduction to dm and dw
 
DOWLD SLIDES.pptx
DOWLD SLIDES.pptxDOWLD SLIDES.pptx
DOWLD SLIDES.pptx
 
data.2.pptx
data.2.pptxdata.2.pptx
data.2.pptx
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 

Mehr von Mahmoud Alfarra

Computer Programming, Loops using Java - part 2
Computer Programming, Loops using Java - part 2Computer Programming, Loops using Java - part 2
Computer Programming, Loops using Java - part 2Mahmoud Alfarra
 
Computer Programming, Loops using Java
Computer Programming, Loops using JavaComputer Programming, Loops using Java
Computer Programming, Loops using JavaMahmoud Alfarra
 
Chapter 10: hashing data structure
Chapter 10:  hashing data structureChapter 10:  hashing data structure
Chapter 10: hashing data structureMahmoud Alfarra
 
Chapter9 graph data structure
Chapter9  graph data structureChapter9  graph data structure
Chapter9 graph data structureMahmoud Alfarra
 
Chapter 8: tree data structure
Chapter 8:  tree data structureChapter 8:  tree data structure
Chapter 8: tree data structureMahmoud Alfarra
 
Chapter 7: Queue data structure
Chapter 7:  Queue data structureChapter 7:  Queue data structure
Chapter 7: Queue data structureMahmoud Alfarra
 
Chapter 6: stack data structure
Chapter 6:  stack data structureChapter 6:  stack data structure
Chapter 6: stack data structureMahmoud Alfarra
 
Chapter 5: linked list data structure
Chapter 5: linked list data structureChapter 5: linked list data structure
Chapter 5: linked list data structureMahmoud Alfarra
 
Chapter 4: basic search algorithms data structure
Chapter 4: basic search algorithms data structureChapter 4: basic search algorithms data structure
Chapter 4: basic search algorithms data structureMahmoud Alfarra
 
Chapter 3: basic sorting algorithms data structure
Chapter 3: basic sorting algorithms data structureChapter 3: basic sorting algorithms data structure
Chapter 3: basic sorting algorithms data structureMahmoud Alfarra
 
Chapter 2: array and array list data structure
Chapter 2: array and array list  data structureChapter 2: array and array list  data structure
Chapter 2: array and array list data structureMahmoud Alfarra
 
Chapter1 intro toprincipleofc#_datastructure_b_cs
Chapter1  intro toprincipleofc#_datastructure_b_csChapter1  intro toprincipleofc#_datastructure_b_cs
Chapter1 intro toprincipleofc#_datastructure_b_csMahmoud Alfarra
 
Chapter 0: introduction to data structure
Chapter 0: introduction to data structureChapter 0: introduction to data structure
Chapter 0: introduction to data structureMahmoud Alfarra
 
8 programming-using-java decision-making practices 20102011
8 programming-using-java decision-making practices 201020118 programming-using-java decision-making practices 20102011
8 programming-using-java decision-making practices 20102011Mahmoud Alfarra
 
7 programming-using-java decision-making220102011
7 programming-using-java decision-making2201020117 programming-using-java decision-making220102011
7 programming-using-java decision-making220102011Mahmoud Alfarra
 
6 programming-using-java decision-making20102011-
6 programming-using-java decision-making20102011-6 programming-using-java decision-making20102011-
6 programming-using-java decision-making20102011-Mahmoud Alfarra
 
5 programming-using-java intro-tooop20102011
5 programming-using-java intro-tooop201020115 programming-using-java intro-tooop20102011
5 programming-using-java intro-tooop20102011Mahmoud Alfarra
 
4 programming-using-java intro-tojava20102011
4 programming-using-java intro-tojava201020114 programming-using-java intro-tojava20102011
4 programming-using-java intro-tojava20102011Mahmoud Alfarra
 
3 programming-using-java introduction-to computer
3 programming-using-java introduction-to computer3 programming-using-java introduction-to computer
3 programming-using-java introduction-to computerMahmoud Alfarra
 

Mehr von Mahmoud Alfarra (20)

Computer Programming, Loops using Java - part 2
Computer Programming, Loops using Java - part 2Computer Programming, Loops using Java - part 2
Computer Programming, Loops using Java - part 2
 
Computer Programming, Loops using Java
Computer Programming, Loops using JavaComputer Programming, Loops using Java
Computer Programming, Loops using Java
 
Chapter 10: hashing data structure
Chapter 10:  hashing data structureChapter 10:  hashing data structure
Chapter 10: hashing data structure
 
Chapter9 graph data structure
Chapter9  graph data structureChapter9  graph data structure
Chapter9 graph data structure
 
Chapter 8: tree data structure
Chapter 8:  tree data structureChapter 8:  tree data structure
Chapter 8: tree data structure
 
Chapter 7: Queue data structure
Chapter 7:  Queue data structureChapter 7:  Queue data structure
Chapter 7: Queue data structure
 
Chapter 6: stack data structure
Chapter 6:  stack data structureChapter 6:  stack data structure
Chapter 6: stack data structure
 
Chapter 5: linked list data structure
Chapter 5: linked list data structureChapter 5: linked list data structure
Chapter 5: linked list data structure
 
Chapter 4: basic search algorithms data structure
Chapter 4: basic search algorithms data structureChapter 4: basic search algorithms data structure
Chapter 4: basic search algorithms data structure
 
Chapter 3: basic sorting algorithms data structure
Chapter 3: basic sorting algorithms data structureChapter 3: basic sorting algorithms data structure
Chapter 3: basic sorting algorithms data structure
 
Chapter 2: array and array list data structure
Chapter 2: array and array list  data structureChapter 2: array and array list  data structure
Chapter 2: array and array list data structure
 
Chapter1 intro toprincipleofc#_datastructure_b_cs
Chapter1  intro toprincipleofc#_datastructure_b_csChapter1  intro toprincipleofc#_datastructure_b_cs
Chapter1 intro toprincipleofc#_datastructure_b_cs
 
Chapter 0: introduction to data structure
Chapter 0: introduction to data structureChapter 0: introduction to data structure
Chapter 0: introduction to data structure
 
3 classification
3  classification3  classification
3 classification
 
8 programming-using-java decision-making practices 20102011
8 programming-using-java decision-making practices 201020118 programming-using-java decision-making practices 20102011
8 programming-using-java decision-making practices 20102011
 
7 programming-using-java decision-making220102011
7 programming-using-java decision-making2201020117 programming-using-java decision-making220102011
7 programming-using-java decision-making220102011
 
6 programming-using-java decision-making20102011-
6 programming-using-java decision-making20102011-6 programming-using-java decision-making20102011-
6 programming-using-java decision-making20102011-
 
5 programming-using-java intro-tooop20102011
5 programming-using-java intro-tooop201020115 programming-using-java intro-tooop20102011
5 programming-using-java intro-tooop20102011
 
4 programming-using-java intro-tojava20102011
4 programming-using-java intro-tojava201020114 programming-using-java intro-tojava20102011
4 programming-using-java intro-tojava20102011
 
3 programming-using-java introduction-to computer
3 programming-using-java introduction-to computer3 programming-using-java introduction-to computer
3 programming-using-java introduction-to computer
 

Kürzlich hochgeladen

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 

Kürzlich hochgeladen (20)

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 

3 Data Mining Tasks

  • 1. Introduction to Data Mining Mahmoud Rafeek Alfarra http://mfarra.cst.ps University College of Science & Technology- Khan yonis Development of computer systems 2016 Chapter 1 – Lecture 3
  • 2. Outline  Definition of Data Mining  Data Mining as an Interdisciplinary field  Process of Data Mining  Data Mining Tasks  Challenges of Data Mining  Data mining application examples  Introduction to RapidMiner
  • 3. Data Mining Tasks  Data mining tasks are the kind of data patterns that can be mined.  Data Mining functionalities are used to specify the kind of patterns to be found in the data mining tasks.
  • 4.  In general data mining tasks can be classified into two categories: Descriptive mining tasks characterize the general properties of the data. Predictive mining tasks perform inferences on the current data in order to make predictions. Data Mining Tasks
  • 5.  Most famous data mining tasks:  Classification [Predictive] Prediction [Predictive] Association Rules [Descriptive] Clustering [Descriptive] Outlier Analysis [Descriptive] Data Mining Tasks
  • 6. Classification  Classification is used for predictive mining tasks.  The input data for predictive modeling consists of two types of variables: Explanatory variables, which define the essential properties of the data.  Target variables , whose values are to be predicted.  Classification is used to predicate the value of discrete target variable.
  • 8. Prediction  Similar to classification, except we are trying to predict the value of a variable (e.g. amount of purchase), rather than a class (e.g. purchaser or non-purchaser).
  • 9. Association  Association Rules aims to find out the relationship among valuables in database, resulting in deferent types of rules.  Seek to produce a set of rules describing the set of features that are strongly related to each others.
  • 10. Association Gender Age Smoker LAD% RCA% F 52 Y 85 100 M 62 N 80 0 M 75 Y 70 80 M 73 Y 40 99 M 66 N 50 45 … … … … …  LAD%- The percentage of heat disease caused by left anterior descending coronary artery.  RCA%- The percentage of heat disease caused by right coronary artery. Original data from a research on heart disease
  • 11. Association Medical Association Rules NO. Rule 1 Gender=M∩Age≥70∩Smoker=YRCA%≥50(40%,100%) 2 Gender=F∩Age<70∩Smoker=YLAD%≥70(20%,100%)  Rule 1 indicates:40% of the cases are male, over 70 years old and have the habit of smoking, the possibility of RCA%≥50% is 100%  Rule 2 indicates:20% of the cases are female, under 70 years old and have the habit of smoking, the possibility of LAD%≥70% is 100%
  • 12. Clustering  Finds groups of data pointes (clusters) so that data points that belong to one cluster are more similar to each other than to data points belonging to different cluster.
  • 13. Clustering Document Clustering:  Goal: To find groups of documents that are similar to each other based on the important terms appearing in them.  Approach: To identify frequently occurring terms in each document. Form a similarity measure based on the frequencies of different terms. Use it to cluster.  Gain: Information Retrieval can utilize the clusters to relate a new document or search term to clustered documents.
  • 14. Outlier Analysis  Discovers data points that are significantly different than the rest of the data. Such points are known as anomalies or outliers.
  • 15. Outline Definition of Data Mining Data Mining as an Interdisciplinary field Process of Data Mining Data Mining Tasks Challenges of Data Mining Data mining application examples Introduction to RapidMiner
  • 16. Challenges of Data Mining Scalability: Scalable techniques are needed to handle the massive scale of data. Dimensionality: Many applications may involves a large number of dimensions (e.g. features or attributes of data)
  • 17. Challenges of Data Mining Heterogeneous and Complex Data: In recent years complicated data types such as graph-based, text-free and structured data types are introduced. Techniques developed for data mining must be able to handle the heterogeneity of the data.
  • 18. Challenges of Data Mining Data Quality: Many data sets are imperfect due to present of missing values and noise un the data. To handle the imperfection, robust data mining algorithms must be developed.
  • 19. Challenges of Data Mining Data Distribution: As the volume of data increases , it is no longer possible or safe to keep all the data in the same place. As a result, the need for distributed data mining techniques has increased over the years.
  • 20. Challenges of Data Mining Privacy Preservation: While privacy intends to prevent the disclosure of information, data mining attempts to revel interesting knowledge about data. As a result, there is growing interest in developing privacy- preserving data mining algorithms.
  • 21. Outline Definition of Data Mining Data Mining as an Interdisciplinary field Process of Data Mining Data Mining Tasks Challenges of Data Mining Data mining application examples Introduction to RapidMine
  • 22. Data mining application Science astronomy, bioinformatics, drug discovery, … Business advertising, CRM (Customer Relationship management), investments, manufacturing, sports/entertainment, telecom, e- Commerce, targeted marketing, health care, … Web search engines, web mining,… Government law enforcement, profiling tax cheaters,