EVALUATION AND VISUALIZATION OF DIFFERENT DATA MINING TECHNIQUES

INTRODUCTION TO DATAMINING BY SUMAIRA S.
Data Mining Process

The purpose of this project is to gain an understanding of the process of data mining by:
• Implementing one or more data mining algorithms
• Visualizing them
• Comparing their performance on datasets
Another aspect was to provide visual tutorials and detailed help about these algorithms.

WHAT IS DATA MINING?
• Originally developed to act as expert systems to solve problems
• Data mining can be utilized in any organization that needs to find patterns or relationships in its data
• Different types of data mining

BASIC FEATURES OF THE PROJECT
• Handling different types of data
• Preprocessing of data
• Algorithm implementation
• Visualization of the data mining model
• Comparison of different data mining algorithms
• Help and visual tutorials

HANDLING DIFFERENT DATA FORMATS
The system supports the following types of data files:
• Text Data File Handling
   ◦ CSV (Comma-Separated Values) File
   ◦ Any User-Defined Format
• Database Data File Handling
   ◦ MS Access Data File
   ◦ MS SQL Data File
• XML Data File Handling
   ◦ XML Data File
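
As an illustration of the text and XML cases listed above, here is a minimal sketch that reads both into a common list-of-rows form using only the Python standard library. The function names, the `delimiter` parameter standing in for a user-defined format, and the assumed XML layout (one child element per record) are hypothetical and not taken from the project itself; database sources are left out.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_delimited(text, delimiter=","):
    """Parse delimited text (CSV by default) into a list of dict rows.

    Assumption: the first line holds the column names.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def read_xml(text):
    """Parse XML where each child of the root is one record (assumed layout)."""
    root = ET.fromstring(text)
    return [{field.tag: field.text for field in record} for record in root]


csv_rows = read_delimited("x,y\n1,2\n3,4\n")
pipe_rows = read_delimited("x|y\n1|2\n", delimiter="|")   # user-defined format
xml_rows = read_xml("<data><row><x>1</x><y>2</y></row></data>")
```

All three calls return rows keyed by column name, so later steps would not need to know which file type the data came from.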

PREPROCESSING OF DATA
• Preprocessing of data includes
   ◦ Filling of missing values
   ◦ Ignoring the row
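
The two preprocessing options can be sketched as follows. The slide does not say which value is used to fill a gap, so filling with the column mean is an assumption here, as are the use of `None` as the missing-value marker and the function names.

```python
def fill_missing(rows, column):
    """Replace None in `column` with the mean of the observed values.

    Assumption: mean imputation; the slides do not name the fill rule.
    """
    observed = [r[column] for r in rows if r[column] is not None]
    mean = sum(observed) / len(observed)
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


def ignore_rows(rows):
    """Drop every row that has at least one missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]


data = [{"x": 1.0, "y": 2.0}, {"x": None, "y": 4.0}, {"x": 3.0, "y": 6.0}]
filled = fill_missing(data, "x")   # second row gets x = 2.0
kept = ignore_rows(data)           # second row is dropped
```

Both functions return new lists and leave the input rows unchanged, which makes it easy to compare the two strategies on the same dataset.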

ALGORITHM IMPLEMENTATION
• Clustering
   ◦ Partitional Clustering Algorithm
      ▪ K-Means Algorithm
   ◦ Hierarchical Clustering Algorithms
      ▪ Single Linkage Algorithm
      ▪ Weighted Average Algorithm
      ▪ Complete Linkage Algorithm
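
The three hierarchical variants differ only in how the distance from a newly merged cluster to every other cluster is computed. The sketch below uses the standard textbook update rules; it assumes that "Weighted Average" refers to the usual weighted average linkage (WPGMA), which the slides do not confirm, and the function name is hypothetical.

```python
def merged_distance(d_ik, d_jk, method):
    """Distance from the merged cluster (i ∪ j) to another cluster k.

    d_ik and d_jk are the distances from k to the two clusters being merged.
    """
    if method == "single":      # nearest members decide
        return min(d_ik, d_jk)
    if method == "complete":    # farthest members decide
        return max(d_ik, d_jk)
    if method == "weighted":    # WPGMA (assumed): plain average of the two
        return (d_ik + d_jk) / 2
    raise ValueError(f"unknown method: {method}")
```

For example, with `d_ik = 2.0` and `d_jk = 6.0` the three rules give 2.0, 6.0 and 4.0 respectively.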

VISUALIZATION OF DATA MINING MODEL
• XY Scatter Chart
• Dendrogram
• Pie Chart
• Curve Graph

COMPARISON OF DIFFERENT DATA MINING ALGORITHMS
• Data File Comparison
   ◦ Running time
   ◦ Memory Usage
   ◦ CPU Usage
   ◦ Precision/Recall
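
A rough sketch of how some of these measurements could be collected in Python. The helper names are hypothetical, `tracemalloc` only tracks memory allocated by Python itself, CPU usage is not covered, and computing precision/recall over pairs of points is one common convention for clusterings that is assumed here rather than taken from the slides.

```python
import time
import tracemalloc
from itertools import combinations


def measure(func, *args):
    """Run func(*args); return (result, seconds elapsed, peak bytes allocated)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


def pairwise_precision_recall(predicted, actual):
    """Precision and recall over pairs of points placed in the same cluster."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(predicted)), 2):
        same_pred = predicted[i] == predicted[j]
        same_true = actual[i] == actual[j]
        tp += same_pred and same_true
        fp += same_pred and not same_true
        fn += same_true and not same_pred
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


result, seconds, peak_bytes = measure(sorted, [3, 1, 2])
precision, recall = pairwise_precision_recall([0, 0, 1, 1], [0, 0, 0, 1])
```

Here `sorted` is only a stand-in for a clustering routine; any function and its arguments can be passed to `measure` in the same way.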

K-MEANS ALGORITHM
• K-means was introduced by MacQueen in 1967

THE K-MEANS CLUSTERING METHOD
[Figure: K-means example with K=2, scatter-plot panels on axes from 0 to 10]
• Arbitrarily choose K objects as initial cluster centers
• Assign each of the objects to the most similar center
• Update the cluster means
• Reassign
• Update the cluster means

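The K-means loop can be sketched in a few lines of Python. This is a minimal illustration for 2-D points rather than the project's implementation: squared Euclidean distance as the similarity measure, the fixed random seed, the `max_iter` cap, and stopping when no object changes cluster are all assumptions.

```python
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Cluster 2-D points into k groups; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # arbitrarily choose K objects
    labels = None
    for _ in range(max_iter):
        # Assign each object to the most similar (nearest) center.
        new_labels = [
            min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                        + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:             # no reassignment happened: stop
            break
        labels = new_labels
        # Update the cluster means.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                      # keep the old center if empty
                centers[c] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, labels


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, labels = kmeans(points, k=2)
```

On this small example the three points near (1, 1) end up in one cluster and the three near (8, 8) in the other; which cluster gets label 0 depends on the initial choice of centers.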
SINGLE LINKAGE HIERARCHICAL CLUSTERING
1. Say “Every point is its own cluster”
2. Find “most similar” pair of clusters
3. Merge it into a parent cluster
4. Repeat
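
The four steps above can be sketched as follows. This is a simple, unoptimized illustration for 2-D points; Euclidean distance as the similarity measure, the `num_clusters` stopping parameter, and the function name are assumptions, not details from the slides.

```python
def single_linkage(points, num_clusters=1):
    """Agglomerative single-linkage clustering of 2-D points.

    Returns (clusters, merges): the remaining clusters as lists of point
    indices, and the merge history as (cluster_a, cluster_b, distance).
    """
    def dist(a, b):
        return ((points[a][0] - points[b][0]) ** 2
                + (points[a][1] - points[b][1]) ** 2) ** 0.5

    clusters = [[i] for i in range(len(points))]   # 1. every point is a cluster
    merges = []
    while len(clusters) > num_clusters:            # 4. repeat
        # 2. Most similar pair: smallest distance between any two members.
        d, a, b = min(
            (min(dist(p, q) for p in clusters[i] for q in clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[a], clusters[b], d))
        # 3. Merge the pair into a parent cluster.
        parent = clusters[a] + clusters[b]
        clusters = [c for n, c in enumerate(clusters) if n not in (a, b)]
        clusters.append(parent)
    return clusters, merges


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters, merges = single_linkage(points, num_clusters=2)
```

The `merges` list records which clusters were joined and at what distance, which is the information a dendrogram displays.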

THANK YOU

Presentation By:

Sumaira Sohail.
