2. Definitions review
• Cluster: a collection of data objects that are
– similar (or related) to one another within the same group
– dissimilar (or unrelated) to the objects in other groups
• Cluster analysis
– Finding similarities between data according to the characteristics found in the data, and grouping similar data objects into clusters
3. Clustering Methods
• Partitioning:
– Unsupervised learning algorithms; construct various partitions and then evaluate them by some criterion, e.g., minimizing the sum of squared errors
– Typical methods: k-means, k-medoids
• Hierarchical:
– Create a hierarchical decomposition of the set of data (or objects) using some criterion
– Typical methods: DIANA, AGNES, BIRCH, ROCK, CHAMELEON
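The agglomerative (AGNES-style) idea above can be sketched compactly. The sketch below is our own illustration, not from the slides: it uses single-link distance on a tiny hypothetical 1-D data set and repeatedly merges the two closest clusters until k remain.

```java
import java.util.*;

// Minimal AGNES-style (agglomerative, single-link) sketch.
// Data set and class name are hypothetical illustrations.
public class AgnesSketch {

    // Merge the two closest clusters until only k clusters remain.
    public static List<List<Double>> cluster(double[] data, int k) {
        List<List<Double>> clusters = new ArrayList<>();
        for (double d : data) clusters.add(new ArrayList<>(List.of(d)));
        while (clusters.size() > k) {
            int bi = 0, bj = 1;
            double best = Double.MAX_VALUE;
            for (int i = 0; i < clusters.size(); i++)
                for (int j = i + 1; j < clusters.size(); j++) {
                    double d = singleLink(clusters.get(i), clusters.get(j));
                    if (d < best) { best = d; bi = i; bj = j; }
                }
            // bj > bi, so removing bj does not shift index bi
            clusters.get(bi).addAll(clusters.remove(bj));
        }
        return clusters;
    }

    // Single-link: distance between the closest pair of members.
    static double singleLink(List<Double> a, List<Double> b) {
        double best = Double.MAX_VALUE;
        for (double x : a)
            for (double y : b)
                best = Math.min(best, Math.abs(x - y));
        return best;
    }

    public static void main(String[] args) {
        System.out.println(cluster(new double[]{1, 2, 9, 10, 11}, 2));
    }
}
```

Divisive methods such as DIANA work in the opposite direction, starting from one all-inclusive cluster and splitting it.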
5. Illustration of two clustering techniques
using the RapidMiner tool and Java
• K-means algorithm:
We performed two tests
1. Using a Java program: program parameters
– K = 2;
– Data:
22 21
19 20
18 22
1 3
3 2
6. K-means Clustering
• Input: the number of clusters K and a collection of n instances
• Output: a set of K clusters that minimizes the squared-error criterion
• Method:
– Arbitrarily choose K instances as the initial cluster centers
– Repeat
• (Re)assign each instance to the cluster to which the instance is most similar, based on the mean value of the instances in the cluster
• Update the cluster means (compute the mean value of the instances in each cluster)
– Until no change in the assignment
• Squared-Error Criterion
– E = Σ_{i=1}^{K} Σ_{p∈Ci} |p − mi|²
– where mi is the mean of cluster Ci and p ranges over the points in Ci
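The method above can be sketched in Java. The data and K = 2 match the test case from slide 5; the class and method names are our own illustration, and the initial centers are arbitrarily taken as the first K instances, as the method allows.

```java
import java.util.Arrays;

// Minimal k-means sketch on the 2-D test data from the slides (K = 2).
public class KMeansSketch {

    // Returns the final cluster index (0..k-1) for each point.
    public static int[] cluster(double[][] points, int k, int maxIter) {
        // Arbitrarily choose the first k instances as initial centers.
        double[][] centers = new double[k][2];
        for (int i = 0; i < k; i++) centers[i] = points[i].clone();

        int[] assign = new int[points.length];
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            // (Re)assign each instance to the nearest center.
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                double bestD = dist2(points[p], centers[0]);
                for (int c = 1; c < k; c++) {
                    double d = dist2(points[p], centers[c]);
                    if (d < bestD) { bestD = d; best = c; }
                }
                if (assign[p] != best) { assign[p] = best; changed = true; }
            }
            if (!changed && iter > 0) break;   // no change in the assignment
            // Update cluster means.
            double[][] sum = new double[k][2];
            int[] count = new int[k];
            for (int p = 0; p < points.length; p++) {
                sum[assign[p]][0] += points[p][0];
                sum[assign[p]][1] += points[p][1];
                count[assign[p]]++;
            }
            for (int c = 0; c < k; c++)
                if (count[c] > 0) {
                    centers[c][0] = sum[c][0] / count[c];
                    centers[c][1] = sum[c][1] / count[c];
                }
        }
        return assign;
    }

    // Squared Euclidean distance, the per-point term of the error criterion.
    private static double dist2(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }

    public static void main(String[] args) {
        double[][] data = { {22, 21}, {19, 20}, {18, 22}, {1, 3}, {3, 2} };
        // With this initialization: prints [0, 0, 0, 1, 1]
        System.out.println(Arrays.toString(cluster(data, 2, 100)));
    }
}
```

The three large-valued points end up in one cluster and the two small-valued points in the other, matching the grouping one would expect by inspection.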
11. K-medoids
• Input: the number of clusters K and a collection of n instances
• Output: a set of K clusters that minimizes the sum of the dissimilarities of all the instances to their nearest medoids
• Method:
– Arbitrarily choose K instances as the initial medoids
– Repeat
• (Re)assign each remaining instance to the cluster with the nearest medoid
• Randomly select a non-medoid instance Or
• Compute the total cost, S, of swapping a medoid Oj with Or
• If S < 0, then swap Oj with Or to form the new set of K medoids
– Until no change
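The medoid-swap idea can be sketched deterministically: instead of the random selection of Or described above, the version below tries every (Oj, Or) pair and accepts a swap whenever its cost change S is negative (a PAM-style variant; class and method names are our own illustration, and the data is the test case from slide 5).

```java
import java.util.Arrays;

// PAM-style k-medoids sketch: exhaustive swaps instead of random selection.
public class KMedoidsSketch {

    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    // Total cost: sum of dissimilarities of instances to their nearest medoid.
    static double cost(double[][] pts, int[] medoids) {
        double total = 0;
        for (double[] p : pts) {
            double best = Double.MAX_VALUE;
            for (int m : medoids) best = Math.min(best, dist(p, pts[m]));
            total += best;
        }
        return total;
    }

    // Returns the indices of the final medoids.
    public static int[] run(double[][] pts, int k) {
        int[] medoids = new int[k];
        for (int i = 0; i < k; i++) medoids[i] = i; // arbitrary initial medoids
        boolean changed = true;
        while (changed) {                           // repeat until no change
            changed = false;
            double current = cost(pts, medoids);
            for (int mi = 0; mi < k; mi++) {
                for (int o = 0; o < pts.length; o++) {
                    if (isMedoid(o, medoids)) continue;
                    int old = medoids[mi];
                    medoids[mi] = o;                        // tentative swap
                    double s = cost(pts, medoids) - current; // cost change S
                    if (s < 0) { current += s; changed = true; }
                    else medoids[mi] = old;                 // undo the swap
                }
            }
        }
        return medoids;
    }

    static boolean isMedoid(int o, int[] meds) {
        for (int m : meds) if (m == o) return true;
        return false;
    }

    public static void main(String[] args) {
        double[][] data = { {22, 21}, {19, 20}, {18, 22}, {1, 3}, {3, 2} };
        System.out.println(Arrays.toString(run(data, 2)));
    }
}
```

Because medoids are actual instances rather than means, a single far-off outlier cannot drag a cluster representative toward it, which is the robustness property noted in the comparison.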
15. Comparison
• Both algorithms produced the same clusters on the test data
• Both require the number of clusters K to be specified as input
• K-medoids is less influenced by outliers in the data
• Both methods assign each instance to exactly one cluster