SlideShare ist ein Scribd-Unternehmen logo
1 von 20
Clustering
Partitioning Methods
1
2
Major Clustering Approaches
 Partitioning approach:
 Construct k partitions (k <= n) and then evaluate them by some criterion, e.g.,
minimizing the sum of square errors
 Each group has at least one object, each object belongs to one group
 Iterative Relocation Technique
 Avoid Enumeration by storing the centroids
 Typical methods: k-means, k-medoids, CLARANS
 Hierarchical approach:
 Create a hierarchical decomposition of the set of data (or objects) using some
criterion
 Agglomerative Vs Divisive
 Rigid – Cannot undo
 Perform Analysis of linkages
 Integrate with iterative relocation
 Typical methods: Diana, Agnes, BIRCH
3
Major Clustering Approaches
 Density Based Methods
 Distance based methods – Spherical Clusters
 Density – For each data point within a given cluster the neighbourhood
should contain a minimum number of points
 DBSCAN, OPTICS
 Grid Based Methods
 Object space – finite number of cells forming grid structure
 Fast processing time
 Typical methods: STING, WaveCluster, CLIQUE
4
Major Clustering Approaches
 Model-based:
 A model is hypothesized for each of the clusters and tries to find the best fit
of that model to each other
 Typical methods: EM, COBWEB
 Frequent pattern-based:
 Based on the analysis of frequent patterns
 Typical methods: pCluster
 User-guided or constraint-based:
 Clustering by considering user-specified or application-specific constraints
 Typical methods: COD, constrained clustering
5
Partitioning Algorithms: Basic Concept
 Partitioning method: Construct a partition of a database D of n
objects into a set of k clusters, s.t., min sum of squared distance
 Given a k, find a partition of k clusters that optimizes the chosen
partitioning criterion
 Global optimal: exhaustively enumerate all partitions
 Heuristic methods: k-means and k-medoids algorithms
 k-means Each cluster is represented by the center of the cluster
 k-medoids or PAM (Partition around medoids) Each cluster is represented
by one of the objects in the cluster
2
1 )( iCip
k
i mpE −ΣΣ= ∈=
6
K-Means Clustering Method
 Given k, the k-means algorithm is implemented in 4
steps:
 Partition objects into k non-empty subsets
 Compute seed points as the centroids of the clusters of
the current partition. The centroid is the center (mean
point) of the cluster.
 Assign each object to the cluster with the nearest seed
point.
 Go back to Step 2, stop when no more new
assignment.
7
K-Means Clustering Method
 k-means algorithm is implemented as below:
 Input: Number of clusters – k, database of n objects
 Output: Set of k clusters that minimize the squared error
 Choose k objects as the initial cluster centers
 Repeat
 (Re)assign each object to the cluster to which the object is most
similar based on the mean value of the objects in the cluster
 Update the cluster means
Until no change
K-Means Clustering Method
8
9
K-Means Clustering Method
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
K=2
Arbitrarily choose K
object as initial
cluster center
Assign
each
objects
to most
similar
center
Update
the
cluster
means
Update
the
cluster
means
reassignreassign
10
K-Means Method
 Strength: Relatively efficient: O(tkn), where n is # objects, k is #
clusters, and t is # iterations. Normally, k, t << n.
 Comment: Often terminates at a local optimum. The global optimum
may be found using techniques such as: deterministic annealing and
genetic algorithms
 Weakness
 Applicable only when mean is defined – Categorical data
 Need to specify k, the number of clusters, in advance
 Unable to handle noisy data and outliers
 Not suitable to discover clusters with non-convex shapes
11
Variations of the K-Means Method
 A few variants of the k-means which differ in
 Selection of the initial k means
 Dissimilarity calculations
 Strategies to calculate cluster means
 Handling categorical data: k-modes
 Replacing means of clusters with modes
 Using new dissimilarity measures to deal with categorical objects
 A mixture of categorical and numerical data: k-prototype method
 Expectation Maximization
 Assigns objects to clusters based on the probability of membership
 Scalability of k-means
 Compressible, Discardable, To be maintained in main memory
 Clustering Features
12
Problem of the K-Means Method
 The k-means algorithm is sensitive to outliers
 Since an object with an extremely large value may substantially distort the
distribution of the data.
 K-Medoids: Instead of taking the mean value of the object in a cluster
as a reference point, medoids can be used, which is the most
centrally located object in a cluster.
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
8
9
10
0 1 2 3 4 5 6 7 8 9 10
13
K-Medoids Clustering Method
 PAM (Partitioning Around Medoids)
 starts from an initial set of medoids and iteratively replaces one of the
medoids by one of the non-medoids if it improves the total distance of
the resulting clustering
 All pairs are analyzed for replacement
 PAM works effectively for small data sets, but does not scale well for
large data sets
 CLARA
 CLARANS
14
K-Medoids
 Input: k, and database of n objects
 Output: A set of k clusters
 Method:
 Arbitrarily choose k objects as initial medoids
 Repeat
 Assign each remaining object to cluster with nearest medoid
 Randomly select a non-medoid orandom
 Compute cost S of swapping oj with orandom
 If S < 0 swap to form new set of k medoids
Until no change
Working Principle: Minimize sum of the dissimilarities between each object and its
corresponding reference point. That is, an absolute-error criterion is used
||1 jCjp
k
j opE −ΣΣ= ∈=
15
K-medoids
 Case 1: p currently belongs to medoid oj. If oj is replaced by orandom as a medoid and p is
closest to one of oi where i < > j then p is reassigned to oi.
 Case 2: p currently belongs to medoid oj. If oj is replaced by orandom as a medoid and p is
closest to orandom then p is reassigned to orandom.
 Case 3: p currently belongs to medoid oi (i< >j) If oj is replaced by orandom as a medoid
and p is still closest to oi assignment does not change
 Case 4: p currently belongs to medoid oi (i < > j). If oj is replaced by orandom as a medoid
and p is closest to orandom then p is reassigned to orandom.
16
K-medoids
 After reassignment difference in squared error E is
calculated. Total cost of swapping – Sum of costs
incurred by all non-medoid objects
 If total cost is negative, oj is replaced with orandom as E will
be reduced
K-medoids Algorithm
17
18
Problem with PAM
 PAM is more robust than k-means in the presence of
noise and outliers because a medoid is less influenced
by outliers or other extreme values than a mean
 PAM works efficiently for small data sets but does not
scale well for large data sets.
19
CLARA
 Clustering LARge Applications
 Choose a representative set of data
 Choose medoids from this
 Cluster
 Draw multiple such samples and apply PAM on each
 Returns best Clustering
 Effectiveness depends on Sample Size
20
CLARANS
 Clustering Large Applications based on RANdomized Search
 Uses Sampling and PAM
 Doesn’t restrict itself to any particular sample
 Performs a graph search with each node acting as a potential
solution-( k medoids)
 Clustering got after replacement – Neighbor
 Number of neighbors to be tried is limited
 Moves to better neighbour
 Silhouette Coefficient
 Complexity – O(n2
)

Weitere ähnliche Inhalte

Was ist angesagt?

Network security - OSI Security Architecture
Network security - OSI Security ArchitectureNetwork security - OSI Security Architecture
Network security - OSI Security ArchitectureBharathiKrishna6
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?Kazuki Yoshida
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & UnderfittingSOUMIT KAR
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining Sulman Ahmed
 
Decision Trees
Decision TreesDecision Trees
Decision TreesStudent
 
Association rule mining.pptx
Association rule mining.pptxAssociation rule mining.pptx
Association rule mining.pptxmaha797959
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data miningKrish_ver2
 
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsData Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsSalah Amean
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streamsKrish_ver2
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision treesKnoldus Inc.
 
Association rule mining
Association rule miningAssociation rule mining
Association rule miningAcad
 
2.5 backpropagation
2.5 backpropagation2.5 backpropagation
2.5 backpropagationKrish_ver2
 
knowledge representation using rules
knowledge representation using rulesknowledge representation using rules
knowledge representation using rulesHarini Balamurugan
 

Was ist angesagt? (20)

Bayes Belief Networks
Bayes Belief NetworksBayes Belief Networks
Bayes Belief Networks
 
Network security - OSI Security Architecture
Network security - OSI Security ArchitectureNetwork security - OSI Security Architecture
Network security - OSI Security Architecture
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & Underfitting
 
Clusters techniques
Clusters techniquesClusters techniques
Clusters techniques
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Association rule mining.pptx
Association rule mining.pptxAssociation rule mining.pptx
Association rule mining.pptx
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data mining
 
Clustering
ClusteringClustering
Clustering
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsData Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 
Association rule mining
Association rule miningAssociation rule mining
Association rule mining
 
2.5 backpropagation
2.5 backpropagation2.5 backpropagation
2.5 backpropagation
 
knowledge representation using rules
knowledge representation using rulesknowledge representation using rules
knowledge representation using rules
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 

Andere mochten auch

USE OF DATA MINING IN BANKING SECTOR
USE OF DATA MINING IN BANKING SECTORUSE OF DATA MINING IN BANKING SECTOR
USE OF DATA MINING IN BANKING SECTORarpit bhadoriya
 
How I data mined my text message history
How I data mined my text message historyHow I data mined my text message history
How I data mined my text message historyJoe Cannatti Jr.
 
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kambererror007
 
Odam: Open Data, Access and Mining
Odam: Open Data, Access and MiningOdam: Open Data, Access and Mining
Odam: Open Data, Access and MiningDaniel JACOB
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsSalah Amean
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining ConceptsDung Nguyen
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDatamining Tools
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsJustin Cletus
 
Application of data mining
Application of data miningApplication of data mining
Application of data miningSHIVANI SONI
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretizationKrish_ver2
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Data cube computation
Data cube computationData cube computation
Data cube computationRashmi Sheikh
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysisDataminingTools Inc
 
Support Vector Machines for Classification
Support Vector Machines for ClassificationSupport Vector Machines for Classification
Support Vector Machines for ClassificationPrakash Pimpale
 
neural network
neural networkneural network
neural networkSTUDENT
 
Data mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesData mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesSaif Ullah
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 

Andere mochten auch (20)

USE OF DATA MINING IN BANKING SECTOR
USE OF DATA MINING IN BANKING SECTORUSE OF DATA MINING IN BANKING SECTOR
USE OF DATA MINING IN BANKING SECTOR
 
How I data mined my text message history
How I data mined my text message historyHow I data mined my text message history
How I data mined my text message history
 
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
 
Odam: Open Data, Access and Mining
Odam: Open Data, Access and MiningOdam: Open Data, Access and Mining
Odam: Open Data, Access and Mining
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and Correlations
 
Application of data mining
Application of data miningApplication of data mining
Application of data mining
 
Data visualization
Data visualizationData visualization
Data visualization
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretization
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Data cube computation
Data cube computationData cube computation
Data cube computation
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 
Support Vector Machines for Classification
Support Vector Machines for ClassificationSupport Vector Machines for Classification
Support Vector Machines for Classification
 
neural network
neural networkneural network
neural network
 
Data Mining: Association Rules Basics
Data Mining: Association Rules BasicsData Mining: Association Rules Basics
Data Mining: Association Rules Basics
 
Data mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesData mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniques
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data mining
Data miningData mining
Data mining
 

Ähnlich wie 3.2 partitioning methods

Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptSubrata Kumer Paul
 
Advanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsAdvanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsNithyananthSengottai
 
10 clusbasic
10 clusbasic10 clusbasic
10 clusbasicengrasi
 
data mining cocepts and techniques chapter
data mining cocepts and techniques chapterdata mining cocepts and techniques chapter
data mining cocepts and techniques chapterNaveenKumar5162
 
data mining cocepts and techniques chapter
data mining cocepts and techniques chapterdata mining cocepts and techniques chapter
data mining cocepts and techniques chapterNaveenKumar5162
 
multiarmed bandit.ppt
multiarmed bandit.pptmultiarmed bandit.ppt
multiarmed bandit.pptLPrashanthi
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Salah Amean
 
Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10mqasimsheikh5
 
Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringIJCSIS Research Publications
 
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...butest
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfSowmyaJyothi3
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithmijsrd.com
 

Ähnlich wie 3.2 partitioning methods (20)

Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
 
10 clusbasic
10 clusbasic10 clusbasic
10 clusbasic
 
CLUSTERING
CLUSTERINGCLUSTERING
CLUSTERING
 
UNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptxUNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptx
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
 
Advanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsAdvanced database and data mining & clustering concepts
Advanced database and data mining & clustering concepts
 
10 clusbasic
10 clusbasic10 clusbasic
10 clusbasic
 
data mining cocepts and techniques chapter
data mining cocepts and techniques chapterdata mining cocepts and techniques chapter
data mining cocepts and techniques chapter
 
data mining cocepts and techniques chapter
data mining cocepts and techniques chapterdata mining cocepts and techniques chapter
data mining cocepts and techniques chapter
 
multiarmed bandit.ppt
multiarmed bandit.pptmultiarmed bandit.ppt
multiarmed bandit.ppt
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
 
ClusetrigBasic.ppt
ClusetrigBasic.pptClusetrigBasic.ppt
ClusetrigBasic.ppt
 
Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10
 
Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means Clustering
 
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
CC282 Unsupervised Learning (Clustering) Lecture 7 slides for ...
 
Neural nw k means
Neural nw k meansNeural nw k means
Neural nw k means
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdf
 
k-mean-clustering.pdf
k-mean-clustering.pdfk-mean-clustering.pdf
k-mean-clustering.pdf
 
Clustering
ClusteringClustering
Clustering
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
 

Mehr von Krish_ver2

5.5 back tracking
5.5 back tracking5.5 back tracking
5.5 back trackingKrish_ver2
 
5.5 back track
5.5 back track5.5 back track
5.5 back trackKrish_ver2
 
5.5 back tracking 02
5.5 back tracking 025.5 back tracking 02
5.5 back tracking 02Krish_ver2
 
5.4 randomized datastructures
5.4 randomized datastructures5.4 randomized datastructures
5.4 randomized datastructuresKrish_ver2
 
5.4 randomized datastructures
5.4 randomized datastructures5.4 randomized datastructures
5.4 randomized datastructuresKrish_ver2
 
5.4 randamized algorithm
5.4 randamized algorithm5.4 randamized algorithm
5.4 randamized algorithmKrish_ver2
 
5.3 dynamic programming 03
5.3 dynamic programming 035.3 dynamic programming 03
5.3 dynamic programming 03Krish_ver2
 
5.3 dynamic programming
5.3 dynamic programming5.3 dynamic programming
5.3 dynamic programmingKrish_ver2
 
5.3 dyn algo-i
5.3 dyn algo-i5.3 dyn algo-i
5.3 dyn algo-iKrish_ver2
 
5.2 divede and conquer 03
5.2 divede and conquer 035.2 divede and conquer 03
5.2 divede and conquer 03Krish_ver2
 
5.2 divide and conquer
5.2 divide and conquer5.2 divide and conquer
5.2 divide and conquerKrish_ver2
 
5.2 divede and conquer 03
5.2 divede and conquer 035.2 divede and conquer 03
5.2 divede and conquer 03Krish_ver2
 
5.1 greedyyy 02
5.1 greedyyy 025.1 greedyyy 02
5.1 greedyyy 02Krish_ver2
 
4.4 hashing ext
4.4 hashing  ext4.4 hashing  ext
4.4 hashing extKrish_ver2
 
4.4 external hashing
4.4 external hashing4.4 external hashing
4.4 external hashingKrish_ver2
 

Mehr von Krish_ver2 (20)

5.5 back tracking
5.5 back tracking5.5 back tracking
5.5 back tracking
 
5.5 back track
5.5 back track5.5 back track
5.5 back track
 
5.5 back tracking 02
5.5 back tracking 025.5 back tracking 02
5.5 back tracking 02
 
5.4 randomized datastructures
5.4 randomized datastructures5.4 randomized datastructures
5.4 randomized datastructures
 
5.4 randomized datastructures
5.4 randomized datastructures5.4 randomized datastructures
5.4 randomized datastructures
 
5.4 randamized algorithm
5.4 randamized algorithm5.4 randamized algorithm
5.4 randamized algorithm
 
5.3 dynamic programming 03
5.3 dynamic programming 035.3 dynamic programming 03
5.3 dynamic programming 03
 
5.3 dynamic programming
5.3 dynamic programming5.3 dynamic programming
5.3 dynamic programming
 
5.3 dyn algo-i
5.3 dyn algo-i5.3 dyn algo-i
5.3 dyn algo-i
 
5.2 divede and conquer 03
5.2 divede and conquer 035.2 divede and conquer 03
5.2 divede and conquer 03
 
5.2 divide and conquer
5.2 divide and conquer5.2 divide and conquer
5.2 divide and conquer
 
5.2 divede and conquer 03
5.2 divede and conquer 035.2 divede and conquer 03
5.2 divede and conquer 03
 
5.1 greedyyy 02
5.1 greedyyy 025.1 greedyyy 02
5.1 greedyyy 02
 
5.1 greedy
5.1 greedy5.1 greedy
5.1 greedy
 
5.1 greedy 03
5.1 greedy 035.1 greedy 03
5.1 greedy 03
 
4.4 hashing02
4.4 hashing024.4 hashing02
4.4 hashing02
 
4.4 hashing
4.4 hashing4.4 hashing
4.4 hashing
 
4.4 hashing ext
4.4 hashing  ext4.4 hashing  ext
4.4 hashing ext
 
4.4 external hashing
4.4 external hashing4.4 external hashing
4.4 external hashing
 
4.2 bst
4.2 bst4.2 bst
4.2 bst
 

Kürzlich hochgeladen

FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxPooja Bhuva
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 

Kürzlich hochgeladen (20)

FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 

3.2 partitioning methods

  • 2. 2 Major Clustering Approaches  Partitioning approach:  Construct k partitions (k <= n) and then evaluate them by some criterion, e.g., minimizing the sum of square errors  Each group has at least one object, each object belongs to one group  Iterative Relocation Technique  Avoid Enumeration by storing the centroids  Typical methods: k-means, k-medoids, CLARANS  Hierarchical approach:  Create a hierarchical decomposition of the set of data (or objects) using some criterion  Agglomerative Vs Divisive  Rigid – Cannot undo  Perform Analysis of linkages  Integrate with iterative relocation  Typical methods: Diana, Agnes, BIRCH
  • 3. 3 Major Clustering Approaches  Density Based Methods  Distance based methods – Spherical Clusters  Density – For each data point within a given cluster the neighbourhood should contain a minimum number of points  DBSCAN, OPTICS  Grid Based Methods  Object space – finite number of cells forming grid structure  Fast processing time  Typical methods: STING, WaveCluster, CLIQUE
  • 4. 4 Major Clustering Approaches  Model-based:  A model is hypothesized for each of the clusters and tries to find the best fit of that model to each other  Typical methods: EM, COBWEB  Frequent pattern-based:  Based on the analysis of frequent patterns  Typical methods: pCluster  User-guided or constraint-based:  Clustering by considering user-specified or application-specific constraints  Typical methods: COD, constrained clustering
  • 5. 5 Partitioning Algorithms: Basic Concept  Partitioning method: Construct a partition of a database D of n objects into a set of k clusters, s.t., min sum of squared distance  Given a k, find a partition of k clusters that optimizes the chosen partitioning criterion  Global optimal: exhaustively enumerate all partitions  Heuristic methods: k-means and k-medoids algorithms  k-means Each cluster is represented by the center of the cluster  k-medoids or PAM (Partition around medoids) Each cluster is represented by one of the objects in the cluster 2 1 )( iCip k i mpE −ΣΣ= ∈=
  • 6. 6 K-Means Clustering Method  Given k, the k-means algorithm is implemented in 4 steps:  Partition objects into k non-empty subsets  Compute seed points as the centroids of the clusters of the current partition. The centroid is the center (mean point) of the cluster.  Assign each object to the cluster with the nearest seed point.  Go back to Step 2, stop when no more new assignment.
  • 7. 7 K-Means Clustering Method  k-means algorithm is implemented as below:  Input: Number of clusters – k, database of n objects  Output: Set of k clusters that minimize the squared error  Choose k objects as the initial cluster centers  Repeat  (Re)assign each object to the cluster to which the object is most similar based on the mean value of the objects in the cluster  Update the cluster means Until no change
  • 9. 9 K-Means Clustering Method 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 K=2 Arbitrarily choose K object as initial cluster center Assign each objects to most similar center Update the cluster means Update the cluster means reassignreassign
  • 10. 10 K-Means Method  Strength: Relatively efficient: O(tkn), where n is # objects, k is # clusters, and t is # iterations. Normally, k, t << n.  Comment: Often terminates at a local optimum. The global optimum may be found using techniques such as: deterministic annealing and genetic algorithms  Weakness  Applicable only when mean is defined – Categorical data  Need to specify k, the number of clusters, in advance  Unable to handle noisy data and outliers  Not suitable to discover clusters with non-convex shapes
  • 11. 11 Variations of the K-Means Method  A few variants of the k-means which differ in  Selection of the initial k means  Dissimilarity calculations  Strategies to calculate cluster means  Handling categorical data: k-modes  Replacing means of clusters with modes  Using new dissimilarity measures to deal with categorical objects  A mixture of categorical and numerical data: k-prototype method  Expectation Maximization  Assigns objects to clusters based on the probability of membership  Scalability of k-means  Compressible, Discardable, To be maintained in main memory  Clustering Features
  • 12. 12 Problem of the K-Means Method  The k-means algorithm is sensitive to outliers  Since an object with an extremely large value may substantially distort the distribution of the data.  K-Medoids: Instead of taking the mean value of the object in a cluster as a reference point, medoids can be used, which is the most centrally located object in a cluster. 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10
  • 13. 13 K-Medoids Clustering Method  PAM (Partitioning Around Medoids)  starts from an initial set of medoids and iteratively replaces one of the medoids by one of the non-medoids if it improves the total distance of the resulting clustering  All pairs are analyzed for replacement  PAM works effectively for small data sets, but does not scale well for large data sets  CLARA  CLARANS
  • 14. 14 K-Medoids  Input: k, and database of n objects  Output: A set of k clusters  Method:  Arbitrarily choose k objects as initial medoids  Repeat  Assign each remaining object to cluster with nearest medoid  Randomly select a non-medoid orandom  Compute cost S of swapping oj with orandom  If S < 0 swap to form new set of k medoids Until no change Working Principle: Minimize sum of the dissimilarities between each object and its corresponding reference point. That is, an absolute-error criterion is used ||1 jCjp k j opE −ΣΣ= ∈=
  • 15. 15 K-medoids  Case 1: p currently belongs to medoid oj. If oj is replaced by orandom as a medoid and p is closest to one of oi where i < > j then p is reassigned to oi.  Case 2: p currently belongs to medoid oj. If oj is replaced by orandom as a medoid and p is closest to orandom then p is reassigned to orandom.  Case 3: p currently belongs to medoid oi (i< >j) If oj is replaced by orandom as a medoid and p is still closest to oi assignment does not change  Case 4: p currently belongs to medoid oi (i < > j). If oj is replaced by orandom as a medoid and p is closest to orandom then p is reassigned to orandom.
  • 16. 16 K-medoids  After reassignment difference in squared error E is calculated. Total cost of swapping – Sum of costs incurred by all non-medoid objects  If total cost is negative, oj is replaced with orandom as E will be reduced
  • 18. 18 Problem with PAM  PAM is more robust than k-means in the presence of noise and outliers because a medoid is less influenced by outliers or other extreme values than a mean  PAM works efficiently for small data sets but does not scale well for large data sets.
  • 19. 19 CLARA  Clustering LARge Applications  Choose a representative set of data  Choose medoids from this  Cluster  Draw multiple such samples and apply PAM on each  Returns best Clustering  Effectiveness depends on Sample Size
  • 20. 20 CLARANS  Clustering Large Applications based on RANdomized Search  Uses Sampling and PAM  Doesn’t restrict itself to any particular sample  Performs a graph search with each node acting as a potential solution-( k medoids)  Clustering got after replacement – Neighbor  Number of neighbors to be tried is limited  Moves to better neighbour  Silhouette Coefficient  Complexity – O(n2 )