CLUSTER ANALYSIS
DR ATHAR KHAN
LIAQUAT COLLEGE OF MEDICINE & DENTISTRY
matharm@yahoo.com
4/17/2020 DR ATHAR KHAN 2
DEFINITION
• Cluster Analysis is a way of grouping cases of data
based on the similarity of responses to several
variables.
▪ The fundamental problem that clustering addresses is how to
divide the data into meaningful groups (clusters).
▪ Factor Analysis: groups variables together
▪ Cluster Analysis: groups cases
[Figure: three groups labelled Cluster 1, Cluster 2 and Cluster 3]
Unsupervised learning is a machine learning technique in which you do not need to
supervise the model. Instead, the model works on its own to discover information:
there is only input data (X) and no corresponding output variables.
Types of Data
▪ The data used in cluster analysis can be interval,
ordinal or categorical.
▪ However, having a mixture of different types of
variables will make the analysis more complicated.
▪ This is because in cluster analysis you need to have
some way of measuring the distance between
observations and the type of measure used will
depend on what type of data you have.
Measures of Distance
▪ A number of different algorithms have been proposed
for clustering categorical data:
▪ K-Means algorithm for categorical data, ROCK, LIMBO,
CLICKS, Ward’s agglomerative algorithm
▪ Among hierarchical clustering algorithms, Ward’s is the most widely used.
▪ For interval data, the most widely used measure of the
distance between objects is Euclidean distance.
Euclidean Distance, d
Euclidean distance is the geometric distance
between two objects (or cases). Therefore, if we
were to call George subject i and Zippy subject j,
then we could express their Euclidean distance in
terms of the following equation:
$$d_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2}$$
where x_ik is the score of subject i on variable k and n is the number of variables.
With Euclidean distances, the smaller the distance, the
more similar the cases.
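As a rough illustration of the distance described above (the square root of the summed squared differences across variables), here is a minimal Python sketch. The scores given to George and Zippy are made-up values, not data from the slides.

```python
import math

def euclidean_distance(case_i, case_j):
    """Geometric distance between two cases measured on the same variables."""
    if len(case_i) != len(case_j):
        raise ValueError("both cases need a score on every variable")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(case_i, case_j)))

# Hypothetical scores on three variables (illustrative values only).
george = [3.0, 4.0, 0.0]
zippy = [0.0, 0.0, 0.0]

d = euclidean_distance(george, zippy)
print(d)  # the smaller this value, the more similar the two cases
```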
Measures of Distance
▪ When using a measure such as the Euclidean
distance, the scale of measurement of the variables
under consideration is an issue, as changing the scale
will obviously affect the distance between subjects
(e.g. a difference of 10 cm could equally be expressed as a difference of
100 mm).
▪ To get around this problem each variable can be
standardized (converted to z-scores).
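The scale problem and the z-score fix can be sketched in a few lines of Python. The heights and weights below are hypothetical, and the sample standard deviation is assumed as the divisor; other software may use a slightly different convention.

```python
import math
import statistics

def z_scores(values):
    """Standardize one variable: subtract the mean, divide by the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_first_two(height, weight):
    """Distance between subject 1 and subject 2 on (height, weight)."""
    cases = list(zip(height, weight))
    return euclidean_distance(cases[0], cases[1])

# Hypothetical data: the same heights recorded in cm and in mm, plus weight in kg.
height_cm = [160.0, 170.0, 180.0]
height_mm = [h * 10 for h in height_cm]
weight_kg = [55.0, 70.0, 65.0]

raw_cm = distance_first_two(height_cm, weight_kg)
raw_mm = distance_first_two(height_mm, weight_kg)
std_cm = distance_first_two(z_scores(height_cm), z_scores(weight_kg))
std_mm = distance_first_two(z_scores(height_mm), z_scores(weight_kg))

print(raw_cm, raw_mm)  # raw distances differ when only the unit changes
print(std_cm, std_mm)  # standardized distances agree
```

After standardizing, the choice of unit no longer changes the distance, so no single variable dominates just because it is recorded on a larger scale.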
Approaches to Cluster Analysis
▪ There are a number of different methods that can be
used to carry out a cluster analysis:
▪ Hierarchical methods
▪ – Agglomerative methods
▪ – Divisive methods
▪ Non-hierarchical methods (often known as k-means
clustering methods)
Agglomerative Methods
▪ Agglomerative clustering is a bottom-up technique: it starts by
considering each data point as its own cluster and then
merges clusters together into larger groups, from the
bottom up, until there is a single giant cluster.
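The bottom-up procedure can be sketched as a short loop. This is only an illustration: it assumes Euclidean distance and the nearest-neighbour rule for the distance between clusters, and the five points are hypothetical. Statistical packages use far more efficient implementations.

```python
import math
from itertools import combinations

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour(points, c1, c2):
    """Distance between two clusters = distance between their closest members."""
    return min(euclidean_distance(points[i], points[j]) for i in c1 for j in c2)

def agglomerate(points):
    """Start with one cluster per case and merge the closest pair until one remains."""
    clusters = [[i] for i in range(len(points))]
    schedule = []  # one (cluster, cluster, distance) entry per stage
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: nearest_neighbour(points, clusters[p[0]], clusters[p[1]]),
        )
        coefficient = nearest_neighbour(points, clusters[a], clusters[b])
        schedule.append((clusters[a], clusters[b], coefficient))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return schedule

# Hypothetical cases measured on two variables.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5), (20.0, 0.0)]
schedule = agglomerate(points)
for stage, (left, right, coefficient) in enumerate(schedule, start=1):
    print(stage, left, right, coefficient)
```

With n cases there are always n - 1 merge stages, and the distance at which each merge happens grows as less similar clusters are joined.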
Divisive Clustering
▪ Divisive clustering is the opposite: it starts with one
cluster, which is then divided in two as a function of the
similarities or distances in the data. These new clusters
are then divided, and so on until each case is a cluster.
Agglomerative methods are used more often than divisive methods.
Hierarchical agglomerative methods
Within this approach to cluster analysis there are a number of different
methods used to determine which clusters should be joined at each stage.
Linkage Function/Creating the Clusters
Nearest neighbour method (single linkage method)
In this method the distance between two clusters is defined to be the distance
between the two closest members, or neighbours.
Furthest neighbour method (complete linkage method)
In this case the distance between two clusters is defined to be the maximum
distance between members — i.e. the distance between the two subjects that
are furthest apart.
Average (between groups) linkage method (sometimes referred to as
UPGMA)
The distance between two clusters is calculated as the average distance
between all pairs of subjects in the two clusters.
Centroid Method
Here the centroid (mean value for each variable) of each cluster is calculated
and the distance between centroids is used. Clusters whose centroids are
closest together are merged.
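The four linkage rules described above differ only in how they turn case-to-case distances into a cluster-to-cluster distance. A minimal sketch, assuming Euclidean distance and two small hypothetical clusters:

```python
import math
from itertools import product

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(c1, c2):
    """Nearest neighbour: distance between the two closest members."""
    return min(euclidean_distance(a, b) for a, b in product(c1, c2))

def complete_linkage(c1, c2):
    """Furthest neighbour: distance between the two members furthest apart."""
    return max(euclidean_distance(a, b) for a, b in product(c1, c2))

def average_linkage(c1, c2):
    """Average (between groups): mean distance over all cross-cluster pairs."""
    pairs = [euclidean_distance(a, b) for a, b in product(c1, c2)]
    return sum(pairs) / len(pairs)

def centroid(cluster):
    """Mean value of each variable within the cluster."""
    return [sum(column) / len(cluster) for column in zip(*cluster)]

def centroid_linkage(c1, c2):
    """Distance between the two cluster centroids."""
    return euclidean_distance(centroid(c1), centroid(c2))

# Hypothetical clusters of cases measured on two variables.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]

for rule in (single_linkage, complete_linkage, average_linkage, centroid_linkage):
    print(rule.__name__, rule(cluster_a, cluster_b))
```

For the same pair of clusters the single-linkage distance is never larger than the average-linkage distance, which in turn is never larger than the complete-linkage distance, which is one reason the methods can produce different merge orders.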
Ward’s Method
▪ In this method all possible pairs of clusters are combined and
the sum of the squared distances within each cluster is
calculated.
▪ This is then summed over all clusters.
▪ The combination that gives the lowest sum of squares is
chosen.
▪ The aim in Ward’s method is to join cases into clusters such
that the variance within a cluster is minimised.
▪ To be more precise, two clusters are merged if this merger
results in the minimum increase in the error sum of squares.
▪ Ward’s method is the most popular of these methods.
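One way to picture Ward's criterion is to compute, for every pair of clusters, how much the error sum of squares would grow if that pair were merged, and then pick the pair with the smallest increase. The sketch below does exactly that for one step; the clusters are hypothetical and the function names are illustrative, not taken from any package.

```python
from itertools import combinations

def centroid(cluster):
    """Mean value of each variable within the cluster."""
    return [sum(column) / len(cluster) for column in zip(*cluster)]

def error_sum_of_squares(cluster):
    """Sum of squared distances of every case from its cluster centroid."""
    centre = centroid(cluster)
    return sum(
        sum((x - c) ** 2 for x, c in zip(case, centre))
        for case in cluster
    )

def ward_increase(c1, c2):
    """Growth in the error sum of squares caused by merging c1 and c2."""
    merged = c1 + c2
    return (error_sum_of_squares(merged)
            - error_sum_of_squares(c1)
            - error_sum_of_squares(c2))

def best_ward_merge(clusters):
    """Indices of the pair of clusters whose merger increases the ESS least."""
    return min(
        combinations(range(len(clusters)), 2),
        key=lambda p: ward_increase(clusters[p[0]], clusters[p[1]]),
    )

# Hypothetical clusters (lists of cases measured on two variables).
clusters = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(0.0, 3.0)],
    [(10.0, 10.0)],
]
print(best_ward_merge(clusters))
```

Because only the increase caused by the merger differs between candidate pairs, minimising that increase is equivalent to minimising the total within-cluster sum of squares after the merge.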
Selecting the optimum number of clusters
▪ Once the cluster analysis has been carried out it is then necessary to
select the ’best’ cluster solution.
▪ This involves weighing the number of clusters against the within-cluster variances.
Dendrogram
[Figure: dendrogram with labels 1 to 4]
In the dendrogram above, the height at which
branches join indicates the order in which the
clusters were joined.
Dendrograms cannot tell you how many clusters
you should have.
Data Preparation
• To perform a cluster analysis, generally, the data
should be prepared as follows:
• Any missing value in the data must be removed or
estimated.
• The data must be standardized (converted to z-scores).
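The two preparation steps can be sketched as follows. This assumes missing values are stored as None, and uses column means as one simple way of estimating them; the data are hypothetical and other estimation methods are equally possible.

```python
import statistics

def drop_incomplete(rows):
    """Option 1: remove every case that has a missing value."""
    return [row for row in rows if all(v is not None for v in row)]

def impute_column_means(rows):
    """Option 2: estimate each missing value by the mean of its variable."""
    columns = list(zip(*rows))
    means = [statistics.mean([v for v in col if v is not None]) for col in columns]
    return [[means[k] if v is None else v for k, v in enumerate(row)] for row in rows]

def standardize(rows):
    """Convert every variable to z-scores (sample SD assumed as the divisor)."""
    columns = list(zip(*rows))
    stats = [(statistics.mean(col), statistics.stdev(col)) for col in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

# Hypothetical data: three cases, two variables, one missing value.
rows = [[1.0, 10.0], [2.0, None], [3.0, 30.0]]

complete_cases = drop_incomplete(rows)
imputed = impute_column_means(rows)
prepared = standardize(imputed)
print(complete_cases)
print(imputed)
print(prepared)
```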
Limitations of Cluster Analysis
• There are several things to be aware of when conducting
cluster analysis:
– The different methods of clustering usually give very different results.
This occurs because of the different criteria for merging clusters
(including cases). It is important to think carefully about which method
is best for what you are interested in looking at.
– With the exception of single linkage, the results will be affected by
the way in which the variables are ordered.
– The analysis is not stable when cases are dropped: this occurs because
selection of a case (or merger of clusters) depends on similarity of one
case to the cluster.
Example: Clustering Psychiatric Cases
• Imagine we wanted to look at clusters of cases
referred for psychiatric treatment.
• We measured each subject on four questionnaires:
the Spielberger Trait Anxiety Inventory (STAI), the Beck
Depression Inventory (BDI), a measure of Intrusive
Thoughts and Rumination (IT) and a measure of
Impulsive Thoughts and Actions (Impulse).
• The rationale behind this analysis is that people with
the same disorder should report a similar pattern of
scores across the measures (so the profiles of their
responses should be similar)
Video : Hierarchical Clustering : Agglomerative Clustering and
Divisive Clustering
https://www.youtube.com/watch?v=7enWesSofhg
Agglomeration schedule: shows how the clusters are combined at each stage.
Stage 1: Cases 1 and 4 have the smallest distance ("Coefficients" = .168) => first cluster {1,4}
Stage 2: Cases 10 and 12 have the second smallest distance => second cluster {10,12}
The next part of the table shows the stage at which each cluster first appears.
In stage 6, cluster 1 is the cluster that was formed in stage 1...
First cluster {1,4} is merged with case 13 in stage 6 ("Next Stage") => Cluster {1,4,13}
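A value of 0 in the part of the table showing the stage at which each cluster first appears means the case is joining a cluster for the first time.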
▪ The Coefficients column indicates the distance between the two clusters (or
cases) joined at each stage.
▪ The values here depend on the proximity measure and linkage method used
in the analysis.
▪ For a good cluster solution, you will see a sudden jump in the distance
coefficient as you read down the table.
▪ The stage before the sudden change indicates the optimal stopping point for
merging clusters.
(The last three stages of the schedule leave 3 clusters, 2 clusters and 1 cluster, respectively.)
NUMBER OF CLUSTERS
▪ Number of cases: 15
▪ Step of ‘elbow’: 12
▪ Number of clusters: 15 – 12 = 3
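The arithmetic above (number of cases minus the stage of the elbow) can be sketched in Python. The coefficient values below are hypothetical, chosen only so that the largest jump follows stage 12 as in the slide, and "largest jump" is taken here to mean the largest absolute increase between consecutive stages, which is one of several reasonable readings.

```python
def clusters_from_elbow(coefficients, n_cases):
    """Return (elbow stage, number of clusters) from agglomeration coefficients.

    The elbow is the stage just before the largest increase in the coefficient;
    stopping there leaves n_cases - stage clusters.
    """
    if len(coefficients) != n_cases - 1:
        raise ValueError("expected one coefficient per merge stage (n_cases - 1)")
    jumps = [later - earlier for earlier, later in zip(coefficients, coefficients[1:])]
    largest = max(range(len(jumps)), key=jumps.__getitem__)
    elbow_stage = largest + 1  # stages are numbered from 1
    return elbow_stage, n_cases - elbow_stage

# Hypothetical coefficients for 15 cases (14 stages); illustrative values only.
coefficients = [0.168, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7,
                0.8, 0.9, 1.0, 1.2, 1.5, 10.0, 16.0]

stage, n_clusters = clusters_from_elbow(coefficients, n_cases=15)
print(stage, n_clusters)
```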
Select the Coefficients column.
Scree Plot
[Figure: scree plot of the coefficients (vertical axis, .000 to 20.000) against stage (horizontal axis, 1 to 14)]
The dendrogram (or "tree diagram") shows relative similarities between cases.
▪ Notice how the "branches" merge together as you look from left to right in the
dendrogram.
▪ Cases or clusters that are joined by lines "further down" the tree (near the left side
of the dendrogram) are very similar.
▪ Cases or clusters that are joined by lines "further up" the tree (near the right side)
are dissimilar.
▪ Cluster distances are rescaled so that they range from 0 to 25 in this plot.
▪ Drawing a cut line across the dendrogram would identify 3 clusters (shown in green in the figure), one for each point where a branch intersects
the line.
▪ By considering different cut points for the line, we can get solutions with different
numbers of clusters.
▪ A good cluster solution is one with small within-cluster distances but large between-cluster
distances.
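The idea of small within-cluster and large between-cluster distances can be checked numerically. The sketch below assumes Euclidean distance and uses the mean pairwise distance as the summary; both the summary and the two clusters are illustrative choices rather than anything prescribed by the slides.

```python
import math
from itertools import combinations, product

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_within_distance(cluster):
    """Average distance between all pairs of cases inside one cluster.

    Needs at least two cases in the cluster.
    """
    pairs = [euclidean_distance(a, b) for a, b in combinations(cluster, 2)]
    return sum(pairs) / len(pairs)

def mean_between_distance(c1, c2):
    """Average distance between cases in one cluster and cases in the other."""
    pairs = [euclidean_distance(a, b) for a, b in product(c1, c2)]
    return sum(pairs) / len(pairs)

# Hypothetical two-cluster solution.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

within_a = mean_within_distance(cluster_a)
within_b = mean_within_distance(cluster_b)
between = mean_between_distance(cluster_a, cluster_b)
print(within_a, within_b, between)
```

A solution in which the between-cluster figure is much larger than both within-cluster figures matches the description of a good cluster solution given above.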
▪ Choose the number of clusters found just before the largest increase in heterogeneity.
[Figure: dendrogram with clusters 1 to 3 marked; axis: standardized distance]
▪ This table shows cluster membership for each case, according to the
number of clusters you requested.
▪ You can attempt to interpret the clusters by observing which cases are
grouped together.
▪ Having eyeballed the dendrogram and decided how many
clusters are present, it is possible to re-run the analysis, asking
SPSS to save a new variable in which cluster codes are assigned
to cases (with the researcher specifying the number of clusters
in the data).
▪ For these data, we saw three clear clusters and so we could re-
run the analysis asking for cluster group codings for three
clusters (in fact, I told you to do this as part of the original
analysis).
▪ The output below shows the resulting codes for each case in this
analysis. It’s pretty clear that these codes map exactly onto the
DSM-IV classifications.
DR ATHAR KHAN
MBBS, MCPS, DPH, DCPS-HCSM, DCPS-HPE, MBA, PGD-
STATISTICS, CCRP
ASSOCIATE PROFESSOR
DEPARTMENT OF COMMUNITY MEDICINE
LIAQUAT COLLEGE OF MEDICINE & DENTISTRY
KARACHI, PAKISTAN
0092-3232135932

Bandra East [ best call girls in Mumbai Get 50% Off On VIP Escorts Service 90...Bandra East [ best call girls in Mumbai Get 50% Off On VIP Escorts Service 90...
Bandra East [ best call girls in Mumbai Get 50% Off On VIP Escorts Service 90...
 
Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...
Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...
Race Course Road } Book Call Girls in Bangalore | Whatsapp No 6378878445 VIP ...
 
Intramuscular & Intravenous Injection.pptx
Intramuscular & Intravenous Injection.pptxIntramuscular & Intravenous Injection.pptx
Intramuscular & Intravenous Injection.pptx
 
❤️Amritsar Escorts Service☎️9815674956☎️ Call Girl service in Amritsar☎️ Amri...
❤️Amritsar Escorts Service☎️9815674956☎️ Call Girl service in Amritsar☎️ Amri...❤️Amritsar Escorts Service☎️9815674956☎️ Call Girl service in Amritsar☎️ Amri...
❤️Amritsar Escorts Service☎️9815674956☎️ Call Girl service in Amritsar☎️ Amri...
 
👉Chandigarh Call Girl Service📲Niamh 8868886958 📲Book 24hours Now📲👉Sexy Call G...
👉Chandigarh Call Girl Service📲Niamh 8868886958 📲Book 24hours Now📲👉Sexy Call G...👉Chandigarh Call Girl Service📲Niamh 8868886958 📲Book 24hours Now📲👉Sexy Call G...
👉Chandigarh Call Girl Service📲Niamh 8868886958 📲Book 24hours Now📲👉Sexy Call G...
 
Control of Local Blood Flow: acute and chronic
Control of Local Blood Flow: acute and chronicControl of Local Blood Flow: acute and chronic
Control of Local Blood Flow: acute and chronic
 
Kolkata Call Girls Naktala 💯Call Us 🔝 8005736733 🔝 💃 Top Class Call Girl Se...
Kolkata Call Girls Naktala  💯Call Us 🔝 8005736733 🔝 💃  Top Class Call Girl Se...Kolkata Call Girls Naktala  💯Call Us 🔝 8005736733 🔝 💃  Top Class Call Girl Se...
Kolkata Call Girls Naktala 💯Call Us 🔝 8005736733 🔝 💃 Top Class Call Girl Se...
 
Call Girl In Indore 📞9235973566📞 Just📲 Call Inaaya Indore Call Girls Service ...
Call Girl In Indore 📞9235973566📞 Just📲 Call Inaaya Indore Call Girls Service ...Call Girl In Indore 📞9235973566📞 Just📲 Call Inaaya Indore Call Girls Service ...
Call Girl In Indore 📞9235973566📞 Just📲 Call Inaaya Indore Call Girls Service ...
 
VIP Hyderabad Call Girls KPHB 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls KPHB 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls KPHB 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls KPHB 7877925207 ₹5000 To 25K With AC Room 💚😋
 
💚Chandigarh Call Girls 💯Riya 📲🔝8868886958🔝Call Girls In Chandigarh No💰Advance...
💚Chandigarh Call Girls 💯Riya 📲🔝8868886958🔝Call Girls In Chandigarh No💰Advance...💚Chandigarh Call Girls 💯Riya 📲🔝8868886958🔝Call Girls In Chandigarh No💰Advance...
💚Chandigarh Call Girls 💯Riya 📲🔝8868886958🔝Call Girls In Chandigarh No💰Advance...
 

Cluster Analysis

  • 1. CLUSTER ANALYSIS DR ATHAR KHAN LIAQUAT COLLEGE OF MEDICINE & DENTISTRY matharm@yahoo.com
  • 3. DEFINITION ▪ Cluster Analysis is a way of grouping cases of data based on the similarity of responses to several variables. ▪ The fundamental problem clustering addresses is to divide the data into meaningful groups (clusters). ▪ Factor Analysis groups together variables; Cluster Analysis groups cases. 4/17/2020 DR ATHAR KHAN 3
  • 11. [Figure: Cluster 1, Cluster 2, Cluster 3]
  • 12. Unsupervised learning is a machine learning technique in which you do not need to supervise the model. Instead, you allow the model to work on its own to discover information: there is only input data (X) and no corresponding output variables.
  • 13. Types of Data ▪ The data used in cluster analysis can be interval, ordinal or categorical. ▪ However, having a mixture of different types of variables will make the analysis more complicated. ▪ This is because in cluster analysis you need some way of measuring the distance between observations, and the type of measure used will depend on what type of data you have.
  • 14. Measures of Distance ▪ A number of different methods have been proposed to handle 'distance' for categorical data: the K-Means algorithm for categorical data, ROCK, LIMBO, CLICKS and Ward's agglomerative algorithm. ▪ Among hierarchical clustering algorithms, Ward's is the most used. ▪ For interval data, the most widely used measure of the distance between objects is Euclidean distance.
  • 15. Euclidean Distance, d ▪ Euclidean distance is the geometric distance between two objects (or cases). Therefore, if we were to call George subject i and Zippy subject j, then we could express their Euclidean distance in terms of the following equation: $d_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2}$, where $x_{ik}$ is the score of subject i on variable k and n is the number of variables. ▪ With Euclidean distances, the smaller the distance, the more similar the cases.
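The Euclidean distance between two cases can be computed in a few lines. This is only a minimal sketch: the scores given to George and Zippy are hypothetical values chosen for illustration, not data from the slides, and `euclidean_distance` is an illustrative helper name.

```python
import math

def euclidean_distance(case_i, case_j):
    """Geometric distance between two cases measured on the same variables."""
    if len(case_i) != len(case_j):
        raise ValueError("both cases need the same number of variables")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(case_i, case_j)))

# Hypothetical scores on three variables (assumed, not from the slides).
george = [10.0, 4.0, 7.0]
zippy = [7.0, 8.0, 7.0]

d = euclidean_distance(george, zippy)  # sqrt(3**2 + 4**2 + 0**2)
print(d)
```

A case compared with itself gives a distance of 0, which is the "most similar" end of the scale.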
  • 16. Measures of Distance ▪ When using a measure such as the Euclidean distance, the scale of measurement of the variables under consideration is an issue, as changing the scale will affect the distance between subjects (e.g. a difference of 10 cm could equally be expressed as a difference of 100 mm). ▪ To get around this problem each variable can be standardized (converted to z-scores).
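A short sketch of why standardizing removes the scale problem: the same (hypothetical) heights recorded in centimetres and in millimetres give identical z-scores. The use of the sample standard deviation (n - 1 denominator) is an assumption here; the slides do not say which denominator is intended.

```python
import statistics

def z_scores(values):
    """Standardize one variable: subtract its mean, divide by its SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return [(v - mean) / sd for v in values]

# Hypothetical heights, recorded on two different scales.
heights_cm = [150.0, 160.0, 170.0]
heights_mm = [h * 10 for h in heights_cm]

z_cm = z_scores(heights_cm)
z_mm = z_scores(heights_mm)
print(z_cm, z_mm)
```

Because both lists standardize to the same values, distances computed on z-scores no longer depend on the unit of measurement.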
  • 17. Approaches to Cluster Analysis ▪ There are a number of different methods that can be used to carry out a cluster analysis: ▪ Hierarchical methods: agglomerative methods and divisive methods ▪ Non-hierarchical methods (often known as k-means clustering methods)
  • 18. Agglomerative Methods ▪ Agglomerative clustering is a bottom-up technique: it starts by considering each data point as its own cluster and merges them into progressively larger groups until they form a single giant cluster.
  • 19. Divisive Clustering ▪ Divisive clustering is the opposite: it starts with one cluster, which is then divided in two as a function of the similarities or distances in the data. These new clusters are then divided, and so on until each case is a cluster. ▪ Agglomerative methods are used more often than divisive methods.
  • 21. Hierarchical agglomerative methods ▪ Within this approach to cluster analysis there are a number of different methods used to determine which clusters should be joined at each stage. ▪ Linkage function (creating the clusters)
  • 22. ▪ Nearest neighbour method (single linkage method): in this method the distance between two clusters is defined to be the distance between the two closest members, or neighbours. ▪ Furthest neighbour method (complete linkage method): in this case the distance between two clusters is defined to be the maximum distance between members, i.e. the distance between the two subjects that are furthest apart.
  • 23. ▪ Average (between groups) linkage method (sometimes referred to as UPGMA): the distance between two clusters is calculated as the average distance between all pairs of subjects in the two clusters. ▪ Centroid method: here the centroid (mean value for each variable) of each cluster is calculated and the distance between centroids is used. Clusters whose centroids are closest together are merged.
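The four linkage rules described on slides 22 and 23 can be compared side by side. The sketch below assumes Euclidean distance between cases and uses two small hypothetical clusters; the function names are illustrative and do not correspond to any particular package.

```python
import math
from itertools import product

def dist(a, b):
    """Euclidean distance between two cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(c1, c2):
    """Nearest neighbour: distance between the two closest members."""
    return min(dist(a, b) for a, b in product(c1, c2))

def complete_linkage(c1, c2):
    """Furthest neighbour: distance between the two most distant members."""
    return max(dist(a, b) for a, b in product(c1, c2))

def average_linkage(c1, c2):
    """Average distance over all between-cluster pairs (UPGMA)."""
    pairs = [dist(a, b) for a, b in product(c1, c2)]
    return sum(pairs) / len(pairs)

def centroid(cluster):
    """Mean value of each variable within the cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def centroid_linkage(c1, c2):
    """Distance between the two cluster centroids."""
    return dist(centroid(c1), centroid(c2))

# Two hypothetical clusters of two cases each, measured on two variables.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]

for rule in (single_linkage, complete_linkage, average_linkage, centroid_linkage):
    print(rule.__name__, rule(cluster_a, cluster_b))
```

For the same pair of clusters, single linkage can never exceed average linkage, which in turn can never exceed complete linkage. This is one reason the methods may merge clusters in a different order.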
  • 24. Ward’s Method ▪ In this method all possible pairs of clusters are combined and the sum of the squared distances within each cluster is calculated. ▪ This is then summed over all clusters. ▪ The combination that gives the lowest sum of squares is chosen. ▪ The aim in Ward’s method is to join cases into clusters such that the variance within a cluster is minimised. ▪ To be more precise, two clusters are merged if this merger results in the minimum increase in the error sum of squares. ▪ It is the most popular method.
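One step of Ward's criterion can be sketched as follows: for every pair of clusters, compute how much the error sum of squares (ESS) would grow if the pair were merged, and pick the pair with the smallest increase. The data and helper names are hypothetical, and this is a plain illustration of the criterion rather than the routine any statistics package actually uses.

```python
def centroid(cluster):
    """Mean value of each variable within the cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def error_sum_of_squares(cluster):
    """Sum of squared distances from each case to the cluster centroid."""
    c = centroid(cluster)
    return sum(sum((x - m) ** 2 for x, m in zip(case, c)) for case in cluster)

def ward_increase(c1, c2):
    """Increase in total ESS caused by merging c1 and c2."""
    merged = c1 + c2
    return (error_sum_of_squares(merged)
            - error_sum_of_squares(c1)
            - error_sum_of_squares(c2))

def best_ward_merge(clusters):
    """Return (increase, i, j) for the pair whose merger raises ESS least."""
    candidates = [
        (ward_increase(clusters[i], clusters[j]), i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    ]
    return min(candidates)

# Hypothetical one-variable data: three clusters at one stage of the analysis.
clusters = [[(0.0,), (1.0,)], [(2.0,)], [(10.0,)]]
increase, i, j = best_ward_merge(clusters)
print(increase, i, j)
```

Here the first two clusters are merged, because adding the case at 2.0 to the cluster {0.0, 1.0} raises the ESS far less than any merger involving the outlying case at 10.0.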
  • 25. Selecting the optimum number of clusters ▪ Once the cluster analysis has been carried out it is then necessary to select the ’best’ cluster solution. ▪ Consider the number of clusters together with the within-cluster variances.
  • 26. Dendrogram [Figure: dendrogram with labels 1, 2, 3, 4] ▪ In the dendrogram, the height indicates the order in which the clusters were joined. ▪ Dendrograms cannot tell you how many clusters you should have.
  • 27. Data Preparation ▪ To perform a cluster analysis, the data should generally be prepared as follows: ▪ Any missing value in the data must be removed or estimated. ▪ The data must be standardized (z-scores).
  • 28. Limitations of Cluster Analysis ▪ There are several things to be aware of when conducting cluster analysis: ▪ The different methods of clustering usually give very different results. This occurs because of the different criteria for merging clusters (including cases). It is important to think carefully about which method is best for what you are interested in looking at. ▪ With the exception of single linkage, the results will be affected by the way in which the variables are ordered. ▪ The analysis is not stable when cases are dropped: this occurs because selection of a case (or merger of clusters) depends on the similarity of one case to the cluster.
  • 29. Limitations of Cluster Analysis ▪ Imagine we wanted to look at clusters of cases referred for psychiatric treatment. ▪ We measured each subject on four questionnaires: Spielberger Trait Anxiety Inventory (STAI), the Beck Depression Inventory (BDI), a measure of Intrusive Thoughts and Rumination (IT) and a measure of Impulsive Thoughts and Actions (Impulse). ▪ The rationale behind this analysis is that people with the same disorder should report a similar pattern of scores across the measures (so the profiles of their responses should be similar).
  • 30. Video: Hierarchical Clustering: Agglomerative Clustering and Divisive Clustering https://www.youtube.com/watch?v=7enWesSofhg
  • 36. Agglomeration schedule: Shows how the clusters are combined at each stage. ▪ Stage 1: Cases 1 and 4 have the smallest distance ("Coefficients" = .168) => first cluster {1,4} ▪ Stage 2: Cases 10 and 12 have the second smallest distance => second cluster {10,12}
  • 37. [Figure: annotated with Stage 1 to Stage 7]
  • 38. Agglomeration schedule: Shows how the clusters are combined at each stage. The next part of the table shows the stage at which each cluster first appears.
  • 39. Agglomeration schedule: Shows how the clusters are combined at each stage. In stage 6, cluster 1 is the cluster that was formed in stage 1...
  • 40. Agglomeration schedule: Shows how the clusters are combined at each stage. ▪ Stage 1: Cases 1 and 4 have the smallest distance ("Coefficients" = .168) => first cluster {1,4} ▪ First cluster {1,4} is merged with case 13 in stage 6 ("Next Stage") => cluster {1,4,13} ▪ In the columns showing the stage at which each cluster first appears, 0 means the case is joining a cluster for the first time.
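To make the layout of an agglomeration schedule concrete, the sketch below builds one by hand for five hypothetical cases. It assumes single linkage with Euclidean distance and names each cluster after its lowest case number; the slides do not state which linkage produced their table, so the rows here are illustrative and will not match the SPSS output shown in the deck.

```python
import math

def dist(a, b):
    """Euclidean distance between two cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomeration_schedule(cases):
    """Single-linkage agglomerative clustering on {case_id: scores}.

    Returns rows of (stage, cluster_1, cluster_2, coefficient), where the
    coefficient is the distance at which the two clusters were joined.
    """
    clusters = {cid: [cid] for cid in cases}  # every case starts alone
    schedule = []
    stage = 1
    while len(clusters) > 1:
        best = None
        names = sorted(clusters)
        for idx, a in enumerate(names):
            for b in names[idx + 1:]:
                # Single linkage: distance between the closest members.
                d = min(dist(cases[p], cases[q])
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters.pop(b)  # keep the lower name
        schedule.append((stage, a, b, d))
        stage += 1
    return schedule

# Hypothetical scores on a single variable for five cases.
cases = {1: (0.0,), 2: (5.0,), 3: (1.0,), 4: (5.5,), 5: (12.0,)}
schedule = agglomeration_schedule(cases)
for row in schedule:
    print(row)
```

With n cases there are always n - 1 stages, and under single linkage the coefficients never decrease from one stage to the next, which is what makes a sudden jump in them readable.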
  • 41. [Figure: annotated with Stage 1, Stage 2 and Stage 5]
  • 42. ▪ The Coefficients column indicates the distance between the two clusters (or cases) joined at each stage. ▪ The values here depend on the proximity measure and linkage method used in the analysis. ▪ For a good cluster solution, you will see a sudden jump in the distance coefficient as you read down the table. ▪ The stage before the sudden change indicates the optimal stopping point for merging clusters. [Table annotations: 3 clusters, 2 clusters, 1 cluster]
  • 43. NUMBER OF CLUSTERS ▪ Number of cases: 15 ▪ Step of ‘elbow’: 12 ▪ Number of clusters: 15 - 12 = 3
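The "cases minus elbow step" arithmetic can be written as a small helper. The sketch below takes the elbow to be the stage just before the largest jump in the coefficients, which is one simple reading of the rule on slide 42. The coefficient values are hypothetical, chosen so that the jump falls after stage 12 as in the slide's example of 15 cases.

```python
def clusters_from_elbow(coefficients):
    """Apply the 'largest jump' rule to an agglomeration schedule.

    coefficients[k] is the merge distance at stage k + 1.
    Returns (elbow_stage, number_of_clusters).
    """
    n_cases = len(coefficients) + 1  # n cases always give n - 1 stages
    jumps = [coefficients[k + 1] - coefficients[k]
             for k in range(len(coefficients) - 1)]
    k = max(range(len(jumps)), key=jumps.__getitem__)
    elbow_stage = k + 1  # last stage before the sudden change
    return elbow_stage, n_cases - elbow_stage

# Hypothetical coefficients for 15 cases (14 stages), not the slide's table.
coefficients = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9, 1.0,
                1.2, 1.3, 1.5, 1.7, 6.0, 7.5]
elbow_stage, n_clusters = clusters_from_elbow(coefficients)
print(elbow_stage, n_clusters)
```

In practice the largest jump is sometimes the very last merge, which would suggest two clusters, so the result is a starting point to check against the dendrogram rather than a decision rule.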
  • 45. Scree Plot [Figure: scree plot; one axis runs from .000 to 20.000, the other from 1 to 14]
  • 46. ▪ The dendrogram (or "tree diagram") shows relative similarities between cases. ▪ Notice how the "branches" merge together as you look from left to right in the dendrogram. ▪ Cases or clusters that are joined by lines "further down" the tree (near the left side of the dendrogram) are very similar.
  • 47. ▪ Cases or clusters that are joined by lines "further up" the tree (near the right side) are dissimilar. ▪ Cluster distances are rescaled so that they range from 0 to 25 in this plot.
  • 48. ▪ Cutting the dendrogram with our line would identify 3 clusters (green), one for each point where a branch intersects the line. ▪ By considering different cut points for our line, we can get solutions with different numbers of clusters. ▪ A good cluster solution is one with small within-cluster distances, but large between-cluster distances.
  • 49. ▪ Choose the number of clusters just before the largest increase in heterogeneity. [Figure: clusters 1, 2, 3; axis: standardized distance]
  • 50. ▪ This table shows cluster membership for each case, according to the number of clusters you requested. ▪ You can attempt to interpret the clusters by observing which cases are grouped together.
  • 53. ▪ Having eyeballed the dendrogram and decided how many clusters are present, it is possible to re-run the analysis asking SPSS to save a new variable in which cluster codes are assigned to cases (with the researcher specifying the number of clusters in the data). ▪ For these data, we saw three clear clusters and so we could re-run the analysis asking for cluster group codings for three clusters (in fact, I told you to do this as part of the original analysis). ▪ The output shows the resulting codes for each case in this analysis. It’s pretty clear that these codes map exactly onto the DSM-IV classifications.
  • 55. DR ATHAR KHAN, MBBS, MCPS, DPH, DCPS-HCSM, DCPS-HPE, MBA, PGD-STATISTICS, CCRP. ASSOCIATE PROFESSOR, DEPARTMENT OF COMMUNITY MEDICINE, LIAQUAT COLLEGE OF MEDICINE & DENTISTRY, KARACHI, PAKISTAN. 0092-3232135932