Jan Jantzen, DTU
1
Tutorial On Fuzzy Clustering
Jan Jantzen
Technical University of Denmark
jj@oersted.dtu.dk
Abstract
• Problem: To extract rules from data
• Method: Fuzzy c-means
• Results: e.g., finding cancer cells
Cluster (www.m-w.com)
• A number of similar individuals that occur together as a: two or more
  consecutive consonants or vowels in a segment of speech b: a group of
  houses (...) c: an aggregation of stars or galaxies that appear close
  together in the sky and are gravitationally associated.
Cluster analysis (www.m-w.com)
• A statistical classification technique for discovering whether the
  individuals of a population fall into different groups by making
  quantitative comparisons of multiple characteristics.
Vehicle Example
Vehicle  Top speed [km/h]  Colour  Air resistance  Weight [kg]
V1       220               red     0.30            1300
V2       230               black   0.32            1400
V3       260               red     0.29            1500
V4       140               gray    0.35             800
V5       155               blue    0.33             950
V6       130               white   0.40             600
V7       100               black   0.50            3000
V8       105               red     0.60            2500
V9       110               gray    0.55            3500
Vehicle Clusters
[Figure: scatter plot of weight [kg] against top speed [km/h], with three clusters labelled Sports cars, Medium market cars, and Lorries]
Terminology
[Figure: the vehicle cluster plot (weight [kg] against top speed [km/h]; Sports cars, Medium market cars, Lorries) annotated with the terms: object or data point, feature, feature space, cluster, and label]
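The terms above can be made concrete with the vehicle table. The following is a minimal sketch, not part of the original slides: each vehicle is an object described by two numeric features (top speed and weight), and the labels are an assumption read off the cluster plot, since the table itself has no label column. The names `VEHICLES`, `LABELS`, and `cluster_centres` are hypothetical.

```python
# Objects (vehicles) with two features each: (top speed [km/h], weight [kg]).
VEHICLES = {
    "V1": (220, 1300), "V2": (230, 1400), "V3": (260, 1500),
    "V4": (140, 800), "V5": (155, 950), "V6": (130, 600),
    "V7": (100, 3000), "V8": (105, 2500), "V9": (110, 3500),
}

# Assumption: labels inferred from the cluster plot, not given in the table.
LABELS = {
    "V1": "sports car", "V2": "sports car", "V3": "sports car",
    "V4": "medium market car", "V5": "medium market car", "V6": "medium market car",
    "V7": "lorry", "V8": "lorry", "V9": "lorry",
}


def cluster_centres(vehicles, labels):
    """Return the mean feature vector of every labelled cluster."""
    groups = {}
    for name, features in vehicles.items():
        groups.setdefault(labels[name], []).append(features)
    return {
        label: tuple(sum(column) / len(members) for column in zip(*members))
        for label, members in groups.items()
    }


centres = cluster_centres(VEHICLES, LABELS)
for label, centre in centres.items():
    print(label, centre)
```

Each printed pair is a point in the two-dimensional feature space; clustering aims to find such groups without being given the labels.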
Example: Classify cracked tiles
475Hz 557Hz Ok?
-----+-----+---
0.958 0.003 Yes
1.043 0.001 Yes
1.907 0.003 Yes
0.780 0.002 Yes
0.579 0.001 Yes
0.003 0.105 No
0.001 1.748 No
0.014 1.839 No
0.007 1.021 No
0.004 0.214 No
Table 1: Frequency intensities for ten tiles.
Tiles are made from clay moulded into the right shape, brushed, glazed, and
baked. Unfortunately, the baking may produce invisible cracks. Operators can
detect the cracks by hitting the tiles with a hammer, and in an automated system
the response is recorded with a microphone, filtered, Fourier transformed, and
normalised. A small set of data is given in TABLE 1 (adapted from MIT, 1997).
Algorithm: hard c-means (HCM)
(also known as k means)
Plot of tiles by frequencies (logarithms). The whole tiles (o) seem well
separated from the cracked tiles (*). The objective is to find the two
clusters.
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
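The plotted coordinates can be reproduced from Table 1 by taking logarithms of the intensities. The sketch below assumes the natural logarithm, which is consistent with the axis range of about -8 to 2 but is not stated on the slides; the names `TILES` and `log_tiles` are hypothetical.

```python
import math

# Table 1: (intensity at 475 Hz, intensity at 557 Hz, tile is whole?)
TILES = [
    (0.958, 0.003, True), (1.043, 0.001, True), (1.907, 0.003, True),
    (0.780, 0.002, True), (0.579, 0.001, True),
    (0.003, 0.105, False), (0.001, 1.748, False), (0.014, 1.839, False),
    (0.007, 1.021, False), (0.004, 0.214, False),
]

# Assumption: natural logarithm of both intensities.
log_tiles = [(math.log(a), math.log(b), ok) for a, b, ok in TILES]

for x, y, ok in log_tiles:
    marker = "o" if ok else "*"  # same markers as in the plot legend
    print(f"{x:8.3f} {y:8.3f}  {marker}")
```

In the transformed data every whole tile has a larger first coordinate than second, and every cracked tile the reverse, which is why the two groups look well separated.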
1. Place two cluster centres (x) at random.
2. Assign each data point (* and o) to the nearest cluster centre (x)
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
1. Compute the new centre of each class
2. Move the crosses (x)
Iteration 2
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
Iteration 3
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
Iteration 4 (then stop, because no visible change)
Each data point belongs to the cluster defined by the nearest centre
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
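The hard c-means iterations shown above (assign each point to the nearest centre, move each centre to the mean of its points, stop when nothing changes) can be sketched as follows. This is an illustrative sketch rather than the author's implementation. Two things are assumptions: the natural logarithm of the Table 1 intensities, and a fixed start at the last and first data points instead of a random placement, so that the run is reproducible. The function names are hypothetical.

```python
import math

# Table 1 intensities (475 Hz, 557 Hz); the first five tiles are whole.
TILES = [
    (0.958, 0.003), (1.043, 0.001), (1.907, 0.003), (0.780, 0.002), (0.579, 0.001),
    (0.003, 0.105), (0.001, 1.748), (0.014, 1.839), (0.007, 1.021), (0.004, 0.214),
]
# Assumption: natural logarithm of the intensities.
points = [(math.log(a), math.log(b)) for a, b in TILES]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def hard_c_means(points, initial_centres, max_iter=100):
    centres = list(initial_centres)
    labels = None
    for _ in range(max_iter):
        # Assign each data point to the nearest cluster centre.
        new_labels = [
            min(range(len(centres)), key=lambda i: squared_distance(p, centres[i]))
            for p in points
        ]
        if new_labels == labels:  # no change since the last pass: stop
            break
        labels = new_labels
        # Move each centre to the mean of the points assigned to it.
        for i in range(len(centres)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its old centre
                centres[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centres, labels


# Assumption: deterministic start (last and first data point), not a random one.
centres, labels = hard_c_means(points, [points[-1], points[0]])
print(labels)  # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(centres)
```

With this start the first five tiles end up in the second cluster and the last five in the first. A random start may need more iterations or swap the cluster numbering.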
The membership matrix M:
1. The last five data points (rows) belong to the first cluster (column)
2. The first five data points (rows) belong to the second cluster (column)
M =
0.0000 1.0000
0.0000 1.0000
0.0000 1.0000
0.0000 1.0000
0.0000 1.0000
1.0000 0.0000
1.0000 0.0000
1.0000 0.0000
1.0000 0.0000
1.0000 0.0000
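Membership matrix M
$$
m_{ik} =
\begin{cases}
1 & \text{if } \|\mathbf{u}_k - \mathbf{c}_i\|^2 \le \|\mathbf{u}_k - \mathbf{c}_j\|^2 \\
0 & \text{otherwise}
\end{cases}
$$
Here $\mathbf{u}_k$ is data point k, $\mathbf{c}_i$ is cluster centre i, $\mathbf{c}_j$ is cluster centre j, and $\|\cdot\|$ is the distance.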
c-partition
$$\bigcup_{i=1}^{c} C_i = U$$
All clusters C together fill the whole universe U.
$$C_i \cap C_j = \emptyset \quad \text{for all } i \ne j$$
Clusters do not overlap.
$$\emptyset \subset C_i \subset U \quad \text{for all } i$$
A cluster C is never empty and it is smaller than the whole universe U.
$$2 \le c \le K$$
There must be at least 2 clusters in a c-partition and at most as many as the number of data points K.
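The four conditions can be checked mechanically. The sketch below is illustrative only: clusters and the universe are represented as Python sets of data point indices, and the function name `is_c_partition` is hypothetical.

```python
def is_c_partition(clusters, universe):
    """Check the four conditions of a hard c-partition."""
    c, K = len(clusters), len(universe)
    # All clusters together fill the whole universe.
    covers = set().union(*clusters) == universe
    # Clusters do not overlap.
    disjoint = all(
        clusters[i].isdisjoint(clusters[j])
        for i in range(c)
        for j in range(i + 1, c)
    )
    # No cluster is empty, and each is smaller than the universe.
    proper = all(len(cl) > 0 and cl < universe for cl in clusters)
    # At least 2 clusters and at most K clusters.
    count_ok = 2 <= c <= K
    return covers and disjoint and proper and count_ok


universe = set(range(10))  # ten data points, as in the tiles example
print(is_c_partition([set(range(5, 10)), set(range(5))], universe))  # True
print(is_c_partition([set(range(6)), set(range(5, 10))], universe))  # False: overlap
print(is_c_partition([universe], universe))                          # False: c = 1
```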
Objective function
$$
J = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \left( \sum_{k,\, \mathbf{u}_k \in C_i} \|\mathbf{u}_k - \mathbf{c}_i\|^2 \right)
$$
Minimise the total sum of all squared distances.
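As a small numerical illustration of the objective function, the sketch below evaluates J on the tiles data for the hard assignment given by the membership matrix, once with each centre at the mean of its cluster and once with both centres shifted. The natural logarithm, the shift of 1.0, and the function names are assumptions made for the example.

```python
import math

TILES = [
    (0.958, 0.003), (1.043, 0.001), (1.907, 0.003), (0.780, 0.002), (0.579, 0.001),
    (0.003, 0.105), (0.001, 1.748), (0.014, 1.839), (0.007, 1.021), (0.004, 0.214),
]
# Assumption: natural logarithm of the intensities.
points = [(math.log(a), math.log(b)) for a, b in TILES]
# Hard assignment: first five points in cluster 1, last five in cluster 0.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]


def mean(members):
    return tuple(sum(col) / len(members) for col in zip(*members))


def objective(points, centres, labels):
    """J: sum over all points of the squared distance to their own centre."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centres[label]))
        for p, label in zip(points, labels)
    )


centres = [mean([p for p, l in zip(points, labels) if l == i]) for i in range(2)]
shifted = [(x + 1.0, y) for x, y in centres]  # hypothetical, deliberately worse

j_mean = objective(points, centres, labels)
j_shifted = objective(points, shifted, labels)
print(j_mean, j_shifted)
```

For a fixed assignment, the cluster means give the smallest J; moving every centre by 1.0 along one axis raises J by exactly the number of points, here 10.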
Algorithm: fuzzy c-means (FCM)
Each data point belongs to two clusters to different degrees
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
1. Place two cluster centres
2. Assign a fuzzy membership to each data point depending on
distance
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
1. Compute the new centre of each class
2. Move the crosses (x)
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
Iteration 2
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
Iteration 5
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
Iteration 10
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
Iteration 13 (then stop, because no visible change)
Each data point belongs to the two clusters to a degree
[Figure: tiles data, log(intensity) 557 Hz against log(intensity) 475 Hz; o = whole tiles, * = cracked tiles, x = centres]
The membership matrix M:
1. The last five data points (rows) belong mostly to the first cluster (column)
2. The first five data points (rows) belong mostly to the second cluster (column)
M =
0.0025 0.9975
0.0091 0.9909
0.0129 0.9871
0.0001 0.9999
0.0107 0.9893
0.9393 0.0607
0.9638 0.0362
0.9574 0.0426
0.9906 0.0094
0.9807 0.0193
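A run of fuzzy c-means on the tiles data can be sketched as below. Several things here are assumptions rather than statements from the slides: the natural logarithm of the Table 1 intensities, a fuzziness exponent q = 2, a fixed start at the last and first data points, and the centre update, which is taken to be the standard fuzzy c-means rule (each centre is the mean of all points weighted by membership raised to the power q). Memberships are computed from distances as m_ik = 1 / sum_j (d_ik / d_jk)^(2/(q-1)). All function names are hypothetical.

```python
import math

TILES = [
    (0.958, 0.003), (1.043, 0.001), (1.907, 0.003), (0.780, 0.002), (0.579, 0.001),
    (0.003, 0.105), (0.001, 1.748), (0.014, 1.839), (0.007, 1.021), (0.004, 0.214),
]
# Assumption: natural logarithm of the intensities.
points = [(math.log(a), math.log(b)) for a, b in TILES]


def memberships(points, centres, q):
    """Rows are data points k, columns are clusters i."""
    exponent = 2.0 / (q - 1.0)
    M = []
    for p in points:
        d = [math.dist(p, c) for c in centres]
        zeros = [di == 0.0 for di in d]
        if any(zeros):  # point sits on a centre: full membership there
            M.append([z / sum(zeros) for z in zeros])
        else:
            M.append([1.0 / sum((di / dj) ** exponent for dj in d) for di in d])
    return M


def update_centres(points, M, q):
    """Assumed standard rule: mean of the points weighted by m_ik ** q."""
    dims = range(len(points[0]))
    centres = []
    for i in range(len(M[0])):
        w = [row[i] ** q for row in M]
        centres.append(tuple(sum(wk * p[n] for wk, p in zip(w, points)) / sum(w) for n in dims))
    return centres


def fuzzy_c_means(points, centres, q=2.0, tol=1e-6, max_iter=200):
    for _ in range(max_iter):
        new_centres = update_centres(points, memberships(points, centres, q), q)
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:  # centres no longer move: stop
            break
    return centres, memberships(points, centres, q)


# Assumption: deterministic start (last and first data point), q = 2.
centres, M = fuzzy_c_means(points, [points[-1], points[0]])
for row in M:
    print(" ".join(f"{m:.4f}" for m in row))
```

The printed rows can be compared with the matrix above: every row sums to 1, the first five points belong mostly to the second cluster, and the last five mostly to the first.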
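Fuzzy membership matrix M
$$
m_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{2/(q-1)}},
\qquad d_{ik} = \|\mathbf{u}_k - \mathbf{c}_i\|
$$
Here $m_{ik}$ is point k's membership of cluster i, $d_{ik}$ is the distance from point k to the current cluster centre i, $d_{jk}$ is the distance from point k to the other cluster centres j, and q is the fuzziness exponent.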
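A direct transcription of the membership formula, as a minimal sketch. The function name `fuzzy_membership` and the example distances are hypothetical; the function takes the distances from one data point to every cluster centre and assumes none of them is zero.

```python
def fuzzy_membership(distances, q):
    """Memberships of one point, given its (non-zero) distances to all centres."""
    exponent = 2.0 / (q - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical point at distance 1 from the first centre and 3 from the second.
m = fuzzy_membership([1.0, 3.0], q=2.0)
print([round(v, 4) for v in m])  # [0.9, 0.1]
```

With q = 2 the exponent is 2, so the first membership is 1 / (1 + 1/9) = 0.9 and the second is 1 / (9 + 1) = 0.1.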
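Fuzzy membership matrix M
$$
\begin{aligned}
m_{ik} &= \frac{1}{\sum_{j=1}^{c} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{2/(q-1)}} \\
&= \frac{1}{\left( \dfrac{d_{ik}}{d_{1k}} \right)^{2/(q-1)} + \left( \dfrac{d_{ik}}{d_{2k}} \right)^{2/(q-1)} + \cdots + \left( \dfrac{d_{ik}}{d_{ck}} \right)^{2/(q-1)}} \\
&= \frac{\dfrac{1}{d_{ik}^{2/(q-1)}}}{\dfrac{1}{d_{1k}^{2/(q-1)}} + \dfrac{1}{d_{2k}^{2/(q-1)}} + \cdots + \dfrac{1}{d_{ck}^{2/(q-1)}}}
\end{aligned}
$$
Gravitation to cluster i relative to total gravitation.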
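The equality of the first and last forms can be checked numerically. The sketch below uses hypothetical distances and function names; it evaluates both expressions for several values of q and compares them.

```python
def ratio_form(d, i, q):
    """m_ik as 1 / sum_j (d_ik / d_jk) ** (2 / (q - 1))."""
    e = 2.0 / (q - 1.0)
    return 1.0 / sum((d[i] / d_j) ** e for d_j in d)


def inverse_form(d, i, q):
    """m_ik as the inverse-distance term of cluster i over the sum of all such terms."""
    e = 2.0 / (q - 1.0)
    return (1.0 / d[i] ** e) / sum(1.0 / d_j ** e for d_j in d)


d = [0.5, 2.0, 4.0]  # hypothetical distances from one point to three centres
for q in (1.5, 2.0, 3.0):
    for i in range(len(d)):
        print(q, i, ratio_form(d, i, q), inverse_form(d, i, q))
```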
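Electrical Analogy
[Figure: resistors R_1, R_2, ... connected in parallel across a voltage U, carrying branch currents i_1, i_2, ... and total current I]
$$
U = R I, \qquad R = \frac{1}{\dfrac{1}{R_1} + \dfrac{1}{R_2} + \cdots + \dfrac{1}{R_c}}
$$
$$
\frac{i_i}{I} = \frac{U / R_i}{U / R} = \frac{\dfrac{1}{R_i}}{\dfrac{1}{R_1} + \dfrac{1}{R_2} + \cdots + \dfrac{1}{R_c}}
$$
Same form as $m_{ik}$.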
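The analogy can be tried out by treating each distance term as a resistance: with R_i = d_ik^(2/(q-1)), the share of the total current through branch i equals the membership m_ik. The sketch below is illustrative; the distances, q = 2, and the function name are assumptions.

```python
def current_shares(resistances):
    """Fraction of the total current through each parallel resistor."""
    conductances = [1.0 / r for r in resistances]
    total = sum(conductances)
    return [g / total for g in conductances]


# Hypothetical point at distances 1 and 3 from two cluster centres, q = 2.
distances = [1.0, 3.0]
q = 2.0
resistances = [d ** (2.0 / (q - 1.0)) for d in distances]  # [1.0, 9.0]

shares = current_shares(resistances)
print([round(s, 4) for s in shares])  # [0.9, 0.1]
```

The nearer centre corresponds to the smaller resistance and therefore carries the larger share of the current.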
Fuzzy Membership
[Figure: membership of a test point (0 to 1) plotted against cluster centres 1 to 5, with the data point marked; o is with q = 1.1, * is with q = 2]
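The influence of the fuzziness exponent q can be explored with a few lines of code. The sketch below uses a hypothetical test point with two centres rather than the five in the figure, and adds q = 5 as an extra value for comparison; the function name is an assumption.

```python
def fuzzy_membership(distances, q):
    """Memberships of one point, given its (non-zero) distances to all centres."""
    exponent = 2.0 / (q - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical test point: distance 1 to the first centre, 3 to the second.
distances = [1.0, 3.0]
results = {q: fuzzy_membership(distances, q) for q in (1.1, 2.0, 5.0)}
for q, m in results.items():
    print(q, [round(v, 3) for v in m])
```

A value of q close to 1 gives almost crisp memberships (nearly all of the membership goes to the nearest centre), while larger values spread the membership more evenly.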
Fuzzy c-partition
$$\bigcup_{i=1}^{c} C_i = U$$
All clusters C together fill the whole universe U. Remark: The sum of memberships for a data point is 1, and the total for all points is K.
$$C_i \cap C_j = \emptyset \quad \text{for all } i \ne j$$
Not valid: Clusters do overlap.
$$\emptyset \subset C_i \subset U \quad \text{for all } i$$
A cluster C is never empty and it is smaller than the whole universe U.
$$2 \le c \le K$$
There must be at least 2 clusters in a c-partition and at most as many as the number of data points K.
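The remark about membership sums can be verified on the fuzzy membership matrix from the tiles example. The sketch below copies that matrix and checks the row sums, the grand total, and that no cluster is empty or holds everything; the variable names are hypothetical.

```python
# Fuzzy membership matrix M from the tiles example (rows: data points).
M = [
    [0.0025, 0.9975], [0.0091, 0.9909], [0.0129, 0.9871], [0.0001, 0.9999],
    [0.0107, 0.9893], [0.9393, 0.0607], [0.9638, 0.0362], [0.9574, 0.0426],
    [0.9906, 0.0094], [0.9807, 0.0193],
]
K = len(M)     # number of data points
c = len(M[0])  # number of clusters

row_sums = [sum(row) for row in M]
total = sum(row_sums)
column_sums = [sum(row[i] for row in M) for i in range(c)]

print(row_sums)     # each is 1 up to rounding
print(total)        # K = 10 up to rounding
print(column_sums)  # each strictly between 0 and K
```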
Example: Classify cancer cells
[Figure: two microscope images, captioned "Normal smear" and "Severely dysplastic smear"]
Using a small brush, cotton stick, or wooden stick, a specimen is taken from
the uterine cervix and smeared onto a thin, rectangular glass plate, a slide.
The purpose of the smear screening is to diagnose pre-malignant cell changes
before they progress to cancer. The smear is stained using the Papanicolaou
method, hence the name Pap smear. Different characteristics have different
colours, which are easy to distinguish in a microscope. A cyto-technician
performs the screening in a microscope. It is time consuming and prone to
error, as each slide may contain up to 300,000 cells.
Dysplastic cells have undergone precancerous changes. They generally have
longer and darker nuclei, and they have a tendency to cling together in large
clusters. Mildly dysplastic cells have enlarged and bright nuclei. Moderately
dysplastic cells have larger and darker nuclei. Severely dysplastic cells have
large, dark, and often oddly shaped nuclei. The cytoplasm is dark, and it is
relatively small.
Possible Features
• Nucleus and cytoplasm area
• Nucleus and cyto brightness
• Nucleus shortest and longest diameter
• Cyto shortest and longest diameter
• Nucleus and cyto perimeter
• Nucleus and cyto no. of maxima
• (...)
Classes are nonseparable
Hard Classifier (HCM)
[Figure: cells coloured by class; legend: Ok, light, moderate, severe]
A cell is either one or the other class, defined by a colour.
Fuzzy Classifier (FCM)
[Figure: cells coloured by class membership; legend: Ok, light, moderate, severe]
A cell can belong to several classes to a degree, i.e., one column may have
several colours.
Function approximation
[Figure: plot of Output1 against Input, for inputs from 0 to 1]
Curve fitting in a multi-dimensional space is also called function
approximation. Learning is equivalent to finding a function that best
fits the training data.
Approximation by fuzzy sets
[Figure: two plots over the input range 0 to 1; one spans -2 to 2 and the other spans 0 to 1 on the vertical axis]
Procedure to find a model
1. Acquire data
2. Select structure
3. Find clusters, generate model
4. Validate model
Conclusions
• Compared to neural networks, fuzzy models can be interpreted by human
  beings
• Applications: system identification, adaptive systems
Links
• J. Jantzen: Neurofuzzy Modelling. Technical University of Denmark:
  Oersted-DTU, Tech report no 98-H-874 (nfmod), 1998. URL
  http://fuzzy.iau.dtu.dk/download/nfmod.pdf
• PapSmear tutorial. URL http://fuzzy.iau.dtu.dk/smear/
• U. Kaymak: Data Driven Fuzzy Modelling. PowerPoint, URL
  http://fuzzy.iau.dtu.dk/tutor/ddfm.htm
Exercise: fuzzy clustering (Matlab)
• Download and follow the instructions in this text file:
  http://fuzzy.iau.dtu.dk/tutor/fcm/exerF5.txt
• The exercise requires Matlab (no special toolboxes are required)

 
John Halpern sued for sexual assault.pdf
John Halpern sued for sexual assault.pdfJohn Halpern sued for sexual assault.pdf
John Halpern sued for sexual assault.pdf
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear Regression
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usage
 
Forklift Operations: Safety through Cartoons
Forklift Operations: Safety through CartoonsForklift Operations: Safety through Cartoons
Forklift Operations: Safety through Cartoons
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communications
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptx
 
Monte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSMMonte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSM
 
Cracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxCracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptx
 
Value Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsValue Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and pains
 
Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...Boost the utilization of your HCL environment by reevaluating use cases and f...
Boost the utilization of your HCL environment by reevaluating use cases and f...
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with Culture
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Century
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Service
 

Clustering tutorial

  • 1. Tutorial on Fuzzy Clustering. Jan Jantzen, Technical University of Denmark (DTU), jj@oersted.dtu.dk. Abstract. Problem: to extract rules from data. Method: fuzzy c-means. Results: e.g., finding cancer cells.
  • 2. Cluster (www.m-w.com): A number of similar individuals that occur together, as a: two or more consecutive consonants or vowels in a segment of speech; b: a group of houses (...); c: an aggregation of stars or galaxies that appear close together in the sky and are gravitationally associated. Cluster analysis (www.m-w.com): A statistical classification technique for discovering whether the individuals of a population fall into different groups by making quantitative comparisons of multiple characteristics.
  • 3. Vehicle Example

       Vehicle  Top speed [km/h]  Colour  Air resistance  Weight [kg]
       V1       220               red     0.30            1300
       V2       230               black   0.32            1400
       V3       260               red     0.29            1500
       V4       140               gray    0.35             800
       V5       155               blue    0.33             950
       V6       130               white   0.40             600
       V7       100               black   0.50            3000
       V8       105               red     0.60            2500
       V9       110               gray    0.55            3500

       [Figure: Vehicle Clusters. Axes: top speed [km/h] and weight [kg]. Clusters: sports cars, medium market cars, lorries.]
  • 4. Terminology. [Figure: the vehicle clusters plot (top speed [km/h] and weight [kg]; sports cars, medium market cars, lorries), annotated with the terms object or data point, feature, feature space, cluster, and label.] Example: Classify cracked tiles
  • 5. Table 1: frequency intensities for ten tiles.

       475 Hz  557 Hz  Ok?
       ------+-------+----
       0.958   0.003   Yes
       1.043   0.001   Yes
       1.907   0.003   Yes
       0.780   0.002   Yes
       0.579   0.001   Yes
       0.003   0.105   No
       0.001   1.748   No
       0.014   1.839   No
       0.007   1.021   No
       0.004   0.214   No

       Tiles are made from clay moulded into the right shape, brushed, glazed, and baked. Unfortunately, the baking may produce invisible cracks. Operators can detect the cracks by hitting the tiles with a hammer, and in an automated system the response is recorded with a microphone, filtered, Fourier transformed, and normalised. A small set of data is given in Table 1 (adapted from MIT, 1997).

       Algorithm: hard c-means (HCM) (also known as k-means)
  • 6. Plot of tiles by frequencies (logarithms). The whole tiles (o) seem well separated from the cracked tiles (*). The objective is to find the two clusters. [Figure: tiles data; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.] 1. Place two cluster centres (x) at random. 2. Assign each data point (* and o) to the nearest cluster centre (x). [Figure: tiles data after the first assignment, same axes and symbols.]
  • 7. 1. Compute the new centre of each class. 2. Move the crosses (x). [Figure: tiles data with the moved centres; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.] Iteration 2. [Figure: tiles data at iteration 2, same axes and symbols.]
  • 8. Iteration 3. [Figure: tiles data at iteration 3; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.] Iteration 4 (then stop, because no visible change). Each data point belongs to the cluster defined by the nearest centre. [Figure: tiles data at iteration 4, same axes and symbols.]
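The hard c-means steps on the preceding slides (place centres, assign each point to the nearest centre, move the centres, repeat) can be sketched in Python as follows. This is a minimal illustration, not the tutorial's Matlab code. The data are the ten tiles of Table 1; the natural logarithm, the fixed starting centres (the slides place them at random), and the names `hard_c_means` and `squared_distance` are assumptions made for this example.

```python
import math

# Table 1: intensities at (475 Hz, 557 Hz); first five tiles are whole, last five cracked.
TILES = [
    (0.958, 0.003), (1.043, 0.001), (1.907, 0.003), (0.780, 0.002), (0.579, 0.001),
    (0.003, 0.105), (0.001, 1.748), (0.014, 1.839), (0.007, 1.021), (0.004, 0.214),
]
# Assumption: natural logarithm, which matches the range of the plot axes.
POINTS = [(math.log(a), math.log(b)) for a, b in TILES]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def hard_c_means(points, start_centres, max_iter=100):
    """Alternate assignment and centre update until the labels stop changing."""
    centres = [tuple(c) for c in start_centres]
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest centre.
        new_labels = [
            min(range(len(centres)), key=lambda i: squared_distance(p, centres[i]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Move each centre to the mean of the points assigned to it.
        for i in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its old centre
                centres[i] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centres


# Assumption: fixed starting centres so that the run is reproducible.
labels, centres = hard_c_means(POINTS, [(-4.0, -4.0), (-3.0, -3.0)])
print(labels)
print(centres)
```

With these starting centres the whole tiles end up sharing one label and the cracked tiles the other. A different start may swap the label numbers, and in general a poor start can lead to a poorer partition.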
  • 9. The membership matrix M: 1. The last five data points (rows) belong to the first cluster (column). 2. The first five data points (rows) belong to the second cluster (column).

       M =
       0.0000  1.0000
       0.0000  1.0000
       0.0000  1.0000
       0.0000  1.0000
       0.0000  1.0000
       1.0000  0.0000
       1.0000  0.0000
       1.0000  0.0000
       1.0000  0.0000
       1.0000  0.0000

       Membership matrix M:

       $$ m_{ik} = \begin{cases} 1 & \text{if } \|\mathbf{u}_k - \mathbf{c}_i\|^2 \le \|\mathbf{u}_k - \mathbf{c}_j\|^2 \text{ for all } j \ne i \\ 0 & \text{otherwise} \end{cases} $$

       Here $\mathbf{u}_k$ is data point k, $\mathbf{c}_i$ is cluster centre i, $\mathbf{c}_j$ is cluster centre j, and $\|\cdot\|$ is the distance.
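As a small illustration of the hard membership rule, the sketch below builds M for the log-transformed tile data, with one row per data point as in the matrix shown on the slide. The two centres are hypothetical round values chosen near the two groups, not the centres found in the tutorial, and ties are given to the lowest cluster index (an assumption; the slide's rule uses ≤ and does not say how ties are broken).

```python
import math

# Table 1 intensities (475 Hz, 557 Hz), log-transformed (natural log assumed).
TILES = [
    (0.958, 0.003), (1.043, 0.001), (1.907, 0.003), (0.780, 0.002), (0.579, 0.001),
    (0.003, 0.105), (0.001, 1.748), (0.014, 1.839), (0.007, 1.021), (0.004, 0.214),
]
POINTS = [(math.log(a), math.log(b)) for a, b in TILES]


def hard_membership(points, centres):
    """Row k, column i is 1 if centre i is the nearest centre to point k, else 0."""
    M = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
        nearest = d2.index(min(d2))  # ties go to the lowest cluster index
        M.append([1 if i == nearest else 0 for i in range(len(centres))])
    return M


# Hypothetical centres: first near the cracked tiles, second near the whole tiles.
M = hard_membership(POINTS, [(-5.5, -0.5), (0.0, -6.3)])
for row in M:
    print(row)
```

Each row contains exactly one 1, so every data point belongs to exactly one cluster.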
  • 10. c-partition

       $$ \bigcup_{i=1}^{c} C_i = U $$
       All clusters C together fill the whole universe U.

       $$ C_i \cap C_j = \emptyset \quad \text{for all } i \ne j $$
       Clusters do not overlap.

       $$ \emptyset \subset C_i \subset U \quad \text{for all } i $$
       A cluster C is never empty and it is smaller than the whole universe U.

       $$ 2 \le c \le K $$
       There must be at least 2 clusters in a c-partition and at most as many as the number of data points K.

       Objective function

       $$ J = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \left( \sum_{k,\, \mathbf{u}_k \in C_i} \|\mathbf{u}_k - \mathbf{c}_i\|^2 \right) $$
       Minimise the total sum of all squared distances.
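To make the objective function concrete, the following sketch evaluates J for two different 3-partitions of the vehicle data from the earlier table, using top speed and weight as features. Taking each cluster centre as the mean of its members, and the function name `objective`, are assumptions for this example; the features are also left unscaled, so weight dominates the distances.

```python
# Vehicle data from the table: (top speed [km/h], weight [kg]) for V1..V9.
VEHICLES = [
    (220, 1300), (230, 1400), (260, 1500),
    (140, 800), (155, 950), (130, 600),
    (100, 3000), (105, 2500), (110, 3500),
]


def objective(points, labels, n_clusters):
    """J = sum over clusters of squared distances from members to the cluster mean."""
    total = 0.0
    for i in range(n_clusters):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:
            continue  # a valid c-partition has no empty cluster
        centre = tuple(sum(coord) / len(members) for coord in zip(*members))
        total += sum(
            sum((a - b) ** 2 for a, b in zip(p, centre)) for p in members
        )
    return total


# Partition that follows the three groups named on the slide.
j_grouped = objective(VEHICLES, [0, 0, 0, 1, 1, 1, 2, 2, 2], 3)
# An arbitrary partition that mixes the groups.
j_mixed = objective(VEHICLES, [0, 1, 2, 0, 1, 2, 0, 1, 2], 3)
print(j_grouped, j_mixed)
```

The partition that follows the sports cars, medium market cars, and lorries gives a much smaller J than the mixed one, which is what minimising J rewards.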
  • 11. Algorithm: fuzzy c-means (FCM). Each data point belongs to two clusters to different degrees. [Figure: tiles data; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.]
  • 12. 1. Place two cluster centres. 2. Assign a fuzzy membership to each data point depending on distance. [Figure: tiles data; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.] 1. Compute the new centre of each class. 2. Move the crosses (x). [Figure: tiles data with the moved centres, same axes and symbols.]
  • 13. Iteration 2. [Figure: tiles data at iteration 2; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.] Iteration 5. [Figure: tiles data at iteration 5, same axes and symbols.]
  • 14. Iteration 10. [Figure: tiles data at iteration 10; axes: log(intensity) 475 Hz and log(intensity) 557 Hz; o = whole tiles, * = cracked tiles, x = centres.] Iteration 13 (then stop, because no visible change). Each data point belongs to the two clusters to a degree. [Figure: tiles data at iteration 13, same axes and symbols.]
  • 15. The membership matrix M: 1. The last five data points (rows) belong mostly to the first cluster (column). 2. The first five data points (rows) belong mostly to the second cluster (column).

       M =
       0.0025  0.9975
       0.0091  0.9909
       0.0129  0.9871
       0.0001  0.9999
       0.0107  0.9893
       0.9393  0.0607
       0.9638  0.0362
       0.9574  0.0426
       0.9906  0.0094
       0.9807  0.0193

       Fuzzy membership matrix M:

       $$ m_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{2/(q-1)}}, \qquad d_{ik} = \|\mathbf{u}_k - \mathbf{c}_i\| $$

       Here $m_{ik}$ is point k's membership of cluster i, $d_{ik}$ is the distance from point k to the current cluster centre i, $d_{jk}$ is the distance from point k to the other cluster centres j, and q is the fuzziness exponent.
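A minimal Python sketch of fuzzy c-means on the tile data follows, using the membership formula above. Several things here are assumptions and not taken from the slides: the centre update is the standard FCM rule (the mean of the data points weighted by $m_{ik}^q$), which the slides only describe as "compute the new centre of each class"; q = 2; the natural logarithm; the fixed starting centres; the stopping tolerance; and the function names. The resulting numbers therefore need not match the matrix shown on the slide.

```python
import math

# Table 1 intensities (475 Hz, 557 Hz), log-transformed (natural log assumed).
TILES = [
    (0.958, 0.003), (1.043, 0.001), (1.907, 0.003), (0.780, 0.002), (0.579, 0.001),
    (0.003, 0.105), (0.001, 1.748), (0.014, 1.839), (0.007, 1.021), (0.004, 0.214),
]
POINTS = [(math.log(a), math.log(b)) for a, b in TILES]


def fuzzy_memberships(point, centres, q=2.0):
    """Memberships of one point: m_i = 1 / sum_j (d_i / d_j)^(2 / (q - 1))."""
    d = [math.dist(point, c) for c in centres]
    if 0.0 in d:  # the point sits exactly on a centre: full membership there
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (q - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in d) for di in d]


def fuzzy_c_means(points, start_centres, q=2.0, tol=1e-6, max_iter=200):
    """Alternate membership and centre updates until the centres stop moving."""
    centres = [tuple(c) for c in start_centres]
    n_dims = len(points[0])
    for _ in range(max_iter):
        M = [fuzzy_memberships(p, centres, q) for p in points]
        new_centres = []
        for i in range(len(centres)):
            # Standard FCM centre update: mean of the points weighted by m^q.
            weights = [row[i] ** q for row in M]
            total = sum(weights)
            new_centres.append(tuple(
                sum(w * p[dim] for w, p in zip(weights, points)) / total
                for dim in range(n_dims)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    M = [fuzzy_memberships(p, centres, q) for p in points]
    return M, centres


# Assumption: fixed starting centres so that the run is reproducible.
M, centres = fuzzy_c_means(POINTS, [(-1.0, -5.0), (-5.0, -1.0)])
for row in M:
    print(["%.4f" % m for m in row])
```

Each row of M sums to 1, and every tile has a clear majority membership in one of the two clusters, with a small remainder in the other.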
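  • 16. Fuzzy membership matrix M

       $$ m_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{2/(q-1)}} = \frac{1}{\left( \dfrac{d_{ik}}{d_{1k}} \right)^{2/(q-1)} + \left( \dfrac{d_{ik}}{d_{2k}} \right)^{2/(q-1)} + \cdots + \left( \dfrac{d_{ik}}{d_{ck}} \right)^{2/(q-1)}} = \frac{\dfrac{1}{d_{ik}^{2/(q-1)}}}{\dfrac{1}{d_{1k}^{2/(q-1)}} + \dfrac{1}{d_{2k}^{2/(q-1)}} + \cdots + \dfrac{1}{d_{ck}^{2/(q-1)}}} $$

       Gravitation to cluster i relative to total gravitation.

       Electrical Analogy. [Figure: resistors R1 and R2 in parallel, with branch currents i1 and i2, voltage U, and total current I.]

       $$ U = I R, \qquad \frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2} + \cdots + \frac{1}{R_c} $$

       $$ \frac{i_i}{I} = \frac{U / R_i}{U / R} = \frac{\dfrac{1}{R_i}}{\dfrac{1}{R_1} + \dfrac{1}{R_2} + \cdots + \dfrac{1}{R_c}} = \frac{1}{\dfrac{R_i}{R_1} + \dfrac{R_i}{R_2} + \cdots + \dfrac{R_i}{R_c}} $$

       Same form as $m_{ik}$.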
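The analogy can be checked numerically. In the sketch below, the share of the total current flowing through each of several parallel resistors is compared with the fuzzy memberships of a point, taking q = 2 and setting each resistance to the squared distance to the corresponding centre. The distances are hypothetical values and the function names are chosen for this example only.

```python
def current_shares(resistances):
    """Fraction of the total current through each resistor in a parallel circuit."""
    conductances = [1.0 / r for r in resistances]
    total = sum(conductances)
    return [g / total for g in conductances]


def memberships(distances, q=2.0):
    """Fuzzy memberships m_i = 1 / sum_j (d_i / d_j)^(2 / (q - 1))."""
    power = 2.0 / (q - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in distances) for di in distances]


# Hypothetical distances from one data point to three cluster centres.
distances = [1.0, 2.0, 4.0]
# With q = 2 the exponent 2 / (q - 1) is 2, so R_i plays the role of d_i squared.
resistances = [d ** 2 for d in distances]

shares = current_shares(resistances)
m = memberships(distances, q=2.0)
print(shares)
print(m)
```

The two lists agree: the nearest centre (smallest resistance) takes the largest share, and the shares sum to 1 just as the memberships do.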
  • 17. Fuzzy Membership. [Figure: membership of a test point (a data point) plotted against cluster centres 1 to 5; o is with q = 1.1, * is with q = 2.]

       Fuzzy c-partition

       $$ \bigcup_{i=1}^{c} C_i = U $$
       All clusters C together fill the whole universe U. Remark: The sum of memberships for a data point is 1, and the total for all points is K.

       $$ C_i \cap C_j = \emptyset \quad \text{for all } i \ne j $$
       Not valid: Clusters do overlap.

       $$ \emptyset \subset C_i \subset U \quad \text{for all } i $$
       A cluster C is never empty and it is smaller than the whole universe U.

       $$ 2 \le c \le K $$
       There must be at least 2 clusters in a c-partition and at most as many as the number of data points K.
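The effect of the fuzziness exponent q can be sketched with a one-dimensional example. The five centre positions and the test point below are hypothetical, chosen only to mimic the setting of the figure (five cluster centres, one test point, q = 1.1 and q = 2); the function name is also an assumption.

```python
def memberships_1d(x, centres, q):
    """Fuzzy memberships of the scalar x: m_i = 1 / sum_j (d_i / d_j)^(2 / (q - 1))."""
    d = [abs(x - c) for c in centres]
    power = 2.0 / (q - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in d) for di in d]


centres = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical cluster centres
test_point = 2.2                     # hypothetical test point, not on any centre

crisp = memberships_1d(test_point, centres, q=1.1)
soft = memberships_1d(test_point, centres, q=2.0)
print(["%.3f" % m for m in crisp])
print(["%.3f" % m for m in soft])
```

With q close to 1 the memberships are nearly hard: almost all of the membership goes to the nearest centre. With q = 2 the neighbouring centres receive a visible share. In both cases the memberships of the test point sum to 1, in line with the remark on the slide.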
  • 18. Example: Classify cancer cells. [Figure: a normal smear and a severely dysplastic smear.] Using a small brush, cotton stick, or wooden stick, a specimen is taken from the uterine cervix and smeared onto a thin, rectangular glass plate, a slide. The purpose of the smear screening is to diagnose pre-malignant cell changes before they progress to cancer. The smear is stained using the Papanicolaou method, hence the name Pap smear. Different characteristics take on different colours, which are easy to distinguish in a microscope. A cyto-technician performs the screening in a microscope. It is time-consuming and prone to error, as each slide may contain up to 300,000 cells. Dysplastic cells have undergone precancerous changes. They generally have longer and darker nuclei, and they have a tendency to cling together in large clusters. Mildly dysplastic cells have enlarged and bright nuclei. Moderately dysplastic cells have larger and darker nuclei. Severely dysplastic cells have large, dark, and often oddly shaped nuclei. The cytoplasm is dark, and it is relatively small.

       Possible Features
       - Nucleus and cytoplasm area
       - Nucleus and cyto brightness
       - Nucleus shortest and longest diameter
       - Cyto shortest and longest diameter
       - Nucleus and cyto perimeter
       - Nucleus and cyto no of maxima
       - (...)
  • 19. Classes are nonseparable. Hard Classifier (HCM). [Figure: cells coloured by class; legend: Ok, light, moderate, severe.] A cell is either one or the other class defined by a colour.
  • 20. Fuzzy Classifier (FCM). [Figure: cells coloured by class; legend: Ok, light, moderate, severe.] A cell can belong to several classes to a degree, i.e., one column may have several colours. Function approximation. [Figure: plot of Output1 against Input.] Curve fitting in a multi-dimensional space is also called function approximation. Learning is equivalent to finding a function that best fits the training data.
  • 21. Approximation by fuzzy sets. [Figure: two plots illustrating approximation by fuzzy sets.] Procedure to find a model: 1. Acquire data. 2. Select structure. 3. Find clusters, generate model. 4. Validate model.
  • 22. Conclusions
       - Compared to neural networks, fuzzy models can be interpreted by human beings.
       - Applications: system identification, adaptive systems.

       Links
       - J. Jantzen: Neurofuzzy Modelling. Technical University of Denmark: Oersted-DTU, Tech report no 98-H-874 (nfmod), 1998. URL http://fuzzy.iau.dtu.dk/download/nfmod.pdf
       - PapSmear tutorial. URL http://fuzzy.iau.dtu.dk/smear/
       - U. Kaymak: Data Driven Fuzzy Modelling. PowerPoint, URL http://fuzzy.iau.dtu.dk/tutor/ddfm.htm
  • 23. Exercise: fuzzy clustering (Matlab)
       - Download and follow the instructions in this text file: http://fuzzy.iau.dtu.dk/tutor/fcm/exerF5.txt
       - The exercise requires Matlab (no special toolboxes are required).