Knowledge-Based Clustering
1. Knowledge-Based Clustering
An intelligent way to find groups in your data
2. Contents
Knowledge-Based Clustering (KBC)
Fuzzy Clustering and FCM
Conditional Fuzzy Clustering and CFCM
Clustering With Partial Supervision
Collaborative Clustering
Directional Clustering
Fuzzy Relational Clustering
Christos N. Zigkolis Aristotle University of Thessaloniki 2
3. Some reasonable questions…
What type of clustering is KBC?
“Partitional Clustering”
What are the differences from “conventional” clustering?
“Data-Centric VS Human-Centric”
What are the basic concepts of KBC?
“Information Granules, Fuzzy Clustering, Objective Function-Based Techniques”
4. Data Clustering
Data-Centric Approaches
• Partitional Clustering – PC
  – Hard Clustering (K-Means)
  – Soft Clustering: Fuzzy Clustering – FC (Fuzzy C-Means)
• Hierarchical Clustering – HC (Agglomerative HC)
------------------------------------------------------------------
Human-Centric Approaches
• Knowledge-Based Clustering
5. Objective Function-Based Clustering Techniques
min or max(obj_function) => better clustering
Our GOAL is: to formulate an objective function capable of reflecting the nature of the problem, so that its min() or max() reveals a meaningful structure in the dataset.
6. Fuzzy Clustering
“The Big Bang for KBC”
Binary Character of Partitions
0 || 1
VS
Fuzzy Logic – Partial Membership
[0, 1]
7. Fuzzy Clustering (2)
“The Big Bang for KBC”
K-Means + Fuzzy Logic = Fuzzy C-Means
“Yet another clustering procedure… what is so special about it?”
It can deal with patterns of borderline character, contrary to K-Means.
[prototypes, U] = fcm( X_data, C)
9. Fuzzy Clustering (3)
“The Big Bang for KBC”
Input
• X_data [N x p]
• m : fuzzification coefficient (> 1)
• C : number of clusters
• initialized U [C x N] matrix
Iterative Process
1. Compute the prototypes
2. Compute the U matrix
3. Compute the value of the objective function and stop the process if this value is lower than a criterion e
Output
• prototypes [C x p]
• U matrix [C x N]
10. “Stop talking and show us the maths”

(1) Prototypes:
$$\mathrm{prot}_i = \frac{\sum_{j=1}^{N} u_{ij}^{m} X_j}{\sum_{j=1}^{N} u_{ij}^{m}}$$

(2) Membership values, under the restriction $\sum_{i=1}^{C} u_{ij} = 1$:
$$u_{ij} = \frac{1}{\sum_{l=1}^{C} \left( \frac{\| X_j - \mathrm{prot}_i \|}{\| X_j - \mathrm{prot}_l \|} \right)^{2/(m-1)}}$$

(3) Objective function, with stopping criterion $Q < e$:
$$Q = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \, \| X_j - \mathrm{prot}_i \|^{2}$$
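The three update formulas above can be sketched as a short program. The following is a minimal illustration, not the reference implementation: the signature mirrors the slides' `[prototypes, U] = fcm(X_data, C)` pseudocode, and it assumes the common stopping variant of halting when the objective stops changing by more than e.

```python
import numpy as np

def fcm(X, C, m=2.0, e=1e-5, max_iter=100, seed=0):
    """Minimal Fuzzy C-Means sketch: X is [N x p]; returns (prototypes [C x p], U [C x N])."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((C, N))
    U /= U.sum(axis=0, keepdims=True)              # restriction: columns sum to 1
    q_prev = np.inf
    for _ in range(max_iter):
        Um = U ** m
        prototypes = (Um @ X) / Um.sum(axis=1, keepdims=True)          # eq. (1)
        d = np.linalg.norm(X[None, :, :] - prototypes[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)                      # guard against zero distances
        U = d ** (-2.0 / (m - 1.0))                # eq. (2), then renormalize columns
        U /= U.sum(axis=0, keepdims=True)
        q = np.sum((U ** m) * d ** 2)              # eq. (3)
        if abs(q_prev - q) < e:                    # stop when Q stops changing
            break
        q_prev = q
    return prototypes, U
```

Contrary to K-Means, each column of U spreads a pattern's membership across all C clusters, so borderline patterns receive intermediate grades instead of a hard 0/1 assignment.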
11. Fuzzy Clustering (4)
“The Big Bang for KBC”
Examples
• Fuzzy c-Means Clustering of Incomplete Data
“Modified versions of standard FCM are applied for dealing
with data with missing feature values”
• FCM-Based Model Selection Algorithms for Determining the
Number of Clusters
“Determining the number of clusters in a given data set and a
new validity index for measuring the “goodness” of clustering”
12. Conditional Fuzzy Clustering
“The presence of side information”
FROM
UNSUPERVISED LEARNING
TO
SEMI-SUPERVISED LEARNING
We mark our patterns according to a condition; these marks are the side information that can guide the clustering process toward more meaningful results.
13. Conditional Fuzzy Clustering
“The presence of side information”
(1) Xdata [N x p]
(2) Condition(s)
(3) Zk [1 x N] (patterns’ marks)
(4) Scaling Function
(5) Fk [1 x N] (scaled patterns’ marks)
[prototypes, U] = CFCM(Xdata, Fk, C)
14. Conditional Fuzzy Clustering(2)
“The presence of side information”
Formulation Differences from FCM

Restriction: $\sum_{i=1}^{C} u_{ij} = F_j$, which leads to
$$u_{ij} = \frac{F_j}{\sum_{l=1}^{C} \left( \frac{\| X_j - \mathrm{prot}_i \|}{\| X_j - \mathrm{prot}_l \|} \right)^{2/(m-1)}}$$
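The changed restriction can be seen directly in code. Below is a sketch of just the CFCM membership update (the names `d`, `F`, and `cfcm_membership` are assumptions for illustration; `F` holds the scaled condition values F_j):

```python
import numpy as np

def cfcm_membership(d, F, m=2.0):
    """CFCM membership update: d is [C x N] distances ||X_j - prot_i||,
    F is [N] scaled condition values; each column of U sums to F_j instead of 1."""
    d = np.fmax(d, 1e-12)                  # guard against zero distances
    W = d ** (-2.0 / (m - 1.0))
    return F[None, :] * W / W.sum(axis=0, keepdims=True)
```

The prototype update is unchanged from FCM; only the normalization target of each column moves from 1 to F_j.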
15. Conditional Fuzzy Clustering(3)
“The presence of side information”
Example
“Using CFCM to mine event-related brain dynamics”
by C.N. Zigkolis and N.A. Laskaris
“…a framework for mining event related dynamics based on Conditional
FCM (CFCM). CFCM enables prototyping in a principled manner. User-
defined constraints, which are imposed by the nature of experimental data
and/or dictated by the neuroscientist’s intuition, direct the process of
knowledge extraction and can robustify single-trial analysis…“
16. Clustering with Partial Supervision
“Label some, cluster all”
X = [X1, X2, ..., XN]
---------------------------------------------------------------------------
Labeled patterns: Y = [Y1, ..., YM]    Unlabeled patterns: Z = [Z1, ..., Z(N−M)]
---------------------------------------------------------------------------
X′ = Y ∪ Z
After labeling some patterns, we start the clustering process.
17. Clustering with Partial Supervision(2)
“Label some, cluster all”
How is this labeling going to help us?
• Labeling = Knowledge
• This Knowledge will guide the whole process
• The labeled patterns can be considered as a grid of anchor points with
which we get to the entire structure of the data set
What algorithmic changes do we need to include this partial supervision in the clustering process?
• The knowledge has to be included in the objective function
• The formulation of prototypes and U matrix takes another form
18. Clustering with Partial Supervision(3)
“Label some, cluster all”
Problem Formulation
Extra Structures :
• b = [b1, b2, …, bN] the vector of labels, bi=0|1 indicates if a
pattern is labeled or not.
• F[CxN] = [fij] a partition matrix which contains the membership
values for labeled patterns. The columns that correspond to
unlabeled data have zero values.
• α : a nonnegative weight factor for setting up a suitable balance between the supervised and unsupervised modes of learning
19. Clustering with Partial Supervision(4)
“Label some, cluster all”
Problem Formulation (cont..)
$$Q = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \, \| X_j - \mathrm{prot}_i \|^{2} + \alpha \sum_{i=1}^{C} \sum_{j=1}^{N} \left( u_{ij} - f_{ij} b_j \right)^{2} \| X_j - \mathrm{prot}_i \|^{2}$$

The extra term is the augmentation we need: it addresses the effect of partial supervision.
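As a sanity check on the augmented objective, here is a small illustrative function (names are assumptions; `F` is the [C x N] reference-membership matrix and `b` the 0/1 label-indicator vector from the previous slide):

```python
import numpy as np

def partial_supervision_Q(X, prototypes, U, F, b, alpha, m=2.0):
    """Q = sum u_ij^m d_ij^2  +  alpha * sum (u_ij - f_ij b_j)^2 d_ij^2."""
    d2 = ((X[None, :, :] - prototypes[:, None, :]) ** 2).sum(axis=2)  # [C x N] squared distances
    unsupervised = np.sum((U ** m) * d2)
    supervised = np.sum(((U - F * b[None, :]) ** 2) * d2)
    return unsupervised + alpha * supervised
```

With alpha = 0, or when U agrees with F on every labeled pattern, the expression collapses to the plain FCM objective, which is exactly the balancing role described above.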
20. Clustering with Partial Supervision(5)
“Label some, cluster all”
Examples
• Handwritten digits • Reliance on a training set
21. Clustering with Partial Supervision(6)
“Label some, cluster all”
Real Example
• Partially Supervised Clustering for Image Segmentation
“This paper describes a new method (ssFCM) for
classification. The method is well suited to problems such as
the segmentation of Magnetic Resonance Images (MRI). A
small set of labeled pixels provides a clustering algorithm
with a form of partial supervision”
22. Collaborative Clustering
“All for one and one for all”
What if we have to deal with several data sets and we
are interested in revealing a global structure?
“The concept of collaboration : We process each data set separately
and we have a collaboration by exchanging information about the
individual results”
Why don’t we put everything in one data set and do our job?
“The paradigm of different organizations with different databases: we don’t have access to the others’ sources, but we appreciate any external assisting information”
23. Collaborative Clustering(2)
“All for one and one for all”
Horizontal Collaborative Clustering
X[1],X[2],..,X[p] data sets
Same objects but in
different feature spaces
ex. Same patients in
different institute database
The collaboration / communication platform is built upon the individual partition matrices.
24. Collaborative Clustering(3)
“All for one and one for all”
Horizontal Collaborative Clustering
• matrix of connections : α[ii, jj] >= 0
• the higher the value, the stronger the collaboration between subsets
• matrix α is not necessarily symmetric: α[ii, jj] ≠ α[jj, ii]
25. Collaborative Clustering(4)
“All for one and one for all”
Horizontal Collaborative Clustering
Problem Formulation

$$Q[ii] = \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \| X_j[ii] - \mathrm{prot}_i[ii] \|^{2} + \sum_{\substack{jj=1 \\ jj \neq ii}}^{p} \alpha[ii, jj] \sum_{j=1}^{N} \sum_{i=1}^{C} \left\{ u_{ij}[ii] - u_{ij}[jj] \right\}^{m} \| X_j[ii] - \mathrm{prot}_i[ii] \|^{2}$$

The second term makes the clustering based on the iith subset “aware” of the other partitions. If the structures in the data sets are similar, then the differences between the U matrices tend to be lower and the resulting structures become more similar.
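A compact illustration of Q[ii] above (assumed names: `U_list` holds the partition matrices of all p data sets, `alpha` is the connection matrix; `np.abs` keeps the power well defined for non-integer m):

```python
import numpy as np

def horizontal_Q(ii, X_list, prot_list, U_list, alpha, m=2.0):
    """Objective for subset ii: its own FCM term plus collaboration terms
    weighted by alpha[ii, jj] for every other subset jj."""
    d2 = ((X_list[ii][None, :, :] - prot_list[ii][:, None, :]) ** 2).sum(axis=2)
    q = np.sum((U_list[ii] ** m) * d2)
    for jj in range(len(X_list)):
        if jj != ii:
            diff = np.abs(U_list[ii] - U_list[jj]) ** m
            q += alpha[ii, jj] * np.sum(diff * d2)
    return q
```

When two partitions coincide, the collaboration term vanishes, matching the remark above that similar structures drive the differences between the U matrices toward zero.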
26. Collaborative Clustering(5)
“All for one and one for all”
Vertical Collaborative Clustering
X[1],X[2],..,X[p] different data sets
Same feature space, different objects
ex. Auditory evoked responses
3 conditions/datasets (attentive,
stimulation, spontaneous activity)
We have the collaboration /
communication at the level of the
prototypes
27. Collaborative Clustering(6)
“All for one and one for all”
Vertical Collaborative Clustering
Problem Formulation

$$Q[ii] = \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \| X_j[ii] - \mathrm{prot}_i[ii] \|^{2} + \sum_{\substack{jj=1 \\ jj \neq ii}}^{p} \beta[ii, jj] \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \| \mathrm{prot}_i[ii] - \mathrm{prot}_i[jj] \|^{2}$$

The second term articulates the differences between the prototypes.
28. Collaborative Clustering(7)
“All for one and one for all”
The 2 algorithmic Phases of Collaborative clustering
PHASE 1
Apply FCM to each data set; the number of clusters has to be the same for all data sets.
// compute prot_i[ii], i = 1, …, C and U[ii] for all subsets //
PHASE 2
Set up the collaboration level and run the optimization.
// compute α[ii, jj] (Horizontal Clust.) or β[ii, jj] (Vertical Clust.) and optimize the partition matrices //
29. Collaborative Clustering(8)
“All for one and one for all”
A combination of Horizontal and Vertical clustering
The Objective Function will be
a combination of the objective
functions from Horizontal and
Vertical Clustering
30. Collaborative Clustering(9)
“All for one and one for all”
Consensus Clustering
• Different objects – Same feature space – Lack of interaction
• Clustering of the prototypes produced from each data set = Meta-Clustering
• Different number of clusters C[1], C[2], …, C[p]
• Building meta-structure – A partition matrix in a higher level
• U at the higher level is formed on the basis of the
prototypes of the data sets
31. Collaborative Clustering(10)
“All for one and one for all”
Examples
• Semantic Content Analysis : A Study in Proximity-Based
Collaborative Clustering “clustering semantic web documents
under the collaboration of semantic and data view”
• Clustering in the framework of
collaborative agents
“…a model of collaborative clustering
(horizontal and vertical) realized over a
collection of data sets in which a
computing agent carries out an individual
clustering process”
32. Directional Clustering
“Direction except from relation”
X[1] and X[2] different data sets
• Our goal is to form a map
between the information
granules developed for these
two data sets.
• Clustering the data set X[1] is the first step. Then cluster the data set X[2] under 2 criteria:
1) reveal its granular structure; 2) this structure can be reached through a logic mapping of granules from data set X[1].
33. Directional Clustering(2)
“Direction except from relation”
Problem Formulation
X[1] data set: standard FCM objective function.
X[2] data set: we need an obj_func that addresses the two main objectives, relational and directional:

$$Q = \sum_{i=1}^{C[2]} \sum_{j=1}^{N} u_{ij}^{m}[2] \, \| X_j[2] - \mathrm{prot}_i[2] \|^{2} + \beta \sum_{i=1}^{C[2]} \sum_{j=1}^{N} \left( u_{ij}[2] - \phi_i(U[1]) \right)^{2} \| X_j[2] - \mathrm{prot}_i[2] \|^{2}$$
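The two facets of this objective can be made concrete in a few lines (assumed names; `phiU1` stands for the already-computed mapping φ(U[1]) of X[1]'s structure, brought into [C[2] x N] form):

```python
import numpy as np

def directional_Q(X2, prot2, U2, phiU1, beta, m=2.0):
    """Relational term (plain FCM on X[2]) plus the beta-weighted directional
    term that pulls U[2] toward the mapped structure phi(U[1])."""
    d2 = ((X2[None, :, :] - prot2[:, None, :]) ** 2).sum(axis=2)  # [C2 x N]
    relational = np.sum((U2 ** m) * d2)
    directional = np.sum(((U2 - phiU1) ** 2) * d2)
    return relational + beta * directional
```

When U[2] already matches the mapped granules of X[1], the directional term vanishes and only the relational facet remains, which is how β balances the two.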
34. Directional Clustering(2)
“Direction except from relation”
Problem Formulation (cont…)
• The first term of Q equation is for revealing structure in X[2]
(relational).
• The second term captures the differences between U[2] and
the mapping φ(.) of the structure detected in X[1] (directional).
• The factor β is for keeping a balance between the relational
and directional facets of the optimization
35. Directional Clustering(3)
“Direction except from relation”
Logic Transformations Between A and B Information Granules
How do we formulate THE Mapping? TWO APPROACHES
1. OR-Based Aggregation
Bi = (A1 t wi1) s (A2 t wi2) s … s (AC[1] t wiC[1])
t- and s-norms can be compared to the ∩ and ∪ operators, respectively.
The most commonly used t-norm is min(), and given the t-norm we can compute the s-norm via
a s b = 1 − (1 − a) t (1 − b)
36. Directional Clustering(4)
“Direction except from relation”
Logic Transformations Between A and B Information Granules
How do we formulate THE Mapping? TWO APPROACHES
2. AND-Based Aggregation
Bi = (A1 s wi1) t (A2 s wi2) t … t (AC[1] s wiC[1])
Which approach is best to use?
Empirically, OR-based when C[1] > C[2] and AND-based when C[1] < C[2].
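Both aggregations above can be sketched with the min t-norm and its dual s-norm (which for min works out to max); the function names are assumptions for illustration:

```python
import numpy as np

def t_norm(a, b):
    """min t-norm (plays the role of intersection / AND)."""
    return np.minimum(a, b)

def s_norm(a, b):
    """Dual s-norm via a s b = 1 - (1 - a) t (1 - b); for min this is max."""
    return 1.0 - t_norm(1.0 - a, 1.0 - b)

def or_aggregation(A, w):
    """B_i = (A_1 t w_i1) s (A_2 t w_i2) s ... s (A_C1 t w_iC1)."""
    terms = t_norm(A, w)
    out = terms[0]
    for term in terms[1:]:
        out = s_norm(out, term)
    return out

def and_aggregation(A, w):
    """B_i = (A_1 s w_i1) t (A_2 s w_i2) t ... t (A_C1 s w_iC1)."""
    terms = s_norm(A, w)
    out = terms[0]
    for term in terms[1:]:
        out = t_norm(out, term)
    return out
```

Here `A` is the vector of C[1] membership values and `w` the row of weights w_i; swapping the roles of the t- and s-norms is the only difference between the two approaches.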
37. Directional Clustering(5)
“Direction except from relation”
Examples
• Directional fuzzy clustering and its application to fuzzy modelling
“presentation of the technique and its role in a two-phase fuzzy
identification scheme”
38. Fuzzy Relational Clustering
“Focusing on pairs of patterns”
FROM
patterns with vector features
TO
relational patterns with degrees of dissimilarity
• N cities, with distances dij between pairs of them:
the matrix of distances contains the relational patterns
• Compare faces in a pair-wise manner and compute proximity degrees (relational patterns)
39. Fuzzy Relational Clustering(2)
“Focusing on pairs of patterns”
FCM for relational data
The input of the algorithm is the dissimilarity matrix R = [Rij], which contains the degrees of dissimilarity between pairs of patterns instead of the original patterns.
Similarity matrix: Dij = 1 − Rij
40. Fuzzy Relational Clustering(3)
“Focusing on pairs of patterns”
Examples
• Low-complexity fuzzy relational clustering algorithms for Web
mining
“new Fuzzy Relational Clustering techniques in Web Mining*:
(1)FCMdd (Fuzzy C Medoids) and (2)RFCMdd (Robust Fuzzy
C Medoids)
Comparison tests with standard RFCM”
*Web document clustering, snippet clustering and Web access log analysis
41. References
W. Pedrycz, “Knowledge-Based Clustering: From Data to Information Granules”
Fuzzy c-Means Clustering of Incomplete Data
FCM-Based Model Selection Algorithms for Determining the
Number of Clusters
Using CFCM to mine event-related brain dynamics
Partially Supervised Clustering for Image Segmentation
42. References
Semantic Content Analysis : A Study in Proximity-Based
Collaborative Clustering
Clustering in the framework of collaborative agents
Directional fuzzy clustering and its application to fuzzy modeling
Low-complexity fuzzy relational clustering algorithms for Web
mining