Knowledge Based Clustering
An Intelligent way to find groups in your data
Contents
      Knowledge-Based Clustering (KBC)
      Fuzzy Clustering and FCM
      Conditional Fuzzy Clustering and CFCM
      Clustering With Partial Supervision
      Collaborative Clustering
      Directional Clustering
      Fuzzy Relational Clustering

Christos N. Zigkolis   Aristotle University of Thessaloniki   2
Some reasonable questions…
          What type of clustering is KBC?
                       “Partitional Clustering”

          How does it differ from “conventional” clustering?
                       “Data-Centric vs. Human-Centric”

          What are the basic concepts of KBC?
                       “Information Granules, Fuzzy Clustering,
                       Objective Function-Based Techniques”

Data Clustering

  Data-Centric Approaches
    Hierarchical Clustering – HC (Agglomerative HC)
    Partitional Clustering – PC
      Hard Clustering (K-Means)
      Soft Clustering: Fuzzy Clustering – FC (Fuzzy C-Means)
------------------------------------------------------------------
  Human-Centric Approaches
    Knowledge Based Clustering
Objective Function-Based
  Clustering Techniques

 min or max(obj_function)   =>   better clustering

 Our GOAL is: to formulate an objective function that is capable of
 reflecting the nature of the problem, so that its min() or max()
 reveals a meaningful structure in the data set.
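
As a minimal illustration of this idea (not taken from the slides), the sketch below scores two candidate hard partitions of a toy data set with one common objective function, the sum of squared distances to the cluster means. The function name `sse`, the data, and the two partitions are hypothetical.

```python
import numpy as np

def sse(X, labels):
    """Sum of squared distances of each pattern to the mean of its cluster."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return total

# Toy data: two visibly separated groups on a line (illustrative values).
X = np.array([[0.0], [1.0], [10.0], [11.0]])
good = np.array([0, 0, 1, 1])   # groups neighbouring patterns together
bad = np.array([0, 1, 0, 1])    # mixes the two groups

print(sse(X, good), sse(X, bad))
```

The partition that matches the visible structure yields the smaller objective value, which is the sense in which minimizing the function "reveals" the structure.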

Fuzzy Clustering
“The Big Bang for KBC”

Binary character of partitions: membership values 0 || 1
VS
Fuzzy Logic – partial membership: membership values in [0, 1]
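
A small illustration of this contrast, assuming NumPy and made-up membership values: both matrices assign three patterns to two clusters, but only the fuzzy one can express that the second pattern sits on the border between them.

```python
import numpy as np

# Rows are clusters, columns are patterns (C x N).
U_hard = np.array([[1, 1, 0],
                   [0, 0, 1]])          # every entry is 0 or 1
U_fuzzy = np.array([[0.9, 0.5, 0.2],
                    [0.1, 0.5, 0.8]])   # every entry lies in [0, 1]

# In both cases the memberships of each pattern sum to 1 across clusters.
print(U_hard.sum(axis=0), U_fuzzy.sum(axis=0))
```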

Fuzzy Clustering (2)
  “The Big Bang for KBC”


K-Means + Fuzzy Logic = Fuzzy C-Means

“Yet another clustering procedure… What is so special about it?”
Unlike K-Means, it can deal with patterns of borderline character.

                         [prototypes, U] = fcm( X_data, C)


Fuzzy Clustering (3)
  “The Big Bang for KBC”
Input
• X_data [N x p]
• m : fuzzification coefficient, m > 1
• C : number of clusters
• initialized U [C x N] matrix

Iterative Process
1. Compute the prototypes
2. Compute the U matrix
3. Compute the value of the objective function and stop the process if
   this value is lower than a criterion e

Output
• prototypes [C x p]
• U matrix [C x N]
“Stop talking and show us the maths”
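(1) Prototypes:

$$\mathrm{prot}_i = \frac{\sum_{j=1}^{N} u_{ij}^{m} X_j}{\sum_{j=1}^{N} u_{ij}^{m}}$$

(2) Memberships, under the restriction $\sum_{i=1}^{C} u_{ij} = 1$:

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\lVert X_j - \mathrm{prot}_i \rVert}{\lVert X_j - \mathrm{prot}_k \rVert} \right)^{2/(m-1)}}$$

(3) Objective function and stopping test:

$$Q = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \lVert X_j - \mathrm{prot}_i \rVert^{2} < e$$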
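
The three formulas can be sketched in Python with NumPy. This is a minimal sketch, not the author's implementation: the function name `fcm` mirrors the call shown on the earlier slide, while the arguments `eps`, `max_iter` and `seed`, the random initialization of U, and stopping on a small change in Q (rather than on Q itself) are assumptions made for illustration.

```python
import numpy as np

def fcm(X, C, m=2.0, eps=1e-6, max_iter=300, seed=0):
    """Fuzzy C-Means sketch. X is [N x p]; returns prototypes [C x p] and U [C x N]."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((C, N))
    U /= U.sum(axis=0)                 # enforce the restriction: columns sum to 1
    Q_old = np.inf
    for _ in range(max_iter):
        Um = U ** m
        # (1) prototypes as membership-weighted means
        prototypes = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # squared distances ||X_j - prot_i||^2, shape [C x N]
        d2 = ((X[None, :, :] - prototypes[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)     # guard against division by zero
        # (2) memberships; with squared distances the exponent is 1/(m-1)
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0)
        # (3) objective function
        Q = ((U ** m) * d2).sum()
        if abs(Q_old - Q) < eps:       # assumption: stop on a small change in Q
            break
        Q_old = Q
    return prototypes, U
```

A call such as `prototypes, U = fcm(X_data, C=2)` returns the two outputs listed on the previous slide; taking `U.argmax(axis=0)` turns the fuzzy partition into hard labels when those are needed.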


Fuzzy Clustering (4)
  “The Big Bang for KBC”
Examples

• Fuzzy c-Means Clustering of Incomplete Data
“Modified versions of standard FCM are applied for dealing
with data with missing feature values”
• FCM-Based Model Selection Algorithms for Determining the
Number of Clusters
“Determining the number of clusters in a given data set and a
new validity index for measuring the “goodness” of clustering”


Conditional Fuzzy Clustering
 “The presence of the aside information”

                                      FROM
                         UNSUPERVISED LEARNING
                                           TO
                        SEMI-SUPERVISED LEARNING


We mark our patterns according to a condition and these
marks are the aside information which can guide our
clustering process to give more meaningful results.

Conditional Fuzzy Clustering
 “The presence of the aside information”
Xdata [N x p]
  --(1)--> Condition(s)
  --(2)--> Zk [1 x N] (patterns’ marks)
  --(3)--> Scaling Function
  --(4)--> Fk [1 x N] (scaled patterns’ marks)
  --(5)--> [prototypes, U] = CFCM(Xdata, Fk, C)
Conditional Fuzzy Clustering(2)
“The presence of the aside information”

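Formulation Differences from FCM

The restriction becomes $\sum_{i=1}^{C} u_{ij} = F_j$, which gives

$$u_{ij} = \frac{F_j}{\sum_{k=1}^{C} \left( \frac{\lVert X_j - \mathrm{prot}_i \rVert}{\lVert X_j - \mathrm{prot}_k \rVert} \right)^{2/(m-1)}}$$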
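
The modified membership update can be sketched as follows. The function name `cfcm_memberships`, the toy data, and the use of min-max scaling as the scaling function are assumptions made for illustration; the slides do not specify which scaling function is used.

```python
import numpy as np

def cfcm_memberships(X, prototypes, F, m=2.0):
    """Conditional FCM membership update: column j of U sums to F[j]."""
    d2 = ((X[None, :, :] - prototypes[:, None, :]) ** 2).sum(axis=2)  # [C x N]
    d2 = np.maximum(d2, 1e-12)          # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return F[None, :] * inv / inv.sum(axis=0)

# Hypothetical marks Z (one per pattern), scaled to [0, 1] by min-max scaling.
Z = np.array([2.0, 4.0, 6.0, 10.0])
F = (Z - Z.min()) / (Z.max() - Z.min())

X = np.array([[0.0], [1.0], [9.0], [10.0]])
prototypes = np.array([[0.5], [9.5]])
U = cfcm_memberships(X, prototypes, F)
print(U.sum(axis=0))   # matches F column by column, up to rounding
```

A pattern whose scaled mark is 0 receives zero membership in every cluster, so it has no influence on the prototypes; this is how the condition steers the clustering.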




Conditional Fuzzy Clustering(3)
  “The presence of the aside information”
Example

“Using CFCM to mine event-related brain dynamics”
by C.N. Zigkolis and N.A. Laskaris


“…a framework for mining event related dynamics based on Conditional
FCM (CFCM). CFCM enables prototyping in a principled manner. User-
defined constraints, which are imposed by the nature of experimental data
and/or dictated by the neuroscientist’s intuition, direct the process of
knowledge extraction and can robustify single-trial analysis…“


Clustering with Partial Supervision
   “Label some, cluster all”

                          X = [X1, X2, ..., XN]
---------------------------------------------------------------------------
  Labeled patterns: Y = [Y1, ..., YM]        Unlabeled patterns: Z = [Z1, ..., ZN-M]
---------------------------------------------------------------------------
                             X' = Y ∪ Z
   After labeling some patterns we start the clustering process
Clustering with Partial Supervision(2)
 “Label some, cluster all”
          How is this labeling going to help us?
• Labeling = Knowledge
• This knowledge will guide the whole process
• The labeled patterns can be considered as a grid of anchor points
through which we reach the entire structure of the data set

          What algorithmic changes do we need in order to include this
          partial supervision in the clustering process?
• The knowledge has to be included in the objective function
• The formulation of the prototypes and the U matrix takes another form

Clustering with Partial Supervision(3)
  “Label some, cluster all”
Problem Formulation
Extra Structures :
• b = [b1, b2, …, bN]: the vector of labels; bj = 1 if pattern j is
labeled and bj = 0 otherwise.
• F [C x N] = [fij]: a partition matrix which contains the membership
values of the labeled patterns. The columns that correspond to
unlabeled patterns have zero values.
• α: nonnegative weight factor that sets a suitable balance
between the supervised and unsupervised modes of learning

Clustering with Partial Supervision(4)
  “Label some, cluster all”
Problem Formulation (cont..)
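$$Q = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \lVert X_j - \mathrm{prot}_i \rVert^{2} + \alpha \sum_{i=1}^{C} \sum_{j=1}^{N} \left( u_{ij} - f_{ij} b_j \right)^{2} \lVert X_j - \mathrm{prot}_i \rVert^{2}$$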




 The extra term is the augmentation we need. It addresses the
 effect of partial supervision
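
To make the two terms concrete, here is a minimal sketch that evaluates this objective for given prototypes and memberships. The function name and the toy values are hypothetical; the array shapes follow the structures defined on the previous slide (U and F are [C x N], b has length N).

```python
import numpy as np

def partial_supervision_objective(X, prototypes, U, F, b, alpha, m=2.0):
    """Q = sum u_ij^m d_ij^2 + alpha * sum (u_ij - f_ij b_j)^2 d_ij^2."""
    d2 = ((X[None, :, :] - prototypes[:, None, :]) ** 2).sum(axis=2)  # [C x N]
    unsupervised = ((U ** m) * d2).sum()
    supervised = (((U - F * b[None, :]) ** 2) * d2).sum()
    return unsupervised + alpha * supervised

# Two patterns, two clusters; only the first pattern is labeled (b = [1, 0]).
X = np.array([[0.0], [2.0]])
prototypes = np.array([[0.0], [2.0]])
U = np.array([[1.0, 0.5],
              [0.0, 0.5]])
F = np.array([[1.0, 0.0],
              [0.0, 0.0]])
b = np.array([1.0, 0.0])

print(partial_supervision_objective(X, prototypes, U, F, b, alpha=0.5))
```

With `alpha=0` the function reduces to the plain FCM objective, and larger values of `alpha` give the labels more weight.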



Clustering with Partial Supervision(5)
 “Label some, cluster all”
 Examples

• Handwritten digits
• Reliance? of a training set




Clustering with Partial Supervision(6)
 “Label some, cluster all”
 Real Example
• Partially Supervised Clustering for Image Segmentation

“This paper describes a new method (ssFCM) for
classification. The method is well suited to problems such as
the segmentation of Magnetic Resonance Images (MRI). A
small set of labeled pixels provides a clustering algorithm
with a form of partial supervision”




Collaborative Clustering
“All for one and one for all”

          What if we have to deal with several data sets and we
          are interested in revealing a global structure?
“The concept of collaboration: we process each data set separately
and collaborate by exchanging information about the
individual results”


        Why don’t we put everything in one data set and do our
        job?
“The paradigm of different organizations with different databases. We
don’t have access to the others’ sources, but we appreciate any external
assisting information”
Collaborative Clustering(2)
   “All for one and one for all”
  Horizontal Collaborative Clustering

X[1],X[2],..,X[p] data sets
Same objects but in
different feature spaces
e.g. the same patients in
different institutes’ databases

 The collaboration / communication takes place at the level of
 the individual partition matrices

Collaborative Clustering(3)
 “All for one and one for all”
Horizontal Collaborative Clustering

• matrix of connections: α[ii,jj] >= 0
• the higher the value, the
stronger the collaboration between
subsets
• matrix α is not necessarily
symmetric: in general α[ii, jj] ≠ α[jj, ii]




Collaborative Clustering(4)
     “All for one and one for all”
 Horizontal Collaborative Clustering
 Problem Formulation
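$$Q[ii] = \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \lVert X_j[ii] - \mathrm{prot}_i[ii] \rVert^{2} + \sum_{jj=1,\, jj \neq ii}^{p} \alpha[ii, jj] \sum_{j=1}^{N} \sum_{i=1}^{C} \{ u_{ij}[ii] - u_{ij}[jj] \}^{m} \lVert X_j[ii] - \mathrm{prot}_i[ii] \rVert^{2}$$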



The second term makes the clustering based on the ii-th subset
“aware” of the other partitions. If the structures in the data sets are
similar, the differences between the partition matrices U tend to be
lower and the resulting structures become more similar

Collaborative Clustering(5)
  “All for one and one for all”
Vertical Collaborative Clustering

X[1],X[2],..,X[p] different data sets
Same feature space, different objects

e.g. auditory evoked responses,
3 conditions/data sets (attentive,
stimulation, spontaneous activity)

Collaboration / communication takes place
at the level of the prototypes
Collaborative Clustering(6)
 “All for one and one for all”
Vertical Collaborative Clustering
Problem Formulation
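$$Q[ii] = \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \lVert X_j[ii] - \mathrm{prot}_i[ii] \rVert^{2} + \sum_{jj=1,\, jj \neq ii}^{p} \beta[ii, jj] \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \lVert \mathrm{prot}_i[ii] - \mathrm{prot}_i[jj] \rVert^{2}$$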



The second term articulates the differences between the
prototypes
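
A minimal sketch of how this objective could be evaluated for one data set. The function name, argument names and toy values are hypothetical, and the sketch assumes that cluster i in one data set corresponds to cluster i in the others, so that prototypes can be compared index by index.

```python
import numpy as np

def vertical_objective(X, prototypes, U, other_prototypes, beta, m=2.0):
    """Q[ii] for one data set; other_prototypes and beta list the jj != ii partners."""
    Um = U ** m
    d2 = ((X[None, :, :] - prototypes[:, None, :]) ** 2).sum(axis=2)   # [C x N]
    Q = (Um * d2).sum()                                                # FCM term
    for beta_jj, prot_jj in zip(beta, other_prototypes):
        gap2 = ((prototypes - prot_jj) ** 2).sum(axis=1)               # [C]
        Q += beta_jj * (Um * gap2[:, None]).sum()                      # prototype gap
    return Q

X = np.array([[0.0], [2.0]])
prototypes = np.array([[0.0], [2.0]])
U = np.array([[1.0, 0.0],
              [0.0, 1.0]])
# One collaborating data set whose first prototype differs by 1.
others = [np.array([[1.0], [2.0]])]

print(vertical_objective(X, prototypes, U, others, beta=[0.5]))
```

Here the FCM term is zero because each pattern coincides with its prototype, so the whole value comes from the disagreement with the collaborating data set's prototypes.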

Collaborative Clustering(7)
    “All for one and one for all”
  The 2 algorithmic phases of collaborative clustering

PHASE 1
Apply FCM to each data set; the number of clusters has to be the same for all
data sets.
// compute proti[ii], i=1,…,C and U[ii] for all subsets //
PHASE 2
Set up the collaboration level and optimize
// compute α[ii, jj] (Horizontal Clust.) or β[ii, jj] (Vertical Clust.) and
optimize the partition matrices //
Collaborative Clustering(8)
 “All for one and one for all”
A combination of Horizontal and Vertical clustering




                                         The Objective Function will be
                                         a combination of the objective
                                         functions from Horizontal and
                                         Vertical Clustering




Collaborative Clustering(9)
 “All for one and one for all”

Consensus Clustering

• Different objects – Same feature space – Lack of interaction
• Clustering the prototypes produced from each data set =
meta-clustering
• Different number of clusters C[1], C[2], …, C[p]
• Building meta-structure – A partition matrix in a higher level
• U at the higher level is formed on the basis of the
prototypes of the data sets

Collaborative Clustering(10)
 “All for one and one for all”
Examples
• Semantic Content Analysis : A Study in Proximity-Based
Collaborative Clustering “clustering semantic web documents
under the collaboration of semantic and data view”

• Clustering in the framework of
collaborative agents
“…a model of collaborative clustering
(horizontal and vertical) realized over a
collection of data sets in which a
computing agent carries out an individual
clustering process”
Directional Clustering
 “Direction except from relation”

X[1] and X[2] different data sets
• Our goal is to form a map
  between the information
  granules developed for these
  two data sets.
• Clustering the data set X[1] is
  the first step. Then cluster the
  data set X[2] under 2 criteria.
1) Reveal its granular structure 2) This structure can be reached
   through a logic mapping of granules from data set X[1]
Directional Clustering(2)
 “Direction except from relation”
Problem Formulation
X[1] data set: standard FCM objective function
X[2] data set: we need an objective function that serves the two
main objectives, relational and directional:

$$Q = \sum_{i=1}^{C[2]} \sum_{j=1}^{N} u_{ij}^{m}[2] \, \lVert X_j[2] - \mathrm{prot}_i[2] \rVert^{2} + \beta \sum_{i=1}^{C[2]} \sum_{j=1}^{N} \left( u_{ij}[2] - \varphi_i(U[1]) \right)^{2} \lVert X_j[2] - \mathrm{prot}_i[2] \rVert^{2}$$

Directional Clustering(2)
  “Direction except from relation”
Problem Formulation (cont…)


• The first term of Q equation is for revealing structure in X[2]
(relational).
• The second term captures the differences between U[2] and
the mapping φ(.) of the structure detected in X[1] (directional).
• The factor β is for keeping a balance between the relational
and directional facets of the optimization



Directional Clustering(3)
   “Direction except from relation”
Logic Transformations Between A and B information granules
 How do we formulate the mapping? Two approaches
 1. OR-Based Aggregation

 Bi = (A1 t wi1) s (A2 t wi2) s…s (AC[1] t wiC[1])
 t-norms and s-norms can be compared to the ∩ and ∪ operators, respectively

The most commonly used t-norm is min(); given the t-norm,
we can compute the s-norm via
                          a s b = 1 − (1 − a) t (1 − b)
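
A minimal sketch of the OR-based aggregation with the min() t-norm, where the s-norm is derived from it by the formula above (for min this works out to max). The function names and the membership and weight values are hypothetical.

```python
from functools import reduce

def t_norm(a, b):
    """Minimum t-norm."""
    return min(a, b)

def s_norm(a, b):
    """s-norm derived from the t-norm: a s b = 1 - (1 - a) t (1 - b)."""
    return 1.0 - t_norm(1.0 - a, 1.0 - b)

def or_aggregation(A, w_i):
    """B_i = (A_1 t w_i1) s (A_2 t w_i2) s ... s (A_C[1] t w_iC[1])."""
    return reduce(s_norm, (t_norm(a, w) for a, w in zip(A, w_i)))

A = [0.2, 0.7, 0.1]     # memberships of one pattern in the C[1] = 3 granules of X[1]
w_i = [0.9, 0.5, 0.3]   # hypothetical weights for one granule B_i of X[2]
print(or_aggregation(A, w_i))
```

Each weight caps how much the corresponding granule of X[1] can contribute, and the s-norm keeps the strongest of the capped contributions.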
Directional Clustering(4)
  “Direction except from relation”
Logic Transformations Between A and B information granules
 How do we formulate the mapping? Two approaches
 2. AND-Based Aggregation

 Bi = (A1 s wi1) t (A2 s wi2) t…t (AC[1] s wiC[1])
            Which approach is best to use?
            Empirically, OR-Based when C[1] > C[2] and
            AND-Based when C[1] < C[2]
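
A minimal sketch of the AND-based aggregation, assuming the min() t-norm and its dual s-norm max(). The function name and the numeric values are hypothetical.

```python
from functools import reduce

def and_aggregation(A, w_i, t_norm=min, s_norm=max):
    """B_i = (A_1 s w_i1) t (A_2 s w_i2) t ... t (A_C[1] s w_iC[1])."""
    return reduce(t_norm, (s_norm(a, w) for a, w in zip(A, w_i)))

A = [0.2, 0.7, 0.1]     # memberships of one pattern in the C[1] = 3 granules of X[1]
w_i = [0.9, 0.5, 0.3]   # hypothetical weights for one granule B_i of X[2]
print(and_aggregation(A, w_i))
```

Here a weight close to 1 makes the corresponding granule irrelevant to the result, because the t-norm keeps the smallest of the combined values.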


Directional Clustering(5)
   “Direction except from relation”

  Examples

• Directional fuzzy clustering and its application to fuzzy modelling
“presentation of the technique and its role in a two-phase fuzzy
identification scheme”




Fuzzy Relational Clustering
“Focusing on pairs of patterns”
                                        FROM
                        patterns with vector features
                                             TO
               relational patterns with degrees of dissimilarity

• N cities   distances between pairs of them : dij
Matrix of distances includes the relational patterns
• Compare faces in a pair-wise manner and compute
proximity degrees (relational patterns)

Fuzzy Relational Clustering(2)
 “Focusing on pairs of patterns”
FCM for relational data
The input of the algorithm is the dissimilarity matrix Rij, which
holds the degrees of dissimilarity between pairs of patterns instead
of the original patterns
Similarity Matrix Dij = 1 - Rij
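
A minimal sketch of how such relational input could be built for the cities example. The coordinates are made up, and dividing by the largest distance so that the dissimilarities fall in [0, 1] is an assumption (the relation Dij = 1 - Rij only makes sense for values in that range).

```python
import numpy as np

# Hypothetical city coordinates; only the pairwise distances are used afterwards.
cities = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
diff = cities[:, None, :] - cities[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))      # d_ij, shape [N x N]

R = dist / dist.max()    # dissimilarity scaled to [0, 1] (an assumption)
D = 1.0 - R              # similarity, D_ij = 1 - R_ij

print(R)
print(D)
```

Both matrices are symmetric; R has zeros on its diagonal and D has ones, since every pattern is identical to itself.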




Fuzzy Relational Clustering(3)
 “Focusing on pairs of patterns”
Examples
• Low-complexity fuzzy relational clustering algorithms for Web
mining
“new Fuzzy Relational Clustering techniques in Web Mining*:
(1)FCMdd (Fuzzy C Medoids) and (2)RFCMdd (Robust Fuzzy
C Medoids)
Comparison tests with standard RFCM”

*Web document clustering, snippet clustering and Web access log analysis

References
W. Pedrycz, “Knowledge-Based Clustering from Data to
Information Granules”

Fuzzy c-Means Clustering of Incomplete Data

FCM-Based Model Selection Algorithms for Determining the
Number of Clusters

Using CFCM to mine event-related brain dynamics

Partially Supervised Clustering for Image Segmentation
Semantic Content Analysis : A Study in Proximity-Based
Collaborative Clustering

Clustering in the framework of collaborative agents

Directional fuzzy clustering and its application to fuzzy modeling

Low-complexity fuzzy relational clustering algorithms for Web
mining



2.[9 17]comparative analysis between dct & dwt techniques of image compression2.[9 17]comparative analysis between dct & dwt techniques of image compression
2.[9 17]comparative analysis between dct & dwt techniques of image compression
 
2.[9 17]comparative analysis between dct & dwt techniques of image compression
2.[9 17]comparative analysis between dct & dwt techniques of image compression2.[9 17]comparative analysis between dct & dwt techniques of image compression
2.[9 17]comparative analysis between dct & dwt techniques of image compression
 
11ClusAdvanced.ppt
11ClusAdvanced.ppt11ClusAdvanced.ppt
11ClusAdvanced.ppt
 
Complex and Social Network Analysis in Python
Complex and Social Network Analysis in PythonComplex and Social Network Analysis in Python
Complex and Social Network Analysis in Python
 
Chapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptChapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.ppt
 
TunUp final presentation
TunUp final presentationTunUp final presentation
TunUp final presentation
 
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
 

Kürzlich hochgeladen

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 

Kürzlich hochgeladen (20)

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 

Knowledge Based Clustering

  • 1. Knowledge Based Clustering: An intelligent way to find groups in your data
  • 2. Contents: Knowledge-Based Clustering (KBC); Fuzzy Clustering and FCM; Conditional Fuzzy Clustering and CFCM; Clustering With Partial Supervision; Collaborative Clustering; Directional Clustering; Fuzzy Relational Clustering. Christos N. Zigkolis, Aristotle University of Thessaloniki
  • 3. Some reasonable questions… What type of clustering is KBC? “Partitional Clustering”. What are the differences from “conventional” clustering? “Data-Centric vs. Human-Centric”. What are the basic concepts of KBC? “Information Granules, Fuzzy Clustering, Objective Function-Based Techniques”.
  • 4. Data Clustering. Data-centric approaches: Hierarchical Clustering, HC (Agglomerative HC), and Partitional Clustering, PC, which divides into Hard Clustering (K-Means) and Soft Clustering, i.e. Fuzzy Clustering, FC (Fuzzy C-Means). Human-centric approaches: Knowledge Based Clustering.
  • 5. Objective Function-Based Clustering Techniques. min or max(obj_function) => better clustering. Our GOAL is: to formulate an objective function that is capable of reflecting the nature of the problem, so that its min() or max() reveals a meaningful structure in the dataset.
  • 6. Fuzzy Clustering “The Big Bang for KBC”. Binary character of partitions (membership is 0 or 1) vs. fuzzy logic: partial membership in [0, 1].
  • 7. Fuzzy Clustering (2) “The Big Bang for KBC”. K-Means + Fuzzy Logic = Fuzzy C-Means. “Yet another clustering procedure… What is so special about it?” Unlike K-Means, it can deal with patterns of borderline character. [prototypes, U] = fcm(X_data, C)
  • 9. Fuzzy Clustering (3) “The Big Bang for KBC”
    Input: X_data [N x p]; m: fuzzification coefficient, m > 1; C: number of clusters; an initialized U [C x N] matrix.
    Iterative process: 1. Compute the prototypes. 2. Compute the U matrix. 3. Compute the value of the objective function and stop the process if this value is lower than a criterion e.
    Output: prototypes [C x p]; U matrix [C x N].
  • 10. “Stop talking and show us the maths”
    (1) Prototypes: $prot_i = \frac{\sum_{j=1}^{N} u_{ij}^{m} X_j}{\sum_{j=1}^{N} u_{ij}^{m}}$
    (2) Memberships: $u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\|X_j - prot_i\|}{\|X_j - prot_k\|} \right)^{2/(m-1)}}$, under the restriction $\sum_{i=1}^{C} u_{ij} = 1$
    (3) Objective function and stopping test: $Q = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \|X_j - prot_i\|^2 < e$
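A minimal sketch of the FCM iteration in plain Python, assuming Euclidean distance, a random initial partition matrix, and a stopping rule that compares the change in Q between two iterations with e (the slides only say that the process stops once the objective value falls below a criterion). The name fcm mirrors the call on slide 7; sq_dist, max_iter and seed are illustrative additions, not part of the slides.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm(X, C, m=2.0, e=1e-6, max_iter=200, seed=0):
    """Fuzzy C-Means on N points (each a list of p floats).

    Returns (prototypes [C x p], U [C x N]).
    """
    rng = random.Random(seed)
    N, p = len(X), len(X[0])
    # Random initial partition matrix; each column is normalised to sum to 1.
    U = [[rng.random() + 1e-3 for _ in range(N)] for _ in range(C)]
    for j in range(N):
        s = sum(U[i][j] for i in range(C))
        for i in range(C):
            U[i][j] /= s
    prev_Q = float("inf")
    prot = []
    for _ in range(max_iter):
        # Formula (1): prototypes as membership-weighted means.
        prot = []
        for i in range(C):
            w = [U[i][j] ** m for j in range(N)]
            tot = sum(w)
            prot.append([sum(w[j] * X[j][d] for j in range(N)) / tot
                         for d in range(p)])
        # Formula (2): membership update; distances are floored to avoid 0/0.
        for j in range(N):
            d2 = [max(sq_dist(X[j], prot[i]), 1e-12) for i in range(C)]
            for i in range(C):
                U[i][j] = 1.0 / sum((d2[i] / d2[k]) ** (1.0 / (m - 1.0))
                                    for k in range(C))
        # Formula (3): objective value; stop when it no longer changes (assumed rule).
        Q = sum(U[i][j] ** m * sq_dist(X[j], prot[i])
                for i in range(C) for j in range(N))
        if abs(prev_Q - Q) < e:
            break
        prev_Q = Q
    return prot, U
```

Because the ratio in formula (2) is of distances, the code raises the ratio of squared distances to 1/(m-1), which is the same quantity.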
  • 11. Fuzzy Clustering (4) “The Big Bang for KBC”. Examples:
    • Fuzzy c-Means Clustering of Incomplete Data: “Modified versions of standard FCM are applied for dealing with data with missing feature values”
    • FCM-Based Model Selection Algorithms for Determining the Number of Clusters: “Determining the number of clusters in a given data set and a new validity index for measuring the “goodness” of clustering”
  • 12. Conditional Fuzzy Clustering “The presence of the aside information”. FROM UNSUPERVISED LEARNING TO SEMI-SUPERVISED LEARNING. We mark our patterns according to a condition; these marks are the aside information that guides the clustering process towards more meaningful results.
  • 13. Conditional Fuzzy Clustering “The presence of the aside information”. (1) Xdata [N x p] and the condition(s) give (2) Zk [1 x N] (patterns’ marks); (3) a scaling function turns these into (4) Fk [1 x N] (scaled patterns’ marks); (5) [prototypes, U] = CFCM(Xdata, Fk, C)
  • 14. Conditional Fuzzy Clustering(2) “The presence of the aside information”. Formulation differences from FCM:
    Restriction: $\sum_{i=1}^{C} u_{ij} = F_j$ => $u_{ij} = \frac{F_j}{\sum_{k=1}^{C} \left( \frac{\|X_j - prot_i\|}{\|X_j - prot_k\|} \right)^{2/(m-1)}}$
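The only change CFCM makes to the FCM membership update is that each column of U sums to the scaled mark F_j instead of 1. A hedged sketch in plain Python: the slides do not specify the scaling function, so the min-max scaling in scale_marks is an assumption, and the names scale_marks, cfcm_memberships and sq_dist are hypothetical.

```python
def scale_marks(Z):
    """Map raw marks Z_k to F_k in [0, 1] by min-max scaling (assumed scaling function)."""
    lo, hi = min(Z), max(Z)
    if hi == lo:
        return [1.0 for _ in Z]
    return [(z - lo) / (hi - lo) for z in Z]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cfcm_memberships(X, prototypes, F, m=2.0):
    """Membership update of conditional FCM: column j of U sums to F[j]."""
    C, N = len(prototypes), len(X)
    U = [[0.0] * N for _ in range(C)]
    for j in range(N):
        # Floor the squared distances so a pattern sitting on a prototype is safe.
        d2 = [max(sq_dist(X[j], prot), 1e-12) for prot in prototypes]
        for i in range(C):
            denom = sum((d2[i] / d2[k]) ** (1.0 / (m - 1.0)) for k in range(C))
            U[i][j] = F[j] / denom
    return U
```

A pattern with F_j = 0 receives zero membership in every cluster, so it has no influence on the prototypes; this is how the condition steers the result.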
  • 15. Conditional Fuzzy Clustering(3) “The presence of the aside information”. Example: “Using CFCM to mine event-related brain dynamics” by C.N. Zigkolis and N.A. Laskaris. “…a framework for mining event related dynamics based on Conditional FCM (CFCM). CFCM enables prototyping in a principled manner. User-defined constraints, which are imposed by the nature of experimental data and/or dictated by the neuroscientist’s intuition, direct the process of knowledge extraction and can robustify single-trial analysis…”
  • 16. Clustering with Partial Supervision “Label some, cluster all”. X = [X1, X2, ..., XN]. Labeled patterns: Y = [Y1, ..., YM]. Unlabeled patterns: Z = [Z1, ..., ZN-M]. X' = Y ∪ Z. After labeling some patterns we start the clustering process.
  • 17. Clustering with Partial Supervision(2) “Label some, cluster all”
    How is this labeling going to help us?
    • Labeling = Knowledge
    • This knowledge will guide the whole process
    • The labeled patterns can be considered a grid of anchor points through which we reach the entire structure of the data set
    What algorithmic changes do we need in order to include this partial supervision in the clustering process?
    • The knowledge has to be included in the objective function
    • The formulas for the prototypes and the U matrix take another form
  • 18. Clustering with Partial Supervision(3) “Label some, cluster all”. Problem Formulation. Extra structures:
    • b = [b1, b2, …, bN]: the vector of labels; bj = 1 if pattern Xj is labeled and bj = 0 otherwise.
    • F [C x N] = [fij]: a partition matrix which contains the membership values for the labeled patterns. The columns that correspond to unlabeled data have zero values.
    • α: a nonnegative weight factor for setting up a suitable balance between the supervised and unsupervised modes of learning.
  • 19. Clustering with Partial Supervision(4) “Label some, cluster all”. Problem Formulation (cont.)
    $Q = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^{m} \|X_j - prot_i\|^2 + \alpha \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij} - f_{ij})^2 \, b_j \, \|X_j - prot_i\|^2$
    The extra term is the augmentation we need. It addresses the effect of partial supervision.
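To make the role of the extra term concrete, the sketch below evaluates an objective of the form Q = Σ Σ u_ij^m ||X_j − prot_i||² + α Σ Σ (u_ij − f_ij)² b_j ||X_j − prot_i||². This is one reading of the slide's formula, with b_j acting as a multiplier that switches the supervision term on only for labeled patterns; the function names are hypothetical and the numbers are made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partial_supervision_objective(X, prototypes, U, F, b, alpha, m=2.0):
    """FCM objective plus a supervision penalty on the labeled patterns.

    U and F are C x N matrices; b[j] is 1 for a labeled pattern, 0 otherwise.
    """
    Q = 0.0
    for i, prot in enumerate(prototypes):
        for j, x in enumerate(X):
            d2 = sq_dist(x, prot)
            Q += U[i][j] ** m * d2                             # unsupervised part
            Q += alpha * (U[i][j] - F[i][j]) ** 2 * b[j] * d2  # supervised part
    return Q


# Two 1-D patterns, two clusters; only the first pattern is labeled (cluster 0).
X = [[0.0], [2.0]]
prototypes = [[0.0], [1.0]]
U = [[0.8, 0.3], [0.2, 0.7]]
F = [[1.0, 0.0], [0.0, 0.0]]
b = [1, 0]
q_unsupervised = partial_supervision_objective(X, prototypes, U, F, b, alpha=0.0)
q_supervised = partial_supervision_objective(X, prototypes, U, F, b, alpha=2.0)
```

With α = 0 the value is the plain FCM objective; raising α increases the cost of memberships that disagree with the labels.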
  • 20. Clustering with Partial Supervision(5) “Label some, cluster all”. Examples: • Handwritten Digits • Reliance? of a training set
  • 21. Clustering with Partial Supervision(6) “Label some, cluster all”. Real example: Partially Supervised Clustering for Image Segmentation. “This paper describes a new method (ssFCM) for classification. The method is well suited to problems such as the segmentation of Magnetic Resonance Images (MRI). A small set of labeled pixels provides a clustering algorithm with a form of partial supervision”
  • 22. Collaborative Clustering “All for one and one for all”
    What if we have to deal with several data sets and we are interested in revealing a global structure? “The concept of collaboration: we process each data set separately and collaborate by exchanging information about the individual results.”
    Why don’t we put everything in one data set and do our job? “The paradigm of different organizations with different databases. We don’t have access to the others’ sources, but we appreciate any external assisting information.”
  • 23. Collaborative Clustering(2) “All for one and one for all”. Horizontal Collaborative Clustering. X[1], X[2], .., X[p] data sets: the same objects in different feature spaces, e.g. the same patients in the databases of different institutes. The collaboration / communication takes place at the level of the individual partition matrices.
  • 24. Collaborative Clustering(3) “All for one and one for all”. Horizontal Collaborative Clustering
    • Matrix of connections: α[ii, jj] >= 0
    • The higher the value, the stronger the collaboration between the subsets
    • The matrix α is not necessarily symmetric: in general α[ii, jj] ≠ α[jj, ii]
  • 25. Collaborative Clustering(4) “All for one and one for all”. Horizontal Collaborative Clustering. Problem Formulation
    $Q[ii] = \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \|X_j[ii] - prot_i[ii]\|^2 + \sum_{jj=1, jj \neq ii}^{p} \alpha[ii, jj] \sum_{j=1}^{N} \sum_{i=1}^{C} \{u_{ij}[ii] - u_{ij}[jj]\}^{m} \, \|X_j[ii] - prot_i[ii]\|^2$
    The second term makes the clustering based on the ii-th subset “aware” of the other partitions. If the structures in the data sets are similar, then the differences between the U matrices tend to be lower and the resulting structures become more similar.
  • 26. Collaborative Clustering(5) “All for one and one for all”. Vertical Collaborative Clustering. X[1], X[2], .., X[p] different data sets: same feature space, different objects, e.g. auditory evoked responses under 3 conditions/data sets (attentive, stimulation, spontaneous activity). The collaboration / communication takes place at the level of the prototypes.
  • 27. Collaborative Clustering(6) “All for one and one for all”. Vertical Collaborative Clustering. Problem Formulation
    $Q[ii] = \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \|X_j[ii] - prot_i[ii]\|^2 + \sum_{jj=1, jj \neq ii}^{p} \beta[ii, jj] \sum_{j=1}^{N} \sum_{i=1}^{C} u_{ij}^{m}[ii] \, \|prot_i[ii] - prot_i[jj]\|^2$
    The second term articulates the differences between the prototypes.
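A small sketch of how the vertical objective could be evaluated for one data set, assuming Euclidean distance and assuming that cluster i in data set ii corresponds to cluster i in every other data set (as the shared index i in the formula suggests). The names vertical_objective, other_prototypes and beta are hypothetical; beta holds one weight β[ii, jj] per collaborating data set.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vertical_objective(X, prototypes, U, other_prototypes, beta, m=2.0):
    """Q[ii] for one data set under vertical collaboration.

    X, prototypes, U   : patterns, prototypes [C x p] and partition matrix [C x N]
                         of the data set ii being clustered
    other_prototypes   : list of prototype sets [C x p], one per other data set jj
    beta               : list of weights beta[ii, jj], aligned with other_prototypes
    """
    Q = 0.0
    for i, prot in enumerate(prototypes):
        for j, x in enumerate(X):
            w = U[i][j] ** m
            Q += w * sq_dist(x, prot)                   # standard FCM term
            for weight, other in zip(beta, other_prototypes):
                Q += weight * w * sq_dist(prot, other[i])  # prototype disagreement
    return Q


# One collaborating data set whose first prototype differs by 1.0.
X = [[0.0], [2.0]]
prototypes = [[0.0], [2.0]]
U = [[1.0, 0.0], [0.0, 1.0]]
others = [[[1.0], [2.0]]]
q_collab = vertical_objective(X, prototypes, U, others, beta=[0.5])
q_alone = vertical_objective(X, prototypes, U, others, beta=[0.0])
```

Setting every weight to zero recovers the FCM objective of the data set on its own, so β controls how strongly the other data sets pull the prototypes.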
  • 28. Collaborative Clustering(7) “All for one and one for all”. The 2 algorithmic phases of collaborative clustering
    PHASE 1: Apply FCM to each data set; the number of clusters has to be the same for all data sets. // compute prot_i[ii], i = 1,…,C and U[ii] for all subsets //
    PHASE 2: Set up the collaboration level and carry out the optimization. // compute α[ii, jj] (Horizontal Clust.) or β[ii, jj] (Vertical Clust.) and optimize the partition matrices //
  • 29. Collaborative Clustering(8) “All for one and one for all”. A combination of horizontal and vertical clustering: the objective function is a combination of the objective functions of horizontal and vertical clustering.
  • 30. Collaborative Clustering(9) “All for one and one for all”. Consensus Clustering
    • Different objects, same feature space, lack of interaction
    • Clustering the prototypes produced from each data set = meta-clustering
    • Different numbers of clusters C[1], C[2], …, C[p]
    • Building a meta-structure: a partition matrix at a higher level
    • U at the higher level is formed on the basis of the prototypes of the data sets
  • 31. Collaborative Clustering(10) “All for one and one for all”. Examples:
    • Semantic Content Analysis: A Study in Proximity-Based Collaborative Clustering. “clustering semantic web documents under the collaboration of semantic and data view”
    • Clustering in the framework of collaborative agents. “…a model of collaborative clustering (horizontal and vertical) realized over a collection of data sets in which a computing agent carries out an individual clustering process”
  • 32. Directional Clustering “Direction apart from relation”. X[1] and X[2] are different data sets.
    • Our goal is to form a map between the information granules developed for these two data sets.
    • Clustering the data set X[1] is the first step. Then cluster the data set X[2] under 2 criteria: 1) reveal its granular structure; 2) this structure can be reached through a logic mapping of the granules from data set X[1].
  • 33. Directional Clustering(2) “Direction apart from relation”. Problem Formulation
    X[1] data set: standard FCM objective function.
    X[2] data set: we need an obj_func that addresses the two main objectives, relational and directional:
    $Q = \sum_{i=1}^{C[2]} \sum_{j=1}^{N} u_{ij}^{m}[2] \, \|X_j[2] - prot_i[2]\|^2 + \beta \sum_{i=1}^{C[2]} \sum_{j=1}^{N} (u_{ij}[2] - \varphi_i(U[1]))^2 \, \|X_j[2] - prot_i[2]\|^2$
  • 34. Directional Clustering(2) “Direction apart from relation”. Problem Formulation (cont.)
    • The first term of the Q equation reveals the structure in X[2] (relational).
    • The second term captures the differences between U[2] and the mapping φ(.) of the structure detected in X[1] (directional).
    • The factor β keeps a balance between the relational and directional facets of the optimization.
  • 35. Directional Clustering(3) “Direction apart from relation”. Logic transformations between the A and B information granules. How we formulate THE mapping: TWO APPROACHES
    1. OR-Based Aggregation: Bi = (A1 t wi1) s (A2 t wi2) s … s (AC[1] t wiC[1])
    t- and s-norms can be compared to the ∩ and ∪ operators, respectively. The most commonly used t-norm is min(); given the t-norm, we can compute the s-norm via a s b = 1 − (1 − a) t (1 − b)
  • 36. Directional Clustering(4) “Direction apart from relation”. Logic transformations between the A and B information granules. How we formulate THE mapping: TWO APPROACHES
    2. AND-Based Aggregation: Bi = (A1 s wi1) t (A2 s wi2) t … t (AC[1] s wiC[1])
    Which approach is best to use? Empirically, OR-based when C[1] > C[2] and AND-based when C[1] < C[2].
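The two aggregations can be sketched in a few lines of Python, taking min() as the t-norm and deriving the s-norm from it through a s b = 1 − (1 − a) t (1 − b), which for min() gives max(). The function names are hypothetical, and the membership values and weights in the example are made up.

```python
from functools import reduce


def t_norm(a, b):
    """t-norm (fuzzy AND); min() is the choice named on the slide."""
    return min(a, b)


def s_norm(a, b):
    """s-norm (fuzzy OR) derived from the t-norm by duality."""
    return 1.0 - t_norm(1.0 - a, 1.0 - b)


def or_aggregate(A, w):
    """OR-based: B_i = (A_1 t w_i1) s (A_2 t w_i2) s ... s (A_C t w_iC)."""
    return reduce(s_norm, (t_norm(a, wk) for a, wk in zip(A, w)))


def and_aggregate(A, w):
    """AND-based: B_i = (A_1 s w_i1) t (A_2 s w_i2) t ... t (A_C s w_iC)."""
    return reduce(t_norm, (s_norm(a, wk) for a, wk in zip(A, w)))


# Memberships of one pattern in two granules of X[1], and one row of weights each.
A = [0.2, 0.9]
b_or = or_aggregate(A, [1.0, 0.5])    # max(min(0.2, 1.0), min(0.9, 0.5))
b_and = and_aggregate(A, [0.0, 0.5])  # min(max(0.2, 0.0), max(0.9, 0.5))
```

In the OR-based form a weight of 0 removes a granule from the mapping, while in the AND-based form a weight of 1 does; this follows from 0 and 1 being the neutral elements of max and min respectively.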
  • 37. Directional Clustering(5) “Direction apart from relation”. Example: Directional fuzzy clustering and its application to fuzzy modelling. “presentation of the technique and its role in a two-phase fuzzy identification scheme”
  • 38. Fuzzy Relational Clustering “Focusing on pairs of patterns”. FROM patterns with vector features TO relational patterns with degrees of dissimilarity
    • N cities and the distances between pairs of them, dij: the matrix of distances contains the relational patterns
    • Compare faces in a pair-wise manner and compute proximity degrees (relational patterns)
  • 39. Fuzzy Relational Clustering(2) “Focusing on pairs of patterns”. FCM for relational data: the input of the algorithm is the dissimilarity matrix Rij, which holds the degrees of dissimilarity between all pairs of patterns, instead of the original patterns. Similarity matrix: Dij = 1 − Rij
  • 40. Fuzzy Relational Clustering(3) “Focusing on pairs of patterns”. Example: Low-complexity fuzzy relational clustering algorithms for Web mining. “new Fuzzy Relational Clustering techniques in Web Mining*: (1) FCMdd (Fuzzy C Medoids) and (2) RFCMdd (Robust Fuzzy C Medoids). Comparison tests with standard RFCM” *Web document clustering, snippet clustering and Web access log analysis
  • 41. References
    • W. Pedrycz, “Knowledge-Based Clustering: From Data to Information Granules”
    • Fuzzy c-Means Clustering of Incomplete Data
    • FCM-Based Model Selection Algorithms for Determining the Number of Clusters
    • Using CFCM to mine event-related brain dynamics
    • Partially Supervised Clustering for Image Segmentation
  • 42. References (cont.)
    • Semantic Content Analysis: A Study in Proximity-Based Collaborative Clustering
    • Clustering in the framework of collaborative agents
    • Directional fuzzy clustering and its application to fuzzy modelling
    • Low-complexity fuzzy relational clustering algorithms for Web mining