Multi-Objective Optimization for Clustering of
Medical Publications
Asif Ekbal¹, Sriparna Saha¹, Diego Mollá², K Ravikumar¹

¹ Indian Institute of Technology
Patna, Bihar, India

² Centre for Language Technology
Macquarie University
Sydney, Australia

ALTA 2013, Brisbane, Australia
Clustering for Evidence Based Medicine

Clustering as a MOO Problem

AMOSA-clus

Results

Contents

Clustering for Evidence Based Medicine
Clustering as a MOO Problem
AMOSA-clus
Results

MOO for Medical Clustering

Asif Ekbal, Sriparna Saha, Diego Mollá, K Ravikumar

Evidence Based Medicine

http://laikaspoetnik.wordpress.com/2009/04/04/evidence-based-medicine-the-facebook-of-medicine/

The Dream

The Bottom-line Answer

A Means of Getting There
Input
QUESTION:
Which treatments work best for hemorrhoids?
DOCUMENTS:
[11289288] [12972967] [1442682] [15486746]
[16235372] [16252313] [17054255] [17380367]

clustering, summarisation ⇒

Output
1. Excision is the most effective treatment for thrombosed external
hemorrhoids. [11289288] [12972967] [15486746]
2. For prolapsed internal hemorrhoids, the best definitive treatment is
traditional hemorrhoidectomy. [17054255] [17380367]
3. Of nonoperative techniques, rubber band ligation produces the
lowest rate of recurrence. [1442682] [16252313] [16235372]

This Work

Each question is formulated as an independent clustering task.

Input
QUESTION:
Which treatments work best for hemorrhoids?
DOCUMENTS:
[11289288] [12972967] [1442682] [15486746]
[16235372] [16252313] [17054255] [17380367]

clustering ⇒

Output
1. [11289288] [12972967] [15486746]
2. [17054255] [17380367]
3. [1442682] [16252313] [16235372]

Related Work

Uses of Document Clustering
Web search
Topic detection and tracking
Training data expansion
Multi-document summarisation

Clustering in EBM
Cluster search results
Cluster based on interventions
Shash & Molla (2013): k-means clustering on our data set

Clustering and Multi-Objective Optimization
Most existing clustering techniques are based on a single
criterion of goodness.
Several criteria of goodness have been proposed.
So why not try several criteria at once?

Internal Validity
BIC-index
CH-index
Silhouette-index
DB-index
...

External Validity
Minkowski scores
F-measures
...

Information in Internal Validity Indices

Compactness
Measures the distances among the elements of a cluster.
We want clusters with short distances between their elements.

Separability
Measures the distance between clusters.
We want relatively large distances between clusters.

I-Index (Maulik & Bandyopadhyay, 2002)

$$I(K) = \left( \frac{1}{K} \times \frac{E_1}{E_K} \times D_K \right)^p$$

$K$ = number of clusters
$E_K = \sum_{k=1}^{K} \sum_{j=1}^{n_k} d_e(c_k, x_j^k)$
$D_K = \max_{i,j=1}^{K} d_e(c_i, c_j)$
$c_j$ = centroid of the jth cluster
$x_j^k$ = jth point of the kth cluster
$n_k$ = total number of points present in the kth cluster

$E_1 / E_K$ increases I as the clusters become more compact.
$D_K$ increases I as the separation between clusters increases.
(p is a parameter set to 2 in this paper)
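
The I-index can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and E_1 is taken to be E_K computed with all points in a single cluster, which is how Maulik & Bandyopadhyay define it but is not spelled out on the slide.

```python
import math


def euclidean(a, b):
    """Euclidean distance d_e between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def i_index(clusters, p=2):
    """I-index of a partition given as a list of non-empty lists of points."""
    k = len(clusters)
    centroids = [centroid(pts) for pts in clusters]
    # E_K: summed distance of every point to the centroid of its own cluster.
    e_k = sum(euclidean(c, x) for c, pts in zip(centroids, clusters) for x in pts)
    # E_1 (assumption): the same sum with all points in one single cluster.
    all_points = [x for pts in clusters for x in pts]
    c_all = centroid(all_points)
    e_1 = sum(euclidean(c_all, x) for x in all_points)
    # D_K: largest distance between any two cluster centroids.
    d_k = max(euclidean(ci, cj) for ci in centroids for cj in centroids)
    return ((1 / k) * (e_1 / e_k) * d_k) ** p


# Hypothetical toy data: the same four points grouped in two different ways.
tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
mixed = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 1.0), (10.0, 1.0)]]
print(i_index(tight), i_index(mixed))
```

The compact, well-separated grouping scores higher, so the index is one to maximize. The sketch divides by E_K, so it fails if every point coincides with its centroid.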

XB-Index (Xie & Beni, 1991)

$$XB(K) = \frac{\sum_{i=1}^{K} \sum_{j=1}^{n} u_{ij}^2 \, \lVert x_j - c_i \rVert^2}{n \left( \min_{i \neq k} \lVert c_i - c_k \rVert^2 \right)}$$

$K$ = number of clusters
$c_j$ = centroid of the jth cluster
$x_j^k$ = jth point of the kth cluster
$n$ = total number of points present in the dataset
$[u_{ij}]_{K \times n}$ = cluster membership matrix

The numerator quantifies the compactness of the clusters.
The denominator quantifies the separation between clusters.
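
A minimal Python sketch of the XB-index follows. The function names and the toy data are hypothetical, and the membership matrix is assumed to be crisp (each u_ij is 0 or 1), which the slide does not state.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def xb_index(points, centroids, membership):
    """XB-index; membership[i][j] is u_ij for cluster i and point j."""
    n = len(points)
    k = len(centroids)
    # Numerator: membership-weighted squared distances to the centroids.
    compactness = sum(
        membership[i][j] ** 2 * sq_dist(points[j], centroids[i])
        for i in range(k)
        for j in range(n)
    )
    # Denominator: n times the smallest squared distance between two centroids.
    separation = min(
        sq_dist(centroids[i], centroids[m])
        for i in range(k)
        for m in range(k)
        if i != m
    )
    return compactness / (n * separation)


# Hypothetical toy data: two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]
membership = [[1, 1, 0, 0], [0, 0, 1, 1]]
print(xb_index(points, centroids, membership))
```

Lower values indicate compact, well-separated clusters, so this index is one to minimize. The sketch needs at least two distinct centroids, otherwise the separation term is undefined or zero.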

MOO: The Pareto Optimal Front
[Figure: five candidate solutions, labelled 1 to 5, plotted against f1 (maximize) and f2 (minimize).]
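
The idea of a Pareto optimal front can be sketched as a dominance test over (f1, f2) pairs, with f1 maximized and f2 minimized as in the figure. The function names and the five sample values are made up for illustration and are not the points plotted on the slide.

```python
def dominates(a, b):
    """True if solution a dominates b; each is (f1, f2), f1 maximized, f2 minimized."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical objective values for five candidate solutions.
candidates = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0), (2.5, 4.0), (4.0, 5.0)]
front = pareto_front(candidates)
print(front)
```

Every solution on the front is a different trade-off between the two objectives, which is why a separate selection step is needed afterwards.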

String Representation
AMOSA-clus implements simulated annealing (SA).
Centroid-based real-encoding:
Each member of the archive is encoded as a string that
represents the centroids of the partitions.
Each centroid is indivisible.

Given a fixed maximum number of clusters K_max, the initial
number of clusters and their centroids are determined
randomly.

< 12.3 1.4 22.1 0.01 0.0 15.3 10.2 7.5 >
Represents four cluster centroids:
(12.3, 1.4), (22.1, 0.01), (0.0, 15.3), (10.2, 7.5)
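
The encoding can be sketched as a flat list of numbers that is cut into centroids of a fixed dimension. All names here are hypothetical. The random initialisation is an assumption: it draws at least two clusters and uses randomly chosen data points as centroids, neither of which is stated on the slide.

```python
import random


def decode(string, dim):
    """Split a flat string of numbers into centroids of dimension dim."""
    if len(string) % dim != 0:
        raise ValueError("string length must be a multiple of dim")
    return [tuple(string[i:i + dim]) for i in range(0, len(string), dim)]


def encode(centroids):
    """Flatten a list of centroids back into a single string of numbers."""
    return [value for c in centroids for value in c]


def random_string(points, k_max, rng):
    """Random initial string (assumption: 2 <= K <= k_max, centroids are data points)."""
    k = rng.randint(2, min(k_max, len(points)))
    return encode(rng.sample(points, k))


example = [12.3, 1.4, 22.1, 0.01, 0.0, 15.3, 10.2, 7.5]
print(decode(example, dim=2))
```

Because each centroid is indivisible, any operator that changes the string has to work on whole groups of `dim` values.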

Assignment of Points to the Clusters
Assignment of points and update of cluster centroids resemble an
iteration of the K-means clustering algorithm.
1. A point j is assigned to the cluster k whose centroid has the
minimum distance to j:
$$k = \arg\min_{i=1,\ldots,K} \, d(x_j, c_i) \qquad (1)$$
2. After all points are assigned to a cluster, the cluster centroids
are updated:
$$c_i = \frac{\sum_{j=1}^{n_i} x_j^i}{n_i}, \quad 1 \le i \le K \qquad (2)$$
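
The two steps can be sketched as a single function. The name `assign_and_update` is hypothetical, the distance is assumed to be Euclidean, and leaving the centroid of an empty cluster unchanged is a choice made for this sketch only.

```python
import math


def assign_and_update(points, centroids):
    """One assignment pass (equation 1) followed by a centroid update (equation 2)."""
    # Step 1: each point goes to the cluster with the nearest centroid.
    labels = [
        min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))
        for x in points
    ]
    # Step 2: each centroid becomes the mean of the points assigned to it.
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [x for x, lab in zip(points, labels) if lab == i]
        if members:
            new_centroids.append(
                tuple(sum(coord) / len(members) for coord in zip(*members))
            )
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(tuple(old))
    return labels, new_centroids


# Hypothetical toy data.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels, updated = assign_and_update(points, [(1.0, 1.0), (9.0, 1.0)])
print(labels, updated)
```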

Search Operators
Mutation 1 Perturb the centroid of a random cluster using a
Laplacian distribution:
$$p(\epsilon) \propto e^{-\frac{|\epsilon - \mu|}{\delta}}$$
Mutation 2 Delete a random cluster centroid.
Mutation 3 Add a new cluster centroid.

< 3.5 1.5 2.1 4.9 1.6 1.2 >
1. If we choose centroid 2, then update centroid (2.1, 4.9). The
new string is: < 3.5 1.5 1.2 3.6 1.6 1.2 >
2. If we choose centroid 3, the new string will be:
< 3.5 1.5 2.1 4.9 >.
3. New string: < 3.5 1.5 2.1 4.9 1.6 1.2 9.7 2.5 >
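
The three mutations can be sketched on the flat string encoding as follows. The function names are hypothetical. Two details are assumptions: the Laplacian is centred on the current coordinate value with scale `delta`, and the new centroid for Mutation 3 is supplied by the caller, since the slide does not say how it is generated.

```python
import random


def laplace(rng, mu, delta):
    """Draw from a Laplacian with location mu and scale delta.

    The difference of two independent Exp(1) draws follows Laplace(0, 1).
    """
    return mu + delta * (rng.expovariate(1.0) - rng.expovariate(1.0))


def perturb_centroid(string, dim, rng, delta=1.0):
    """Mutation 1: perturb every coordinate of one randomly chosen centroid."""
    c = rng.randrange(len(string) // dim)
    out = list(string)
    for i in range(c * dim, (c + 1) * dim):
        out[i] = laplace(rng, out[i], delta)
    return out


def delete_centroid(string, dim, rng):
    """Mutation 2: remove one randomly chosen centroid from the string."""
    c = rng.randrange(len(string) // dim)
    return list(string[:c * dim]) + list(string[(c + 1) * dim:])


def add_centroid(string, new_centroid):
    """Mutation 3: append a new centroid to the string."""
    return list(string) + list(new_centroid)


rng = random.Random(0)
original = [3.5, 1.5, 2.1, 4.9, 1.6, 1.2]
print(perturb_centroid(original, 2, rng))
print(delete_centroid(original, 2, rng))
print(add_centroid(original, (9.7, 2.5)))
```

Each operator returns a new list, so the current solution is left untouched if the annealing step rejects the mutated one. A fuller version would also keep the number of centroids between its lower bound and K_max.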

Selecting a Solution
The algorithm produces a set of alternative solutions.
Each solution is optimal according to some criteria.

Unsupervised Setting
Choose one solution randomly.

Semi-supervised Setting
Select the solution with best entropy in known assignments.
Each question has a portion of known clustering assignments.

[Figure: solutions 1 to 5 plotted against f1 (maximize) and f2 (minimize).]
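
The semi-supervised choice can be sketched as picking, from the set of alternative solutions, the one with the lowest entropy on the documents whose assignment is known. The exact entropy formula is not given on the slides, so the size-weighted average of per-cluster label entropy used here is an assumption, as are all function names and the toy data.

```python
import math
from collections import Counter


def entropy(predicted, known):
    """Size-weighted label entropy over the documents with known labels.

    predicted: cluster id for every document, indexed by position.
    known: mapping from document index to its known (gold) cluster label.
    """
    by_cluster = {}
    for doc, gold in known.items():
        by_cluster.setdefault(predicted[doc], []).append(gold)
    total = len(known)
    result = 0.0
    for golds in by_cluster.values():
        counts = Counter(golds)
        h = -sum((c / len(golds)) * math.log2(c / len(golds)) for c in counts.values())
        result += len(golds) / total * h
    return result


def select_solution(solutions, known):
    """Pick the solution whose known assignments have the lowest entropy."""
    return min(solutions, key=lambda labels: entropy(labels, known))


# Hypothetical example: five documents, three of them with known labels.
known = {0: "a", 1: "a", 2: "b"}
solution_1 = [0, 0, 1, 1, 0]
solution_2 = [0, 1, 0, 1, 0]
print(select_solution([solution_1, solution_2], known))
```

In the unsupervised setting the same archive would instead be sampled with something like `random.choice`.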

Data
Clinical Inquiries from the Journal of Family Practice.
276 clinical questions (276 clustering tasks).
Each question has an average of 5.89 documents.

Which treatments work best for hemorrhoids?
1. Excision is the most effective treatment for thrombosed external
hemorrhoids. [11289288] [12972967] [15486746]
2. For prolapsed internal hemorrhoids, the best definitive treatment is
traditional hemorrhoidectomy. [17054255] [17380367]
3. Of nonoperative techniques, rubber band ligation produces the
lowest rate of recurrence. [1442682] [16252313] [16235372]

Results

Distance     AMOSA-clus1         AMOSA-clus2         K-means
Measure      best     average    best     average    (baseline)
Euclidean    0.190    0.249      0.177    0.235      0.240
Cosine       0.187    0.231      0.177    0.230      0.237

Unsupervised: Average solution is slightly better than baseline
(differences statistically significant).
Semi-supervised: Best solution is clearly better than baseline
(differences statistically significant).

Finding the Number of Clusters
AMOSA-clus1: Number of clusters as given by the original data.
Average 2.38 clusters.
AMOSA-clus2: Try several numbers of clusters and select the
solution that optimises I-index and XB-index.
Euclidean distance: Average 2.34 clusters.
Cosine distance: Average 2.51 clusters.

$$\text{error} = \frac{\sum_i (\text{target}_i - \text{predicted}_i)^2}{\text{\# of questions}}$$

Method                   Error
AMOSA-clus2 Cosine       1.90
AMOSA-clus2 Euclidean    1.91
k=1                      3.91
k=2                      2.14
k=3                      2.38
k=4                      4.61
Rule of Thumb            2.56
Cover                    1.98
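
The error measure is a mean squared difference between the target and predicted number of clusters, taken over all questions. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def cluster_count_error(targets, predictions):
    """Mean squared difference between target and predicted cluster counts."""
    if len(targets) != len(predictions):
        raise ValueError("need one prediction per question")
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return sum(squared) / len(targets)


# Hypothetical data: four questions, with a fixed guess of k=2 for each.
targets = [2, 3, 2, 4]
fixed_k2 = [2, 2, 2, 2]
print(cluster_count_error(targets, fixed_k2))
```

The fixed-k rows in the table correspond to calling this with the same prediction for every question.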

Conclusions
Unsupervised setting: slight improvement over k-means baseline.
Semi-supervised setting: clear improvement over k-means baseline.
Number of clusters: better than standard methods.

Further Work
Test on other domains.
Test using other cluster validity indices.
Compare with other semi-supervised methods.

Questions?
Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific S...
 
Graph-based Question Answering
Graph-based Question AnsweringGraph-based Question Answering
Graph-based Question Answering
 

Kürzlich hochgeladen

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Kürzlich hochgeladen (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Multi-Objective Optimization for Clustering of Medical Publications

  • 1. Multi-Objective Optimization for Clustering of Medical Publications. Asif Ekbal (1), Sriparna Saha (1), Diego Mollá (2), K Ravikumar (1). (1) Indian Institute of Technology, Patna, Bihar, India. (2) Centre for Language Technology, Macquarie University, Sydney, Australia. ALTA 2013, Brisbane, Australia.
  • 2. Contents: Clustering for Evidence Based Medicine; Clustering as a MOO Problem; AMOSA-clus; Results. (Running title: MOO for Medical Clustering. Asif Ekbal, Sriparna Saha, Diego Mollá, K Ravikumar.)
  • 3. Section: Clustering for Evidence Based Medicine
  • 4. Evidence Based Medicine. Source: http://laikaspoetnik.wordpress.com/2009/04/04/evidence-based-medicine-the-facebook-of-medicine/
  • 5. The Dream
  • 6. The Bottom-line Answer
  • 7. A Means of Getting There
    Input:
      QUESTION: Which treatments work best for hemorrhoids?
      DOCUMENTS: [11289288] [12972967] [1442682] [15486746] [16235372] [16252313] [17054255] [17380367]
    Clustering and summarisation ⇒
    Output:
      1. Excision is the most effective treatment for thrombosed external hemorrhoids. [11289288] [12972967] [15486746]
      2. For prolapsed internal hemorrhoids, the best definitive treatment is traditional hemorrhoidectomy. [17054255] [17380367]
      3. Of nonoperative techniques, rubber band ligation produces the lowest rate of recurrence. [1442682] [16252313] [16235372]
  • 8. This Work. Each question is formulated as an independent clustering task.
    Input:
      QUESTION: Which treatments work best for hemorrhoids?
      DOCUMENTS: [11289288] [12972967] [1442682] [15486746] [16235372] [16252313] [17054255] [17380367]
    Clustering ⇒
    Output:
      1. [11289288] [12972967] [15486746]
      2. [17054255] [17380367]
      3. [1442682] [16252313] [16235372]
  • 9. Related Work
    Uses of Document Clustering: web search; topic detection and tracking; training data expansion; multi-document summarisation.
    Clustering in EBM: cluster search results; cluster based on interventions; Shash & Molla (2013): k-means clustering on our data set.
  • 10. Section: Clustering as a MOO Problem
  • 11. Clustering and Multi-Objective Optimization. Most existing clustering techniques are based on a single criterion of goodness. Several criteria of goodness have been proposed. So why not try several criteria at once?
    Internal Validity: BIC-index, CH-index, Silhouette-index, DB-index, ...
    External Validity: Minkowski scores, F-measures, ...
  • 12. Information in Internal Validity Indices
    Compactness: measures the distance among the elements of a cluster. We want clusters with short distances between their elements.
    Separability: measures the distance between clusters. We want relatively large distances between clusters.
  • 13. I-Index (Maulik & Bandyopadhyay, 2002)
    $I(K) = \left( \frac{1}{K} \times \frac{E_1}{E_K} \times D_K \right)^p$
    where:
      $K$ = number of clusters
      $E_K = \sum_{k=1}^{K} \sum_{j=1}^{n_k} d_e(c_k, x_j^k)$
      $D_K = \max_{i,j=1}^{K} d_e(c_i, c_j)$
      $c_j$ = centroid of the $j$th cluster
      $x_j^k$ = $j$th point of the $k$th cluster
      $n_k$ = total number of points present in the $k$th cluster
    ($E_1$ is $E_K$ evaluated for a single cluster, $K = 1$.)
    $E_1 / E_K$ increases $I$ as the clusters become more compact. $D_K$ increases $I$ as the separation between clusters increases. ($p$ is a parameter set to 2 in this paper.)
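The I-index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `i_index` and the list-of-lists input format are hypothetical, $d_e$ is taken to be the Euclidean distance, and every cluster is assumed to be non-empty with at least one point away from its centroid (so that $E_K > 0$).

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def i_index(clusters, p=2):
    """I(K) = (1/K * E_1/E_K * D_K)^p for a list of clusters (lists of points)."""
    k = len(clusters)
    centroids = [centroid(pts) for pts in clusters]
    # E_K: distance of every point to the centroid of its own cluster.
    e_k = sum(math.dist(c, x) for c, pts in zip(centroids, clusters) for x in pts)
    # E_1: the same quantity when all points form a single cluster.
    all_points = [x for pts in clusters for x in pts]
    c_all = centroid(all_points)
    e_1 = sum(math.dist(c_all, x) for x in all_points)
    # D_K: largest distance between any two centroids.
    d_k = max(math.dist(ci, cj) for ci in centroids for cj in centroids)
    return ((1 / k) * (e_1 / e_k) * d_k) ** p

good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]   # compact, well separated
bad = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]    # stretched, close together
print(i_index(good), i_index(bad))
```

On this toy data the compact, well-separated partition receives a much larger I-index than the stretched one, which is the behaviour the slide describes.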
  • 14. XB-Index (Xie & Beni, 1991)
    $XB(K) = \frac{\sum_{i=1}^{K} \sum_{j=1}^{n} u_{ij}^2 \, \lVert x_j - c_i \rVert^2}{n \left( \min_{i \neq k} \lVert c_i - c_k \rVert^2 \right)}$
    where:
      $K$ = number of clusters
      $c_j$ = centroid of the $j$th cluster
      $x_j^k$ = $j$th point of the $k$th cluster
      $n$ = total number of points present in the dataset
      $[u_{ij}]_{K \times n}$ = cluster membership matrix
    The numerator quantifies the compactness of the clusters. The denominator quantifies the separation between clusters.
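A minimal Python sketch of the XB-index follows. The function name `xb_index` and its argument layout are hypothetical, squared Euclidean distances are assumed for both numerator and denominator, and the example uses crisp (0/1) memberships, which is only one possible choice for the membership matrix.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def xb_index(points, centroids, membership):
    """XB(K); membership[i][j] is u_ij, the membership of point j in cluster i."""
    n = len(points)
    # Numerator: membership-weighted squared distances (compactness).
    compactness = sum(
        membership[i][j] ** 2 * sq_dist(points[j], centroids[i])
        for i in range(len(centroids))
        for j in range(n)
    )
    # Denominator: n times the smallest squared distance between two centroids.
    separation = min(
        sq_dist(ci, ck)
        for i, ci in enumerate(centroids)
        for k, ck in enumerate(centroids)
        if i != k
    )
    return compactness / (n * separation)

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centroids = [(0, 1), (10, 1)]
membership = [[1, 1, 0, 0], [0, 0, 1, 1]]  # crisp assignment (assumption)
print(xb_index(points, centroids, membership))
```

Lower values indicate compact clusters whose centroids are far apart, so this index is minimised.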
  • 15. MOO: The Pareto Optimal Front. [Figure: five candidate solutions, labelled 1 to 5, plotted against the objectives f1 (maximize) and f2 (minimize).]
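The idea of a Pareto optimal front can be illustrated with a short Python sketch. The objective values below are invented for illustration and are not taken from the slide's figure; the function names are hypothetical. As on the slide's axes, f1 is maximised and f2 is minimised.

```python
def dominates(a, b):
    """True if solution a = (f1, f2) dominates b: f1 is maximised, f2 minimised."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]

# Hypothetical (f1, f2) values for five candidate solutions.
candidates = [(1.0, 1.0), (2.0, 2.0), (3.0, 4.0), (1.5, 3.0), (2.5, 5.0)]
print(pareto_front(candidates))
```

Here (1.5, 3.0) is dominated by (2.0, 2.0) and (2.5, 5.0) is dominated by (3.0, 4.0), so only the remaining three solutions form the front; none of them is better than another on both objectives at once.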
  • 16. Section: AMOSA-clus
  • 17. String Representation. AMOSA-clus implements simulated annealing (SA). Centroid-based real encoding: each member of the archive is encoded as a string that represents the centroids of the partitions. Each centroid is indivisible. Given a fixed maximum number of clusters $K_{max}$, the initial number of clusters and their centroids are determined randomly.
    Example: < 12.3 1.4 22.1 0.01 0.0 15.3 10.2 7.5 > represents four cluster centroids: (12.3, 1.4), (22.1, 0.01), (0.0, 15.3), (10.2, 7.5).
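The centroid-based encoding can be decoded with a few lines of Python. This is a sketch only: the helper name `decode_centroids` is hypothetical, and the dimensionality of the data (2 in the slide's example) is assumed to be known in advance.

```python
def decode_centroids(string, dim):
    """Split a flat real-valued string into centroids of `dim` coordinates each."""
    if dim <= 0 or len(string) % dim != 0:
        raise ValueError("string length must be a positive multiple of dim")
    # Each centroid is indivisible, so the string is cut only at multiples of dim.
    return [tuple(string[i:i + dim]) for i in range(0, len(string), dim)]

encoded = [12.3, 1.4, 22.1, 0.01, 0.0, 15.3, 10.2, 7.5]
print(decode_centroids(encoded, dim=2))
```

Because centroids are kept whole, the number of clusters is simply the string length divided by the dimensionality, so strings of different lengths encode different numbers of clusters.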
  • 18. Assignment of Points to the Clusters. Assignment of points and update of cluster centroids resembles an iteration of the K-means clustering algorithm.
    1. A point $j$ is assigned to the cluster $k$ whose centroid has the minimum distance to $j$:
       $k = \operatorname{argmin}_{i=1,\ldots,K} d(x_j, c_i)$   (1)
    2. After all points are assigned to a cluster, the cluster centroids are updated:
       $c_i = \frac{\sum_{j=1}^{n_i} x_j^i}{n_i}, \quad 1 \le i \le K$   (2)
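The two steps above can be sketched as a single Python function. The name `assign_and_update` is hypothetical, Euclidean distance is assumed for $d$, and leaving the centroid of an empty cluster unchanged is an assumption of this sketch, not something the slide specifies.

```python
import math

def assign_and_update(points, centroids):
    """One K-means-like iteration: assign points (Eq. 1), then recompute centroids (Eq. 2)."""
    # Step 1: each point goes to the cluster with the nearest centroid.
    labels = [
        min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))
        for x in points
    ]
    # Step 2: each centroid becomes the mean of the points assigned to it.
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [x for x, label in zip(points, labels) if label == i]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centroids.append(tuple(old))  # assumption: keep an empty cluster's centroid
    return labels, new_centroids

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels, updated = assign_and_update(points, [(1, 1), (8, 1)])
print(labels, updated)
```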
  • 19. Search Operators
    Mutation 1: Perturb the centroid of a random cluster using a Laplacian distribution: $p(\epsilon) \propto e^{-\frac{|\epsilon - \mu|}{\delta}}$
    Mutation 2: Delete a random cluster centroid.
    Mutation 3: Add a new cluster centroid.
    Example string: < 3.5 1.5 2.1 4.9 1.6 1.2 >
    1. If we choose centroid 2, then update centroid (2.1, 4.9). The new string is: < 3.5 1.5 1.2 3.6 1.6 1.2 >
    2. If we choose centroid 3, the new string will be: < 3.5 1.5 2.1 4.9 >
    3. New string: < 3.5 1.5 2.1 4.9 1.6 1.2 9.7 2.5 >
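The three mutation operators can be sketched in Python as follows. All function names are hypothetical. The sketch assumes that $\mu$ is the current coordinate value and $\delta$ the scale of the perturbation, which the slide does not state explicitly, and it samples the Laplacian as the difference of two exponential variates.

```python
import random

def laplace_sample(mu, delta, rng):
    """Draw from a Laplacian centred at mu with scale delta.

    The difference of two independent exponential variates with mean delta
    follows a zero-centred Laplacian with scale delta.
    """
    return mu + rng.expovariate(1 / delta) - rng.expovariate(1 / delta)

def perturb_centroid(centroids, index, delta, rng):
    """Mutation 1: perturb every coordinate of one centroid."""
    mutated = list(centroids)
    mutated[index] = tuple(laplace_sample(v, delta, rng) for v in centroids[index])
    return mutated

def delete_centroid(centroids, index):
    """Mutation 2: remove one centroid."""
    return [c for i, c in enumerate(centroids) if i != index]

def add_centroid(centroids, new_centroid):
    """Mutation 3: append a new centroid."""
    return list(centroids) + [tuple(new_centroid)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
centroids = [(3.5, 1.5), (2.1, 4.9), (1.6, 1.2)]
print(perturb_centroid(centroids, 1, delta=0.5, rng=rng))
print(delete_centroid(centroids, 2))
print(add_centroid(centroids, (9.7, 2.5)))
```

Each operator returns a new list, so the current solution stays intact if the annealing step rejects the mutated one.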
  • 20. Selecting a Solution. The algorithm produces a set of alternative solutions. Each solution is optimal according to some criteria.
    Unsupervised Setting: Choose one solution randomly.
    Semi-supervised Setting: Each question has a portion of known clustering assignments. Select the solution with best entropy in known assignments.
    [Figure: five candidate solutions, labelled 1 to 5, plotted against the objectives f1 (maximize) and f2 (minimize).]
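The semi-supervised selection step can be sketched in Python. The slide does not define the entropy measure, so this sketch assumes a common one: the size-weighted entropy of the known labels inside each predicted cluster, where lower is better. The function names and the dictionary-based data layout are hypothetical.

```python
import math
from collections import Counter

def clustering_entropy(predicted, known):
    """Size-weighted entropy of known labels within each predicted cluster.

    predicted: dict mapping document id to predicted cluster id
    known:     dict mapping document id to its known (gold) cluster label
    """
    by_cluster = {}
    for doc, gold in known.items():
        by_cluster.setdefault(predicted[doc], []).append(gold)
    total = len(known)
    entropy = 0.0
    for golds in by_cluster.values():
        size = len(golds)
        h = -sum((c / size) * math.log2(c / size) for c in Counter(golds).values())
        entropy += (size / total) * h
    return entropy

def select_solution(solutions, known):
    """Pick the candidate solution with the lowest entropy on the known documents."""
    return min(solutions, key=lambda s: clustering_entropy(s, known))

known = {"d1": "a", "d2": "a", "d3": "b"}  # partial gold assignments
sol_1 = {"d1": 0, "d2": 0, "d3": 1, "d4": 1}
sol_2 = {"d1": 0, "d2": 1, "d3": 1, "d4": 0}
print(select_solution([sol_1, sol_2], known) is sol_1)
```

Only documents with a known assignment contribute to the score; the remaining documents (here `d4`) are clustered but not used for selection.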
  • 21. Section: Results
  • 22. Data. Clinical Inquiries from the Journal of Family Practice. 276 clinical questions (276 clustering tasks). Each question has an average of 5.89 documents.
    Example: Which treatments work best for hemorrhoids?
      1. Excision is the most effective treatment for thrombosed external hemorrhoids. [11289288] [12972967] [15486746]
      2. For prolapsed internal hemorrhoids, the best definitive treatment is traditional hemorrhoidectomy. [17054255] [17380367]
      3. Of nonoperative techniques, rubber band ligation produces the lowest rate of recurrence. [1442682] [16252313] [16235372]
  • 23. Results
    Distance Measure | AMOSA-clus1 best | AMOSA-clus1 average | AMOSA-clus2 best | AMOSA-clus2 average | K-means (baseline)
    Euclidean        | 0.190            | 0.249               | 0.177            | 0.235               | 0.240
    Cosine           | 0.187            | 0.231               | 0.177            | 0.230               | 0.237
    Unsupervised: Average solution is slightly better than baseline (differences statistically significant).
    Semi-supervised: Best solution is clearly better than baseline (differences statistically significant).
  • 24. Finding the Number of Clusters. (The results table from slide 23 is shown again.)
    AMOSA-clus1: Number of clusters as given by the original data. Average 2.38 clusters.
    AMOSA-clus2: Try several numbers of clusters and select the solution that optimises I-index and XB-index. Euclidean distance: average 2.34 clusters. Cosine distance: average 2.51 clusters.
  • 25. Finding the Number of Clusters
    $\text{error} = \frac{\sum_i (\text{target}_i - \text{predicted}_i)^2}{\#\text{ of questions}}$
    Method               | Error
    AMOSA-clus2 Cosine    | 1.90
    AMOSA-clus2 Euclidean | 1.91
    k=1                   | 3.91
    k=2                   | 2.14
    k=3                   | 2.38
    k=4                   | 4.61
    Rule of Thumb         | 2.56
    Cover                 | 1.98
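The error measure can be sketched in Python, reading the slide's formula as the sum of squared differences between target and predicted numbers of clusters, divided by the number of questions. The function name and the toy numbers are hypothetical and are not taken from the data set.

```python
def cluster_count_error(targets, predictions):
    """Mean squared difference between target and predicted numbers of clusters."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("need one prediction per question")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

targets = [2, 3, 2, 4]          # hypothetical gold numbers of clusters per question
always_two = [2] * len(targets)  # fixed-k baseline in the style of "k=2"
print(cluster_count_error(targets, always_two))
```

A fixed-k baseline such as the `k=2` row predicts the same number of clusters for every question, so its error depends only on how the target counts are distributed around that value.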
  • 26. Conclusions
    Conclusions: Unsupervised setting: slight improvement over k-means baseline. Semi-supervised setting: clear improvement over k-means baseline. Number of clusters: better than standard methods.
    Further Work: Test on other domains. Test using other cluster validity indices. Compare with other semi-supervised methods.
  • 27. Questions?