SlideShare ist ein Scribd-Unternehmen logo
1 von 45
Downloaden Sie, um offline zu lesen
Learning from (dis)similarity data
Nathalie Vialaneix
nathalie.vialaneix@inra.fr
http://www.nathalievialaneix.eu
MelbURN 2018
July 16th, 2018 - Melbourne, Australia
Nathalie Vialaneix | Learning from (dis)similarity data 1/24
What are my data like?
Nathalie Vialaneix | Learning from (dis)similarity data 2/24
A medieval social network [Boulet et al., 2008, Rossi et al., 2013]
corpus with more than 6,000
transactions, 3 centuries, all
related to
Castelnau Montratier
Nathalie Vialaneix | Learning from (dis)similarity data 3/24
A medieval social network [Boulet et al., 2008, Rossi et al., 2013]
corpus with more than 6,000
transactions, 3 centuries, all
related to
Castelnau Montratier
Individual
Transaction
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
Ratier
Ratier (II) Castelnau
Jean Laperarede
Bertrande Audoy
Gailhard Gourdon
Guy Moynes (de)
Pierre Piret (de)
Bernard Audoy
Hélène Castelnau
Guiral Baro
Bernard Audoy
Arnaud Bernard Laperarede
Guilhem Bernard Prestis
Jean Manas
Jean Laperarede
Jean Laperarede
Jean Roquefeuil
Jean Pojols
Ramond Belpech
Raymond Laperarede
Bertrand Prestis (de)
Ratier
(Monseigneur) Roquefeuil (de)
Guilhem Bernard Prestis
Arnaud Gasbert Castanhier (del)
Ratier (III) Castelnau
Pierre Prestis (de)
P Valeribosc
Guillaume Marsa
Berenguier Roquefeuil
Arnaud Bernard Perarede
Jean Roquefeuil
Arnaud I Audoy
Arnaud Bernard Perarede
bipartite network with more than 17,000
nodes (∼ 10,000 individuals)
What can we learn from the French
medieval society?
Nathalie Vialaneix | Learning from (dis)similarity data 3/24
Career paths [Olteanu and Villa-Vialaneix, 2015]
Survey “Génération 98”: labor market
status (9 categories) on more than
16,000 people having graduated in 1998
during 94 months. 1
1. Available thanks to Génération 1998 à 7 ans - 2005, [producer] CEREQ, [diffusion] Centre Maurice Halbwachs (CMH).
Nathalie Vialaneix | Learning from (dis)similarity data 4/24
Career paths [Olteanu and Villa-Vialaneix, 2015]
Survey “Génération 98”: labor market
status (9 categories) on more than
16,000 people having graduated in 1998
during 94 months. 1
How to cluster career paths into
homogeneous groups?
1. Available thanks to Génération 1998 à 7 ans - 2005, [producer] CEREQ, [diffusion] Centre Maurice Halbwachs (CMH).
Nathalie Vialaneix | Learning from (dis)similarity data 4/24
Career paths [Olteanu and Villa-Vialaneix, 2015]
Survey “Génération 98”: labor market
status (9 categories) on more than
16,000 people having graduated in 1998
during 94 months. 1
How to cluster career paths into
homogeneous groups?
It is all about distance...
χ2
dissimilarity emphasizes the
contemporary identical situations
Optimal-matching dissimilarities is
more focused on the sequences
similarities
[Needleman and Wunsch, 1970]
(or “edit distance”, “Levenshtein
distance”)
1. Available thanks to Génération 1998 à 7 ans - 2005, [producer] CEREQ, [diffusion] Centre Maurice Halbwachs (CMH).
Nathalie Vialaneix | Learning from (dis)similarity data 4/24
and then I went into NGS data...
and again...
distances are everywhere
Nathalie Vialaneix | Learning from (dis)similarity data 5/24
a collection of NGS data...
DNA barcoding
Astraptes fulgerator
optimal matching
(edit) distances to
differentiate species
Nathalie Vialaneix | Learning from (dis)similarity data 6/24
a collection of NGS data...
DNA barcoding
Astraptes fulgerator
optimal matching
(edit) distances to
differentiate species
Hi-C data
pairwise measure (similarity) related to
the physical 3D distance between loci in
the cell, at genome scale
Nathalie Vialaneix | Learning from (dis)similarity data 6/24
a collection of NGS data...
DNA barcoding
Astraptes fulgerator
optimal matching
(edit) distances to
differentiate species
Hi-C data
pairwise measure (similarity) related to
the physical 3D distance between loci in
the cell, at genome scale
Metagenomics
dissemblance between
samples is better
captured when
phylogeny between
species is taken into
account (unifrac
distances)
Nathalie Vialaneix | Learning from (dis)similarity data 6/24
Relational Self-Organizing Map
algorithm
Nathalie Vialaneix | Learning from (dis)similarity data 7/24
Basics on (standard) stochastic SOM
[Kohonen, 2001]
x
x
x
(xi)i=1,...,n ⊂ Rd
are affected to a unit f(xi) ∈ {1, . . . , U}
the grid is equipped with a “distance” between units: d(u, u ) and
observations affected to close units are close in Rd
every unit u corresponds to a prototype, pu (x) in Rd
Nathalie Vialaneix | Learning from (dis)similarity data 8/24
Basics on (standard) stochastic SOM
[Kohonen, 2001]
x
x
x
Iterative learning (assignment step): xi is picked at random within (xk )k
and affected to best matching unit:
ft
(xi) = arg min
u
xi − pt
u
2
Nathalie Vialaneix | Learning from (dis)similarity data 8/24
Basics on (standard) stochastic SOM
[Kohonen, 2001]
x
x
x
Iterative learning (representation step): all prototypes in neighboring units
are updated with a gradient descent like step:
pt+1
u ←− pt
u + µ(t)Ht
(d(f(xi), u))(xi − pt
u)
Nathalie Vialaneix | Learning from (dis)similarity data 8/24
Extension of SOM to data described by a kernel or a
dissimilarity
[Olteanu and Villa-Vialaneix, 2015]
Data: (xi)i=1,...,n ∈ Rd
1: Initialization:
randomly set p0
1
, ..., p0
U
in Rd
2: for t = 1 → T do
3: pick at random i ∈ {1, . . . , n}
4: Assignment
ft
(xi) = arg min
u=1,...,U
xi − pt
u
2
βt
u
5: for all u = 1 → U do Representation
6:
pt+1
u = pt
u + µ(t)Ht
(d(ft
(xi), u))
7: end for
8: end for
Nathalie Vialaneix | Learning from (dis)similarity data 9/24
Extension of SOM to data described by a kernel or a
dissimilarity
[Olteanu and Villa-Vialaneix, 2015]
Data: (xi)i=1,...,n ∈ X
1: Initialization:
randomly set p0
1
, ..., p0
U
in Rd
2: for t = 1 → T do
3: pick at random i ∈ {1, . . . , n}
4: Assignment
ft
(xi) = arg min
u=1,...,U
xi − pt
u
2
βt
u
5: for all u = 1 → U do Representation
6:
pt+1
u = pt
u + µ(t)Ht
(d(ft
(xi), u))
7: end for
8: end for
Nathalie Vialaneix | Learning from (dis)similarity data 9/24
Extension of SOM to data described by a kernel or a
dissimilarity
[Olteanu and Villa-Vialaneix, 2015]
Data: (xi)i=1,...,n ∈ X
1: Initialization:
p0
u ∼ n
i=1 β0
ui
xi (convex combination)
2: for t = 1 → T do
3: pick at random i ∈ {1, . . . , n}
4: Assignment
ft
(xi) = arg min
u=1,...,U
xi − pt
u
2
βt
u
5: for all u = 1 → U do Representation
6:
pt+1
u = pt
u + µ(t)Ht
(d(ft
(xi), u))
7: end for
8: end for
Nathalie Vialaneix | Learning from (dis)similarity data 9/24
Extension of SOM to data described by a kernel or a
dissimilarity
[Olteanu and Villa-Vialaneix, 2015]
Data: (xi)i=1,...,n ∈ X
1: Initialization:
p0
u ∼ n
i=1 β0
ui
xi (convex combination)
2: for t = 1 → T do
3: pick at random i ∈ {1, . . . , n}
4: Assignment
ft
(xi) = arg min
u=1,...,U
βt
uD(pt
u, xi)
5: for all u = 1 → U do Representation
6:
pt+1
u = pt
u + µ(t)Ht
(d(ft
(xi), u))
7: end for
8: end for
Nathalie Vialaneix | Learning from (dis)similarity data 9/24
Extension of SOM to data described by a kernel or a
dissimilarity
[Olteanu and Villa-Vialaneix, 2015]
Data: (xi)i=1,...,n ∈ X
1: Initialization:
p0
u ∼ n
i=1 β0
ui
xi (convex combination)
2: for t = 1 → T do
3: pick at random i ∈ {1, . . . , n}
4: Assignment
ft
(xi) = arg min
u=1,...,U
βt
uD(pt
u, xi)
5: for all u = 1 → U do Representation
6:
pt+1
u = pt
u + µ(t)Ht
(d(ft
(xi), u)) ∼ xi − pt
u
7: end for
8: end for
Nathalie Vialaneix | Learning from (dis)similarity data 9/24
Extension of SOM to data described by a kernel or a
dissimilarity
[Olteanu and Villa-Vialaneix, 2015]
Data: (xi)i=1,...,n ∈ X
1: Initialization:
p0
u ∼ n
i=1 β0
ui
xi (convex combination)
2: for t = 1 → T do
3: pick at random i ∈ {1, . . . , n}
4: Assignment
ft
(xi) = arg min
u=1,...,U
βt
u(βt
u) D(., xi) −
1
2
(βt
u) Dβt
u
5: for all u = 1 → U do Representation
6:
βt+1
u = βt
u + µ(t)Ht
(d(ft
(xi), u)) 1i − βt
u
7: end for
8: end for
Nathalie Vialaneix | Learning from (dis)similarity data 9/24
Note on drawbacks of RSOM
Two main drawbacks:
For T ∼ γn iterations, complexity of RSOM is O(γn3
U) (compared to
O(γUdn) for numeric) [Rossi, 2014]
Nathalie Vialaneix | Learning from (dis)similarity data 10/24
Note on drawbacks of RSOM
Two main drawbacks:
For T ∼ γn iterations, complexity of RSOM is O(γn3
U) (compared to
O(γUdn) for numeric) [Rossi, 2014]
Exact solution proposed in [Mariette et al., 2017] to reduce the
complexity to O(γn2
U) with additional storage memory of O(Un)
Nathalie Vialaneix | Learning from (dis)similarity data 10/24
Note on drawbacks of RSOM
Two main drawbacks:
For T ∼ γn iterations, complexity of RSOM is O(γn3
U) (compared to
O(γUdn) for numeric) [Rossi, 2014]
Exact solution proposed in [Mariette et al., 2017] to reduce the
complexity to O(γn2
U) with additional storage memory of O(Un)
For the non Euclidean case, the learning algorithm can be very
unstable (saddle points)
Nathalie Vialaneix | Learning from (dis)similarity data 10/24
Note on drawbacks of RSOM
Two main drawbacks:
For T ∼ γn iterations, complexity of RSOM is O(γn3
U) (compared to
O(γUdn) for numeric) [Rossi, 2014]
Exact solution proposed in [Mariette et al., 2017] to reduce the
complexity to O(γn2
U) with additional storage memory of O(Un)
For the non Euclidean case, the learning algorithm can be very
unstable (saddle points)
clip or flip? [Chen et al., 2009]
Nathalie Vialaneix | Learning from (dis)similarity data 10/24
SOMbrero
[Villa-Vialaneix, 2017]
SOMbrero is an R package implementing stochastic variants of SOM
for non vectorial data
Specifically well adapted to...
non expert use and teaching
use with graphs and obtain simplified representations
first release: March 2013; latest release: Feb. 2018 (version 1.2.3)
depends on R (version ≥ 3.1.0) http://www.r-project.org
and on several packages available on CRAN:
wordcloud, igraph, RColorBrewer, scatterplot3d, knitr, shiny
available at https://cran.r-project.org/package=SOMbrero
(licence GPL) and can be installed from inside R using
install.packages("SOMbrero")
Nathalie Vialaneix | Learning from (dis)similarity data 11/24
Training
mysom <- trainSOM(iris[ ,1:4], ...)
Options to train the SOM:
grid: square grid, with arbitrary width and length
distance between units: standard distances as in dist or "letremy" (Euclidean then
"maximum")
neighborhood relationship: Gaussian or "letremy"
prototypes: initialized randomly, with a PCA, with random observations from the training
sample
preprocessing: centering, scaling to unit variance or nothing
training: number of iterations, standard or Heskes’s assignment step
ft
(xi) ← arg min
u=1,...,U
U
u =1
Ht
(d(u, u )) xi − pt−1
u
2
Nathalie Vialaneix | Learning from (dis)similarity data 12/24
Diagnostic tools
quality(mysom)
topographic error: average frequency (over the samples) for which the
prototypes that comes closest is in the direct neighborhood on the
grid of the BMU
quantization error
Q =
1
n
n
i=1
xi − pf(xi)
2
Nathalie Vialaneix | Learning from (dis)similarity data 13/24
Plots...
plot(mysom,
what = c("observations", "prototypes", "add"),
type = ..., ...)
Nathalie Vialaneix | Learning from (dis)similarity data 14/24
Super-clustering
mysom.sc <- superClass(mysom)
Nathalie Vialaneix | Learning from (dis)similarity data 15/24
Start with SOMbrero
3 datasets corresponding to the three types of data that SOMbrero
can handle (iris, presidentielles2002 and lesmis, a graph from
“Les Misérables”)
Nathalie Vialaneix | Learning from (dis)similarity data 16/24
Start with SOMbrero
3 datasets corresponding to the three types of data that SOMbrero
can handle (iris, presidentielles2002 and lesmis, a graph from
“Les Misérables”)
comprehensive (HTML) vignettes included in the package and
available on the website
Nathalie Vialaneix | Learning from (dis)similarity data 16/24
Start with SOMbrero
3 datasets corresponding to the three types of data that SOMbrero
can handle (iris, presidentielles2002 and lesmis, a graph from
“Les Misérables”)
comprehensive (HTML) vignettes included in the package and
available on the website
Web User Interface (made with shiny) for using the package even if
you do not know R programming language (included in the package
with sombreroGUI() Tested and approved on an historian!
Nathalie Vialaneix | Learning from (dis)similarity data 16/24
RSOM for mining a medieval social network
with the heat kernel
Individual
Transaction
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
Ratier
Ratier (II) Castelnau
Jean Laperarede
Bertrande Audoy
Gailhard Gourdon
Guy Moynes (de)
Pierre Piret (de)
Bernard Audoy
Hélène Castelnau
Guiral Baro
Bernard Audoy
Arnaud Bernard Laperarede
Guilhem Bernard Prestis
Jean Manas
Jean Laperarede
Jean Laperarede
Jean Roquefeuil
Jean Pojols
Ramond Belpech
Raymond Laperarede
Bertrand Prestis (de)
Ratier
(Monseigneur) Roquefeuil (de)
Guilhem Bernard Prestis
Arnaud Gasbert Castanhier (del)
Ratier (III) Castelnau
Pierre Prestis (de)
P Valeribosc
Guillaume Marsa
Berenguier Roquefeuil
Arnaud Bernard Perarede
Jean Roquefeuil
Arnaud I Audoy
Arnaud Bernard Perarede
[Boulet et al., 2008]
Graph induced by clusters:
has nice relations with space and time
emphasizes leading people
has helped to identify problems in the
database (namesakes)
But: biggest communities are still
very complex
Nathalie Vialaneix | Learning from (dis)similarity data 17/24
RSOM for typology of Astraptes fulgerator from DNA
barcoding
Edit distances between DNA sequences [Olteanu and Villa-Vialaneix, 2015]
Almost perfect clustering (identifying a possible label error on one sample)
with (in addition) information on relations between species.
Nathalie Vialaneix | Learning from (dis)similarity data 18/24
RSOM for typology of school-to-time transitions
Edit distance between 12,000 categorical time series
Nathalie Vialaneix | Learning from (dis)similarity data 19/24
Also in SOMbrero: KORRESP
[Cottrell and Letrémy, 2005]
Data: contingency table T = (nij)ij with p rows and q columns transformed
into a numeric dataset X:
X =
columns rows
columns
rows
column profile
row profile
with
∀ i = 1, . . . , p and ∀ j = 1, . . . , q, xij =
nij
ni.
× n
n.j
Nathalie Vialaneix | Learning from (dis)similarity data 20/24
Also in SOMbrero: KORRESP
[Cottrell and Letrémy, 2005]
Data: contingency table T = (nij)ij with p rows and q columns transformed
into a numeric dataset X:
X =
columns rows
columns
rows
augmented
column profile
augmented row
profile
with
∀ i = 1, . . . , p and ∀ j = q + 1, . . . , q + p, xij = xk(i)+p,j with
k(i) = arg maxk=1,...,q xik
Nathalie Vialaneix | Learning from (dis)similarity data 20/24
Also in SOMbrero: KORRESP
[Cottrell and Letrémy, 2005]
Data: contingency table T = (nij)ij with p rows and q columns transformed
into a numeric dataset X:
X =
columns rows
columns
rows
augmented
column profile
augmented row
profile
column profile
row profile
assignment uses reduced profile
representation uses augmented profile
alternatively process row profiles and column profiles
Nathalie Vialaneix | Learning from (dis)similarity data 20/24
Also available in SOMbrero
mysom <- trainSOM(presidentielles2002 , type = "korresp")
plot(mysom, what = "obs", type = "names")
Nathalie Vialaneix | Learning from (dis)similarity data 21/24
SOMbrero
Madalina Olteanu,
Fabrice Rossi, Marie Cottrell,
Laura Bendhaïba and
Julien Boelaert
SOMbrero and mixKernel
Jérôme Mariette
adjclust
Pierre Neuvial, Guillem Rigail, Christophe Ambroise and
Shubham Chaturvedi
Nathalie Vialaneix | Learning from (dis)similarity data 22/24
Don’t miss useR! 2019
user2019.r-project.org
Nathalie Vialaneix | Learning from (dis)similarity data 23/24
Credits for pictures
Slide 2: Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae,
Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
Slide 3: Picture of Castelnau Montratier from
https://commons.wikimedia.org/wiki/File:
Place_Gambetta,_Castelnau-Montratier.JPG by Duch.seb CC BY-SA 3.0
Slide 4: image based on ENCODE project, by Darryl Leja (NHGRI), Ian Dunham
(EBI) and Michael Pazin (NHGRI)
Slide 6: Astraptes picture is from
https://www.flickr.com/photos/39139121@N00/2045403823/ by Anne Toal
(CC BY-SA 2.0), Hi-C experiment is taken from the article Matharu et al., 2015
DOI:10.1371/journal.pgen.1005640 (CC BY-SA 4.0) and metagenomics illustration is
taken from the article Sommer et al., 2010 DOI:10.1038/msb.2010.16 (CC BY-NC-SA
3.0)
Slide 12: TADS picture is from the article Fraser et al., 2015
DOI:10.15252/msb.20156492 (CC BY-SA 4.0)
Nathalie Vialaneix | Learning from (dis)similarity data 24/24
References
Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008).
Batch kernel SOM and related Laplacian methods for social network analysis.
Neurocomputing, 71(7-9):1257–1273.
Chen, Y., Garcia, E., Gupta, M., Rahimi, A., and Cazzanti, L. (2009).
Similarity-based classification: concepts and algorithm.
Journal of Machine Learning Research, 10:747–776.
Cottrell, M. and Letrémy, P. (2005).
How to use the Kohonen algorithm to simultaneously analyse individuals in a survey.
Neurocomputing, 63:193–207.
Kohonen, T. (2001).
Self-Organizing Maps, 3rd Edition, volume 30.
Springer, Berlin, Heidelberg, New York.
Mariette, J., Rossi, F., Olteanu, M., and Villa-Vialaneix, N. (2017).
Accelerating stochastic kernel som.
In Verleysen, M., editor, XXVth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine
Learning (ESANN 2017), pages 269–274, Bruges, Belgium. i6doc.
Needleman, S. and Wunsch, C. (1970).
A general method applicable to the search for similarities in the amino acid sequence of two proteins.
Journal of Molecular Biology, 48(3):443–453.
Olteanu, M. and Villa-Vialaneix, N. (2015).
On-line relational and multiple relational SOM.
Neurocomputing, 147:15–30.
Rossi, F. (2014).
How many dissimilarity/kernel self organizing map variants do we need?
Nathalie Vialaneix | Learning from (dis)similarity data 24/24
In Villmann, T., Schleif, F., Kaden, M., and Lange, M., editors, Advances in Self-Organizing Maps and Learning Vector
Quantization (Proceedings of WSOM 2014), volume 295 of Advances in Intelligent Systems and Computing, pages 3–23,
Mittweida, Germany. Springer Verlag, Berlin, Heidelberg.
Rossi, F., Villa-Vialaneix, N., and Hautefeuille, F. (2013).
Exploration of a large database of French notarial acts with social network methods.
Digital Medievalist, 9.
Villa-Vialaneix, N. (2017).
Stochastic self-organizing map variants with the R package SOMbrero.
In Lamirel, J., Cottrell, M., and Olteanu, M., editors, 12th International Workshop on Self-Organizing Maps and Learning Vector
Quantization, Clustering and Data Visualization (Proceedings of WSOM 2017), Nancy, France. IEEE.
Nathalie Vialaneix | Learning from (dis)similarity data 24/24

Weitere ähnliche Inhalte

Was ist angesagt?

'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysis'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysistuxette
 
Reproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishtuxette
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learningtuxette
 
Random Forest for Big Data
Random Forest for Big DataRandom Forest for Big Data
Random Forest for Big Datatuxette
 
About functional SIR
About functional SIRAbout functional SIR
About functional SIRtuxette
 
Convolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernelsConvolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernelstuxette
 
Investigating the 3D structure of the genome with Hi-C data analysis
Investigating the 3D structure of the genome with Hi-C data analysisInvestigating the 3D structure of the genome with Hi-C data analysis
Investigating the 3D structure of the genome with Hi-C data analysistuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
Generative models : VAE and GAN
Generative models : VAE and GANGenerative models : VAE and GAN
Generative models : VAE and GANSEMINARGROOT
 
Multiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing MapsMultiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing Mapstuxette
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learningUniversity of Groningen
 
Consensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplesConsensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplestuxette
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Frank Nielsen
 
When Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewWhen Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewMohamed Farouk
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksTomaso Aste
 
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time SeriesAutoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time SeriesGautier Marti
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetstuxette
 
Probability distributions for ml
Probability distributions for mlProbability distributions for ml
Probability distributions for mlSung Yub Kim
 

Was ist angesagt? (20)

'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysis'ACCOST' for differential HiC analysis
'ACCOST' for differential HiC analysis
 
Reproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfish
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
 
Random Forest for Big Data
Random Forest for Big DataRandom Forest for Big Data
Random Forest for Big Data
 
About functional SIR
About functional SIRAbout functional SIR
About functional SIR
 
Convolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernelsConvolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernels
 
Investigating the 3D structure of the genome with Hi-C data analysis
Investigating the 3D structure of the genome with Hi-C data analysisInvestigating the 3D structure of the genome with Hi-C data analysis
Investigating the 3D structure of the genome with Hi-C data analysis
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
Generative models : VAE and GAN
Generative models : VAE and GANGenerative models : VAE and GAN
Generative models : VAE and GAN
 
Deep Learning Opening Workshop - Domain Adaptation Challenges in Genomics: a ...
Deep Learning Opening Workshop - Domain Adaptation Challenges in Genomics: a ...Deep Learning Opening Workshop - Domain Adaptation Challenges in Genomics: a ...
Deep Learning Opening Workshop - Domain Adaptation Challenges in Genomics: a ...
 
Multiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing MapsMultiple kernel Self-Organizing Maps
Multiple kernel Self-Organizing Maps
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learning
 
Consensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samplesConsensual gene co-expression network inference with multiple samples
Consensual gene co-expression network inference with multiple samples
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
 
When Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewWhen Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying View
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering Networks
 
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time SeriesAutoregressive Convolutional Neural Networks for Asynchronous Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
 
Presentation
PresentationPresentation
Presentation
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasets
 
Probability distributions for ml
Probability distributions for mlProbability distributions for ml
Probability distributions for ml
 

Ähnlich wie Learning from (dis)similarity data

Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?Christian Robert
 
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Umberto Picchini
 
Approximate Bayesian model choice via random forests
Approximate Bayesian model choice via random forestsApproximate Bayesian model choice via random forests
Approximate Bayesian model choice via random forestsChristian Robert
 
Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional datatuxette
 
Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...Gautier Marti
 
Numerical methods for variational principles in traffic
Numerical methods for variational principles in trafficNumerical methods for variational principles in traffic
Numerical methods for variational principles in trafficGuillaume Costeseque
 
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...Guillaume Costeseque
 
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013Christian Robert
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015Christian Robert
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...tuxette
 
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...SMART Infrastructure Facility
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Regression on gaussian symbols
Regression on gaussian symbolsRegression on gaussian symbols
Regression on gaussian symbolsAxel de Romblay
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationArthur Mensch
 
Supervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal TransportSupervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal TransportSina Nakhostin
 
Wavelet-based Reflection Symmetry Detection via Textural and Color Histograms
Wavelet-based Reflection Symmetry Detection via Textural and Color HistogramsWavelet-based Reflection Symmetry Detection via Textural and Color Histograms
Wavelet-based Reflection Symmetry Detection via Textural and Color HistogramsMohamed Elawady
 

Ähnlich wie Learning from (dis)similarity data (20)

Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
 
Approximate Bayesian model choice via random forests
Approximate Bayesian model choice via random forestsApproximate Bayesian model choice via random forests
Approximate Bayesian model choice via random forests
 
Interpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional dataInterpretable Sparse Sliced Inverse Regression for digitized functional data
Interpretable Sparse Sliced Inverse Regression for digitized functional data
 
Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...Clustering Financial Time Series using their Correlations and their Distribut...
Clustering Financial Time Series using their Correlations and their Distribut...
 
Numerical methods for variational principles in traffic
Numerical methods for variational principles in trafficNumerical methods for variational principles in traffic
Numerical methods for variational principles in traffic
 
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
 
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximations
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...
 
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
SMART Seminar Series: "A journey in the zoo of Turing patterns: the topology ...
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
Congrès SMAI 2019
Congrès SMAI 2019Congrès SMAI 2019
Congrès SMAI 2019
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
Regression on gaussian symbols
Regression on gaussian symbolsRegression on gaussian symbols
Regression on gaussian symbols
 
Classification
ClassificationClassification
Classification
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
Supervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal TransportSupervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal Transport
 
Wavelet-based Reflection Symmetry Detection via Textural and Color Histograms
Wavelet-based Reflection Symmetry Detection via Textural and Color HistogramsWavelet-based Reflection Symmetry Detection via Textural and Color Histograms
Wavelet-based Reflection Symmetry Detection via Textural and Color Histograms
 

Mehr von tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNNtuxette
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practicetuxette
 

Mehr von tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNN
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practice
 

Kürzlich hochgeladen

Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptxAlMamun560346
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsNurulAfiqah307317
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 

Kürzlich hochgeladen (20)

Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening Designs
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 

Learning from (dis)similarity data

  • 1. Learning from (dis)similarity data Nathalie Vialaneix nathalie.vialaneix@inra.fr http://www.nathalievialaneix.eu MelbURN 2018 July 16th, 2018 - Melbourne, Australia Nathalie Vialaneix | Learning from (dis)similarity data 1/24
  • 2. What are my data like? Nathalie Vialaneix | Learning from (dis)similarity data 2/24
  • 3. A medieval social network [Boulet et al., 2008, Rossi et al., 2013] corpus with more than 6,000 transactions, 3 centuries, all related to Castelnau Montratier Nathalie Vialaneix | Learning from (dis)similarity data 3/24
  • 4. A medieval social network [Boulet et al., 2008, Rossi et al., 2013] corpus with more than 6,000 transactions, 3 centuries, all related to Castelnau Montratier Individual Transaction q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Ratier Ratier (II) Castelnau Jean Laperarede Bertrande Audoy Gailhard Gourdon Guy Moynes (de) Pierre Piret (de) Bernard Audoy Hélène Castelnau Guiral Baro Bernard Audoy Arnaud Bernard Laperarede Guilhem Bernard Prestis Jean Manas Jean Laperarede Jean Laperarede Jean Roquefeuil Jean Pojols Ramond Belpech Raymond Laperarede Bertrand Prestis (de) Ratier (Monseigneur) Roquefeuil (de) Guilhem Bernard Prestis Arnaud Gasbert Castanhier (del) Ratier (III) Castelnau Pierre Prestis (de) P Valeribosc Guillaume Marsa Berenguier Roquefeuil Arnaud Bernard Perarede Jean Roquefeuil Arnaud I Audoy Arnaud Bernard Perarede bipartite network with more than 17,000 nodes (∼ 10,000 individuals) What can we learn from the French medieval society? Nathalie Vialaneix | Learning from (dis)similarity data 3/24
  • 5. Career paths [Olteanu and Villa-Vialaneix, 2015] Survey “Génération 98”: labor market status (9 categories) on more than 16,000 people having graduated in 1998 during 94 months. 1 1. Available thanks to Génération 1998 à 7 ans - 2005, [producer] CEREQ, [diffusion] Centre Maurice Halbwachs (CMH). Nathalie Vialaneix | Learning from (dis)similarity data 4/24
  • 6. Career paths [Olteanu and Villa-Vialaneix, 2015] Survey “Génération 98”: labor market status (9 categories) on more than 16,000 people having graduated in 1998 during 94 months. 1 How to cluster career paths into homogeneous groups? 1. Available thanks to Génération 1998 à 7 ans - 2005, [producer] CEREQ, [diffusion] Centre Maurice Halbwachs (CMH). Nathalie Vialaneix | Learning from (dis)similarity data 4/24
  • 7. Career paths [Olteanu and Villa-Vialaneix, 2015] Survey “Génération 98”: labor market status (9 categories) on more than 16,000 people having graduated in 1998 during 94 months. 1 How to cluster career paths into homogeneous groups? It is all about distance... χ2 dissimilarity emphasizes the contemporary identical situations Optimal-matching dissimilarities is more focused on the sequences similarities [Needleman and Wunsch, 1970] (or “edit distance”, “Levenshtein distance”) 1. Available thanks to Génération 1998 à 7 ans - 2005, [producer] CEREQ, [diffusion] Centre Maurice Halbwachs (CMH). Nathalie Vialaneix | Learning from (dis)similarity data 4/24
  • 8. and then I went into NGS data... and again... distances are everywhere Nathalie Vialaneix | Learning from (dis)similarity data 5/24
  • 9. a collection of NGS data... DNA barcoding Astraptes fulgerator optimal matching (edit) distances to differentiate species Nathalie Vialaneix | Learning from (dis)similarity data 6/24
  • 10. a collection of NGS data... DNA barcoding Astraptes fulgerator optimal matching (edit) distances to differentiate species Hi-C data pairwise measure (similarity) related to the physical 3D distance between loci in the cell, at genome scale Nathalie Vialaneix | Learning from (dis)similarity data 6/24
  • 11. a collection of NGS data... DNA barcoding Astraptes fulgerator optimal matching (edit) distances to differentiate species Hi-C data pairwise measure (similarity) related to the physical 3D distance between loci in the cell, at genome scale Metagenomics dissemblance between samples is better captured when phylogeny between species is taken into account (unifrac distances) Nathalie Vialaneix | Learning from (dis)similarity data 6/24
  • 12. Relational Self-Organizing Map algorithm Nathalie Vialaneix | Learning from (dis)similarity data 7/24
  • 13. Basics on (standard) stochastic SOM [Kohonen, 2001] x x x (xi)i=1,...,n ⊂ Rd are affected to a unit f(xi) ∈ {1, . . . , U} the grid is equipped with a “distance” between units: d(u, u ) and observations affected to close units are close in Rd every unit u corresponds to a prototype, pu (x) in Rd Nathalie Vialaneix | Learning from (dis)similarity data 8/24
  • 14. Basics on (standard) stochastic SOM [Kohonen, 2001] x x x Iterative learning (assignment step): xi is picked at random within (xk )k and affected to best matching unit: ft (xi) = arg min u xi − pt u 2 Nathalie Vialaneix | Learning from (dis)similarity data 8/24
  • 15. Basics on (standard) stochastic SOM [Kohonen, 2001] x x x Iterative learning (representation step): all prototypes in neighboring units are updated with a gradient descent like step: pt+1 u ←− pt u + µ(t)Ht (d(f(xi), u))(xi − pt u) Nathalie Vialaneix | Learning from (dis)similarity data 8/24
  • 16. Extension of SOM to data described by a kernel or a dissimilarity [Olteanu and Villa-Vialaneix, 2015] Data: (xi)i=1,...,n ∈ Rd 1: Initialization: randomly set p0 1 , ..., p0 U in Rd 2: for t = 1 → T do 3: pick at random i ∈ {1, . . . , n} 4: Assignment ft (xi) = arg min u=1,...,U xi − pt u 2 βt u 5: for all u = 1 → U do Representation 6: pt+1 u = pt u + µ(t)Ht (d(ft (xi), u)) 7: end for 8: end for Nathalie Vialaneix | Learning from (dis)similarity data 9/24
  • 17. Extension of SOM to data described by a kernel or a dissimilarity [Olteanu and Villa-Vialaneix, 2015] Data: (xi)i=1,...,n ∈ X 1: Initialization: randomly set p0 1 , ..., p0 U in Rd 2: for t = 1 → T do 3: pick at random i ∈ {1, . . . , n} 4: Assignment ft (xi) = arg min u=1,...,U xi − pt u 2 βt u 5: for all u = 1 → U do Representation 6: pt+1 u = pt u + µ(t)Ht (d(ft (xi), u)) 7: end for 8: end for Nathalie Vialaneix | Learning from (dis)similarity data 9/24
  • 18. Extension of SOM to data described by a kernel or a dissimilarity [Olteanu and Villa-Vialaneix, 2015] Data: (xi)i=1,...,n ∈ X 1: Initialization: p0 u ∼ n i=1 β0 ui xi (convex combination) 2: for t = 1 → T do 3: pick at random i ∈ {1, . . . , n} 4: Assignment ft (xi) = arg min u=1,...,U xi − pt u 2 βt u 5: for all u = 1 → U do Representation 6: pt+1 u = pt u + µ(t)Ht (d(ft (xi), u)) 7: end for 8: end for Nathalie Vialaneix | Learning from (dis)similarity data 9/24
  • 19. Extension of SOM to data described by a kernel or a dissimilarity [Olteanu and Villa-Vialaneix, 2015] Data: (xi)i=1,...,n ∈ X 1: Initialization: p0 u ∼ n i=1 β0 ui xi (convex combination) 2: for t = 1 → T do 3: pick at random i ∈ {1, . . . , n} 4: Assignment ft (xi) = arg min u=1,...,U βt uD(pt u, xi) 5: for all u = 1 → U do Representation 6: pt+1 u = pt u + µ(t)Ht (d(ft (xi), u)) 7: end for 8: end for Nathalie Vialaneix | Learning from (dis)similarity data 9/24
  • 20. Extension of SOM to data described by a kernel or a dissimilarity [Olteanu and Villa-Vialaneix, 2015] Data: (xi)i=1,...,n ∈ X 1: Initialization: p0 u ∼ n i=1 β0 ui xi (convex combination) 2: for t = 1 → T do 3: pick at random i ∈ {1, . . . , n} 4: Assignment ft (xi) = arg min u=1,...,U βt uD(pt u, xi) 5: for all u = 1 → U do Representation 6: pt+1 u = pt u + µ(t)Ht (d(ft (xi), u)) ∼ xi − pt u 7: end for 8: end for Nathalie Vialaneix | Learning from (dis)similarity data 9/24
  • 21. Extension of SOM to data described by a kernel or a dissimilarity [Olteanu and Villa-Vialaneix, 2015] Data: (xi)i=1,...,n ∈ X 1: Initialization: p0 u ∼ n i=1 β0 ui xi (convex combination) 2: for t = 1 → T do 3: pick at random i ∈ {1, . . . , n} 4: Assignment ft (xi) = arg min u=1,...,U βt u(βt u) D(., xi) − 1 2 (βt u) Dβt u 5: for all u = 1 → U do Representation 6: βt+1 u = βt u + µ(t)Ht (d(ft (xi), u)) 1i − βt u 7: end for 8: end for Nathalie Vialaneix | Learning from (dis)similarity data 9/24
  • 22. Note on drawbacks of RSOM Two main drawbacks: For T ∼ γn iterations, complexity of RSOM is O(γn3 U) (compared to O(γUdn) for numeric) [Rossi, 2014] Nathalie Vialaneix | Learning from (dis)similarity data 10/24
  • 23. Note on drawbacks of RSOM Two main drawbacks: For T ∼ γn iterations, complexity of RSOM is O(γn3 U) (compared to O(γUdn) for numeric) [Rossi, 2014] Exact solution proposed in [Mariette et al., 2017] to reduce the complexity to O(γn2 U) with additional storage memory of O(Un) Nathalie Vialaneix | Learning from (dis)similarity data 10/24
  • 24. Note on drawbacks of RSOM Two main drawbacks: For T ∼ γn iterations, complexity of RSOM is O(γn3 U) (compared to O(γUdn) for numeric) [Rossi, 2014] Exact solution proposed in [Mariette et al., 2017] to reduce the complexity to O(γn2 U) with additional storage memory of O(Un) For the non Euclidean case, the learning algorithm can be very unstable (saddle points) Nathalie Vialaneix | Learning from (dis)similarity data 10/24
  • 25. Note on drawbacks of RSOM Two main drawbacks: For T ∼ γn iterations, complexity of RSOM is O(γn3 U) (compared to O(γUdn) for numeric) [Rossi, 2014] Exact solution proposed in [Mariette et al., 2017] to reduce the complexity to O(γn2 U) with additional storage memory of O(Un) For the non Euclidean case, the learning algorithm can be very unstable (saddle points) clip or flip? [Chen et al., 2009] Nathalie Vialaneix | Learning from (dis)similarity data 10/24
  • 26. SOMbrero [Villa-Vialaneix, 2017] SOMbrero is an R package implementing stochastic variants of SOM for non vectorial data Specifically well adapted to... non expert use and teaching use with graphs and obtain simplified representations first release: March 2013; latest release: Feb. 2018 (version 1.2.3) depends on R (version ≥ 3.1.0) http://www.r-project.org and on several packages available on CRAN: wordcloud, igraph, RColorBrewer, scatterplot3d, knitr, shiny available at https://cran.r-project.org/package=SOMbrero (licence GPL) and can be installed from inside R using install.packages("SOMbrero") Nathalie Vialaneix | Learning from (dis)similarity data 11/24
  • 27. Training mysom <- trainSOM(iris[ ,1:4], ...) Options to train the SOM: grid: square grid, with arbitrary width and length distance between units: standard distances as in dist or "letremy" (Euclidean then "maximum") neighborhood relationship: Gaussian or "letremy" prototypes: initialized randomly, with a PCA, with random observations from the training sample preprocessing: centering, scaling to unit variance or nothing training: number of iterations, standard or Heskes’s assignment step ft (xi) ← arg min u=1,...,U U u =1 Ht (d(u, u )) xi − pt−1 u 2 Nathalie Vialaneix | Learning from (dis)similarity data 12/24
  • 28. Diagnostic tools quality(mysom) topographic error: average frequency (over the samples) for which the prototypes that comes closest is in the direct neighborhood on the grid of the BMU quantization error Q = 1 n n i=1 xi − pf(xi) 2 Nathalie Vialaneix | Learning from (dis)similarity data 13/24
  • 29. Plots... plot(mysom, what = c("observations", "prototypes", "add"), type = ..., ...) Nathalie Vialaneix | Learning from (dis)similarity data 14/24
  • 30. Super-clustering mysom.sc <- superClass(mysom) Nathalie Vialaneix | Learning from (dis)similarity data 15/24
  • 31. Start with SOMbrero 3 datasets corresponding to the three types of data that SOMbrero can handle (iris, presidentielles2002 and lesmis, a graph from “Les Misérables”) Nathalie Vialaneix | Learning from (dis)similarity data 16/24
  • 32. Start with SOMbrero 3 datasets corresponding to the three types of data that SOMbrero can handle (iris, presidentielles2002 and lesmis, a graph from “Les Misérables”) comprehensive (HTML) vignettes included in the package and available on the website Nathalie Vialaneix | Learning from (dis)similarity data 16/24
  • 33. Start with SOMbrero 3 datasets corresponding to the three types of data that SOMbrero can handle (iris, presidentielles2002 and lesmis, a graph from “Les Misérables”) comprehensive (HTML) vignettes included in the package and available on the website Web User Interface (made with shiny) for using the package even if you do not know R programming language (included in the package with sombreroGUI() Tested and approved on an historian! Nathalie Vialaneix | Learning from (dis)similarity data 16/24
  • 34. RSOM for mining a medieval social network with the heat kernel Individual Transaction q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Ratier Ratier (II) Castelnau Jean Laperarede Bertrande Audoy Gailhard Gourdon Guy Moynes (de) Pierre Piret (de) Bernard Audoy Hélène Castelnau Guiral Baro Bernard Audoy Arnaud Bernard Laperarede Guilhem Bernard Prestis Jean Manas Jean Laperarede Jean Laperarede Jean Roquefeuil Jean Pojols Ramond Belpech Raymond Laperarede Bertrand Prestis (de) Ratier (Monseigneur) Roquefeuil (de) Guilhem Bernard Prestis Arnaud Gasbert Castanhier (del) Ratier (III) Castelnau Pierre Prestis (de) P Valeribosc Guillaume Marsa Berenguier Roquefeuil Arnaud Bernard Perarede Jean Roquefeuil Arnaud I Audoy Arnaud Bernard Perarede [Boulet et al., 2008] Graph induced by clusters: has nice relations with space and time emphasizes leading people has helped to identify problems in the database (namesakes) But: biggest communities are still very complex Nathalie Vialaneix | Learning from (dis)similarity data 17/24
  • 35. RSOM for typology of Astraptes fulgerator from DNA barcoding Edit distances between DNA sequences [Olteanu and Villa-Vialaneix, 2015] Almost perfect clustering (identifying a possible label error on one sample) with (in addition) information on relations between species. Nathalie Vialaneix | Learning from (dis)similarity data 18/24
  • 36. RSOM for typology of school-to-time transitions Edit distance between 12,000 categorical time series Nathalie Vialaneix | Learning from (dis)similarity data 19/24
  • 37. Also in SOMbrero: KORRESP [Cottrell and Letrémy, 2005] Data: contingency table T = (nij)ij with p rows and q columns transformed into a numeric dataset X: X = columns rows columns rows column profile row profile with ∀ i = 1, . . . , p and ∀ j = 1, . . . , q, xij = nij ni. × n n.j Nathalie Vialaneix | Learning from (dis)similarity data 20/24
  • 38. Also in SOMbrero: KORRESP [Cottrell and Letrémy, 2005] Data: contingency table T = (nij)ij with p rows and q columns transformed into a numeric dataset X: X = columns rows columns rows augmented column profile augmented row profile with ∀ i = 1, . . . , p and ∀ j = q + 1, . . . , q + p, xij = xk(i)+p,j with k(i) = arg maxk=1,...,q xik Nathalie Vialaneix | Learning from (dis)similarity data 20/24
  • 39. Also in SOMbrero: KORRESP [Cottrell and Letrémy, 2005] Data: contingency table T = (nij)ij with p rows and q columns transformed into a numeric dataset X: X = columns rows columns rows augmented column profile augmented row profile column profile row profile assignment uses reduced profile representation uses augmented profile alternatively process row profiles and column profiles Nathalie Vialaneix | Learning from (dis)similarity data 20/24
  • 40. Also available in SOMbrero mysom <- trainSOM(presidentielles2002 , type = "korresp") plot(mysom, what = "obs", type = "names") Nathalie Vialaneix | Learning from (dis)similarity data 21/24
  • 41. SOMbrero Madalina Olteanu, Fabrice Rossi, Marie Cottrell, Laura Bendhaïba and Julien Boelaert SOMbrero and mixKernel Jérôme Mariette adjclust Pierre Neuvial, Guillem Rigail, Christophe Ambroise and Shubham Chaturvedi Nathalie Vialaneix | Learning from (dis)similarity data 22/24
  • 42. Don’t miss useR! 2019 user2019.r-project.org Nathalie Vialaneix | Learning from (dis)similarity data 23/24
  • 43. Credits for pictures Slide 2: Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/ Slide 3: Picture of Castelnau Montratier from https://commons.wikimedia.org/wiki/File: Place_Gambetta,_Castelnau-Montratier.JPG by Duch.seb CC BY-SA 3.0 Slide 4: image based on ENCODE project, by Darryl Leja (NHGRI), Ian Dunham (EBI) and Michael Pazin (NHGRI) Slide 6: Astraptes picture is from https://www.flickr.com/photos/39139121@N00/2045403823/ by Anne Toal (CC BY-SA 2.0), Hi-C experiment is taken from the article Matharu et al., 2015 DOI:10.1371/journal.pgen.1005640 (CC BY-SA 4.0) and metagenomics illustration is taken from the article Sommer et al., 2010 DOI:10.1038/msb.2010.16 (CC BY-NC-SA 3.0) Slide 12: TADS picture is from the article Fraser et al., 2015 DOI:10.15252/msb.20156492 (CC BY-SA 4.0) Nathalie Vialaneix | Learning from (dis)similarity data 24/24
  • 44. References Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008). Batch kernel SOM and related Laplacian methods for social network analysis. Neurocomputing, 71(7-9):1257–1273. Chen, Y., Garcia, E., Gupta, M., Rahimi, A., and Cazzanti, L. (2009). Similarity-based classification: concepts and algorithm. Journal of Machine Learning Research, 10:747–776. Cottrell, M. and Letrémy, P. (2005). How to use the Kohonen algorithm to simultaneously analyse individuals in a survey. Neurocomputing, 63:193–207. Kohonen, T. (2001). Self-Organizing Maps, 3rd Edition, volume 30. Springer, Berlin, Heidelberg, New York. Mariette, J., Rossi, F., Olteanu, M., and Villa-Vialaneix, N. (2017). Accelerating stochastic kernel som. In Verleysen, M., editor, XXVth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2017), pages 269–274, Bruges, Belgium. i6doc. Needleman, S. and Wunsch, C. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3):443–453. Olteanu, M. and Villa-Vialaneix, N. (2015). On-line relational and multiple relational SOM. Neurocomputing, 147:15–30. Rossi, F. (2014). How many dissimilarity/kernel self organizing map variants do we need? Nathalie Vialaneix | Learning from (dis)similarity data 24/24
  • 45. In Villmann, T., Schleif, F., Kaden, M., and Lange, M., editors, Advances in Self-Organizing Maps and Learning Vector Quantization (Proceedings of WSOM 2014), volume 295 of Advances in Intelligent Systems and Computing, pages 3–23, Mittweida, Germany. Springer Verlag, Berlin, Heidelberg. Rossi, F., Villa-Vialaneix, N., and Hautefeuille, F. (2013). Exploration of a large database of French notarial acts with social network methods. Digital Medievalist, 9. Villa-Vialaneix, N. (2017). Stochastic self-organizing map variants with the R package SOMbrero. In Lamirel, J., Cottrell, M., and Olteanu, M., editors, 12th International Workshop on Self-Organizing Maps and Learning Vector Quantization, Clustering and Data Visualization (Proceedings of WSOM 2017), Nancy, France. IEEE. Nathalie Vialaneix | Learning from (dis)similarity data 24/24