Graph Signal Processing: Handwritten Digits
Recognition Via Community Detection
Abstract—Graph signal processing is an emerging field of research. When the structure of signals can be represented as a graph, this representation allows their inherent structure to be fully exploited. It has been shown that the normalized graph Laplacian matrix plays a major role in the characterization of a signal on a graph. In this paper we are interested in using the spectrum of this matrix to solve classical problems. More precisely, we aim to detect communities in order to recognize digit images. Indeed, we use the spectrum of the normalized graph Laplacian as a suitable method to detect two communities in a graph. We show that this method yields better results than several state-of-the-art algorithms. Then, we use the same spectrum to recognize handwritten digit images. We compare the spectral clustering method with some other classical algorithms, emphasizing the advantages of spectral clustering in community detection and semi-supervised classification applications.
Index Terms—Graph signal processing, Community detection,
Digit recognition, Normalized Graph Laplacian.
I. INTRODUCTION
In recent years, the analysis and processing of large-scale datasets using graphs has become very useful [8]. In fact, many kinds of data domains, such as social and economic networks, electric grids, neuronal networks and image databases, require a graph representation of their structure. Each of these structures usually carries information that flows between the different elements of the network. For example, in a neural network, a neuron is activated after receiving an electric excitation, and the activation of a neuron usually influences the nearby neurons. In the case of economic networks, we can consider an economic crisis as a flow that spreads from one bank to another. The need to represent these phenomena has led to the development of a new field: graph signal processing. Indeed, a continuous signal can be sampled at a specific frequency, and the resulting discrete signal is usually carried by a graph [10]. In this way, we obtain at the same time a representation of the structure of the network and of the information flowing through it. For instance, a sound signal can be represented on a path or a ring graph, whereas a picture is usually represented on a grid graph where each pixel is linked to its four or eight nearest neighbors [8]. Weighted graphs are particularly useful to represent the links and similarities between the different elements of a network. The advantage of signals on graphs is that they can be processed in a way analogous to classical signal processing [1].
One of the main applications of graph signal processing today appears in the field of artificial intelligence, and especially in machine learning. Community detection and digit recognition are among the best-known applications in this domain [4]. For community detection, many methods have been used, but graph signal processing using the spectral clustering method seems to be more efficient. Furthermore, for digit recognition, many algorithms are used nowadays to classify handwritten digits, such as the k-means algorithm [5], but graph signal processing can also be used for the same purpose.
In this paper, we present a method based on graph signal processing, known as spectral clustering, to solve the problem of community detection, and we provide a mathematical proof of this method. The same method is then applied to recognize handwritten digits from the MNIST database. A variant of this method relies on the smoothness properties of signals defined on graphs.
The remainder of the paper is organized as follows. In the next section, we provide some background on graph signal processing. In Section III, we present the spectral clustering method applied to both community detection and handwritten digit recognition. Then, we compare our results to the state of the art in order to identify both the advantages and the drawbacks of the spectral clustering method. Section V concludes the paper.
II. GRAPH SIGNAL PROCESSING
Let us first introduce some notation. We consider a weighted, simple, undirected graph G = (V, E) where V represents the set of vertices and E the set of edges. Without loss of generality, we consider V to be the set of integers between 1 and N = |V|. We equip G with an N × N adjacency matrix W defined as follows [9]:

$$W_{i,j} = \begin{cases} \text{the weight of the edge connecting } i \text{ and } j \\ 0 \text{ if no such edge exists} \end{cases} \qquad (1)$$
When the edge weights are not naturally defined by an application, one common way to define the weight of an edge connecting vertices i and j is via a similarity function such as a distance:

$$W_{i,j} = \mathrm{dist}(i, j) \qquad (2)$$

where $\mathrm{dist}(i, j)$ may represent a physical distance between two feature vectors describing the nodes i and j.
We also define the N × N diagonal degree matrix D as:

$$D_{i,i} = d_i = \sum_{k=1}^{N} W_{i,k} \qquad (3)$$
For instance, a social network can be represented by a weighted, simple, undirected graph, where the vertices are the individuals and the edges represent the friendship bond between two individuals. In this case, the degree matrix gives us an idea of how important the friendship links of each individual are.
We then introduce the non-normalized graph Laplacian L = D − W [9]. This matrix turns out to be of major importance, as it acts as a differentiation operator for a signal over a graph. Recall that a signal over a graph G is a vector $x \in \mathbb{R}^N$ whose ith component represents the function value at the ith vertex of V. The ith component of the Laplacian of such a signal is:

$$(Lx)(i) = \sum_{j=1}^{N} W_{i,j}\,[x(i) - x(j)] \qquad (4)$$
For example, in the case of the social network, a signal can represent a rumor: the individuals who received the rumor are given the value 1 and those who did not are given the value 0. We therefore obtain a binary signal on the graph.
When working with the L2-norm, it makes sense to use instead the normalized graph Laplacian, defined as [9]:

$$\mathcal{L} = D^{-\frac{1}{2}} \cdot L \cdot D^{-\frac{1}{2}} \qquad (5)$$
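The following minimal sketch (not from the paper) shows how the matrices introduced so far, W, D, L and the normalized Laplacian of Eq. (5), can be assembled with NumPy for a small, hypothetical weighted graph; the eigenvalues returned at the end should all lie in [0, 2].

```python
# Minimal sketch, not the authors' code: building W, D, L and the normalized
# Laplacian of Eq. (5) for a hypothetical 4-vertex weighted undirected graph.
import numpy as np

W = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 1.0, 0.2],
              [0.5, 1.0, 0.0, 1.0],
              [0.0, 0.2, 1.0, 0.0]])   # symmetric adjacency matrix, Eq. (1)

d = W.sum(axis=1)                      # degrees d_i = sum_k W_{i,k}, Eq. (3)
D = np.diag(d)                         # diagonal degree matrix
L = D - W                              # non-normalized graph Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt   # normalized graph Laplacian, Eq. (5)

eigvals, eigvecs = np.linalg.eigh(L_norm)  # ascending eigenvalues in [0, 2]
print(eigvals)                             # the smallest one is (numerically) 0
```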
Since the normalized (as well as the standard) graph Laplacian is a real-valued symmetric matrix, it can be diagonalized in an orthonormal basis. We denote a corresponding set of orthonormal eigenvectors by $\{\mu_l\}_{l=1,2,\dots,N}$ and the associated set of real, non-negative eigenvalues by $\{\lambda_l\}_{l=1,2,\dots,N}$, ordered from the smallest eigenvalue to the largest one. In particular, we have:

$$\mathcal{L}\mu_l = \lambda_l \mu_l \qquad (6)$$
It is well known that [6]:

$$0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_N = \lambda_{max} \le 2 \qquad (7)$$

The literature gives many results linking the eigenvalues to properties of the graph. As an example, the number of connected components of the graph is given by the multiplicity of the eigenvalue zero. For instance, if the graph is connected, the multiplicity of the eigenvalue zero is one. Also, the highest eigenvalue is equal to 2 if and only if the graph is bipartite.
The first eigenvector $\mu_1$ has a closed form given by the following formula [6]:

$$\mu_1(i) = \sqrt{\frac{d(i)}{\sum_{u \in V} d(u)}} \qquad (8)$$

In the case of regular graphs, all the vertices have the same degree, so $\mu_1$ is a constant vector.
Eigenvectors of the normalized graph Laplacian extend the principles of the Fourier transform of classical signal processing. To understand this connection, let us recall that the classical Fourier transform of a signal f is given by:

$$\hat{f}(\omega) = \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\, dx \qquad (9)$$

Having:

$$\frac{d^2}{dx^2}\, e^{-i\omega x} = -\omega^2\, e^{-i\omega x} \qquad (10)$$
Figure 1. A positive graph signal defined on the Petersen graph. The height of each blue bar represents the signal value at the vertex where the bar originates [8].

Figure 2. Representation of the Laplacian eigenvectors of the 16-vertex cycle graph. The eigenvectors exhibit the sinusoidal characteristics of the Fourier transform basis. Signals defined on this graph are equivalent to classical discrete, periodic signals.
We can notice that $e^{-i\omega x}$ is an eigenfunction of the Laplace operator $\frac{d^2}{dx^2}$ associated with the eigenvalue $-\omega^2$. On the other hand, we have:

$$\mathcal{L}\mu_l = \lambda_l \mu_l \qquad (11)$$

so the frequencies in classical signal processing are analogous to the eigenvalues of the normalized Laplacian in graph signal processing. Consequently, the Fourier transform $\hat{x}$ of a signal x on graph G is defined as [6]:

$$\hat{x}(\lambda_l) = \sum_{i=1}^{N} x(i)\,\mu_l^*(i) \qquad (12)$$

where $\mu_l^*$ represents the complex conjugate of the eigenvector $\mu_l$.
The inverse graph Fourier transform is defined as [8]:

$$x(i) = \sum_{l=1}^{N} \hat{x}(\lambda_l)\,\mu_l(i) \qquad (13)$$
Finally, to characterize the smoothness of a signal on a graph G, one can use the Dirichlet form [6]:

$$S(x) = \frac{x^{T} \mathcal{L}\, x}{\|x\|^2} = \frac{1}{\|x\|^2} \sum_{l=1}^{N} \lambda_l \,\langle x, \mu_l\rangle^2 \qquad (14)$$

The smaller S(x) is, the smoother the signal x is.
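As a small illustration (a sketch under the assumption that `L_norm`, `eigvals` and `eigvecs` are computed as in the previous snippet), the graph Fourier transform of Eqs. (12)-(13) and the Dirichlet form of Eq. (14) can be written as:

```python
# Sketch only: graph Fourier transform and smoothness, assuming real
# orthonormal eigenvectors `eigvecs` of the normalized Laplacian `L_norm`.
import numpy as np

def graph_fourier(x, eigvecs):
    """Forward GFT, Eq. (12): x_hat(lambda_l) = sum_i x(i) * mu_l(i)."""
    return eigvecs.T @ x

def inverse_graph_fourier(x_hat, eigvecs):
    """Inverse GFT, Eq. (13): x(i) = sum_l x_hat(lambda_l) * mu_l(i)."""
    return eigvecs @ x_hat

def smoothness(x, L_norm):
    """Dirichlet form, Eq. (14): S(x) = x^T L x / ||x||^2 (smaller = smoother)."""
    return float(x @ L_norm @ x) / float(x @ x)
```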
III. SPECTRAL CLUSTERING
A. Community Detection
Detecting communities is a problem with many variants in mathematics and computer science [4]. Over the years, several methods have been developed for data partitioning. One of these methods is known as “spectral clustering” and uses some properties of the normalized graph Laplacian. In this section, we are interested in detecting communities using only the properties of the normalized graph Laplacian spectrum.
Considering a population of individuals that can be partitioned into two communities according to their properties, our aim is to detect these two communities.
To achieve this, we consider a random graph characterized by two probabilities p ∈ [0, 1] and q ∈ [0, 1], where p is the probability of having a link between two individuals belonging to the same community and q is the probability of having a connection between two individuals belonging to different communities. The vertices of this graph are the individuals, the edges are the connections between them, and the weight of an edge is p if it is an intra-community link and q if it is an inter-community link.
We suppose that p ≥ q. If we consider the case where (p, q) = (1, 0), the error rate of detecting the two communities should tend to zero. The other limit case is p = q, where the error rate should tend to 0.5.
If we consider G to be the graph representing the population and $\mathcal{L}$ its normalized graph Laplacian, the signs of the components of $\mathcal{L}$'s second eigenvector allow us to partition the set of vertices into two different communities. Indeed, the vertices whose components in this second eigenvector have the same sign belong to the same community. We provide a proof of this principle.
The graph of the population is represented by its expected adjacency matrix. Instead of using a random graph with two different probabilities p and q, we can consider a complete simple graph where the weight of an edge linking two items belonging to the same community is equal to p and the weight of an edge linking two items belonging to different communities is equal to q. This matrix is the statistical mean of the adjacency matrices of the Erdős–Rényi random graphs generated with the probabilities p and q respectively [2].
The adjacency matrix A corresponding to this situation is given in block form as:

$$A = \begin{pmatrix} p \cdot (J_{N^*} - I_{N^*}) & q \cdot J_{N^*} \\ q \cdot J_{N^*} & p \cdot (J_{N^*} - I_{N^*}) \end{pmatrix} \qquad (15)$$

where we denote by $I_{N^*}$ the identity matrix of size $N^*$, by $J_{N^*}$ the $N^* \times N^*$ all-ones matrix, and by $N^*$ the cardinality of each community. Then, the first $N^*$ vertices belong to one community and the others belong to the other one.
The graph considered is regular, thus the degree matrix is a scalar multiple of the identity, given by:

$$D = \big((N^* - 1) \cdot p + N^* \cdot q\big) \cdot I_{2N^*} \qquad (16)$$
The graph Laplacian, denoted by L, is then given by:

$$L = \begin{pmatrix} N^*(p + q) \cdot I_{N^*} - p \cdot J_{N^*} & -q \cdot J_{N^*} \\ -q \cdot J_{N^*} & N^*(p + q) \cdot I_{N^*} - p \cdot J_{N^*} \end{pmatrix} \qquad (17)$$
Since we are interested in the second eigenvector of $\mathcal{L}$, and given that L and $\mathcal{L}$ have the same eigenspaces (the graph being regular, D is a scalar matrix), we only need to find the second eigenvector of L.
To achieve this, we compute the characteristic polynomial of L, denoted by $\chi_L$, in order to calculate the eigenvalues and then their corresponding eigenspaces (in particular, the eigenspace associated with the second eigenvalue).
We have:

$$\chi_L(x) = \det(L - x I_N) = \begin{vmatrix} M_1 & M_2 \\ M_2 & M_1 \end{vmatrix} \qquad (18)$$

where $M_1 = \big(N^*(p + q) - x\big) I_{N^*} - p J_{N^*}$ and $M_2 = -q J_{N^*}$.
Since $J_{N^*}$ and $I_{N^*}$ commute, $M_1$ and $M_2$ commute, and consequently we have:

$$\chi_L(x) = \det(M_1^2 - M_2^2) = \det(M_1 - M_2) \cdot \det(M_1 + M_2) \qquad (19)$$
We can easily verify that for p ≠ q we have:

$$\chi_L(x) = (p^2 - q^2)^{N^*} \cdot \chi_{J_{N^*}}\!\left(\frac{N^*(p + q) - x}{p - q}\right) \cdot \chi_{J_{N^*}}\!\left(\frac{N^*(p + q) - x}{p + q}\right) \qquad (20)$$

And knowing that:

$$\chi_{J_{N^*}}(x) \propto x^{N^* - 1} \cdot (x - N^*) \qquad (21)$$

we can prove that:

$$\lambda_2 = 2 \cdot N^* \cdot q \qquad (22)$$
Expressing $\ker(L - \lambda_2 I_N)$ explicitly, we find that the second eigenvector of L takes the form:

$$\mu_2 = [\underbrace{x_1, x_1, \dots, x_1}_{N^*}, \underbrace{x_2, x_2, \dots, x_2}_{N^*}] \qquad (23)$$
The graph is regular, so $\mu_1$ is a constant vector, and thanks to the orthogonality of the eigenvector basis $\{\mu_l\}_{l=1,2,\dots,N}$ we have:

$$\langle \mu_1, \mu_2 \rangle = 0 \qquad (24)$$

where we denote by $\langle \cdot, \cdot \rangle$ the classical dot product in a Euclidean space. We conclude that:

$$x_1 + x_2 = 0 \qquad (25)$$

and thus:

$$x_1 = -x_2 \qquad (26)$$

Thereby, the components of the second eigenvector having the same sign represent vertices belonging to the same community.
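A minimal sketch of the whole procedure (not the authors' implementation; the graph generator and all parameter values here are illustrative) is given below: two equal communities are planted in a random graph, and the sign of the second eigenvector of the normalized Laplacian recovers them.

```python
# Sketch of spectral bipartition by the sign of the second eigenvector.
import numpy as np

rng = np.random.default_rng(0)

def two_community_graph(n_star, p, q):
    """Random graph with two communities of size n_star (intra-community
    probability p, inter-community probability q)."""
    n = 2 * n_star
    labels = np.array([0] * n_star + [1] * n_star)
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    W = (upper | upper.T).astype(float)
    return W, labels

def spectral_bipartition(W):
    """Cluster by the sign of the second eigenvector of the normalized Laplacian."""
    d = W.sum(axis=1)
    inv_sqrt = np.zeros_like(d)
    inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])    # guard against isolated vertices
    D_inv_sqrt = np.diag(inv_sqrt)
    L_norm = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
    _, eigvecs = np.linalg.eigh(L_norm)          # eigenvalues in ascending order
    return (eigvecs[:, 1] > 0).astype(int)

W, labels = two_community_graph(100, p=0.3, q=0.05)
pred = spectral_bipartition(W)
error = min(np.mean(pred != labels), np.mean(pred == labels))   # up to label swap
print(f"error rate: {error:.3f}")
```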
We then compare the performance of this method with two other algorithms from the state of the art: the “Reichardt” [7] and “LFK” [3] algorithms.
We recall that the “Reichardt” method for community detection uses a greedy optimization of a modularity function Q, the aim being to compare the original network to a randomized one. The weights of the links in the randomized network depend on the probability of linking two nodes of the original network as follows [7]:
$$\begin{cases} a_{i,j} = w_{i,j} - p_{i,j} \\ b_{i,j} = \gamma_R\, p_{i,j} \end{cases} \qquad (27)$$
where $w_{i,j}$ represents the weight of the edge linking nodes i and j in the original network, $p_{i,j}$ represents the probability of linking two nodes in the original network, $\gamma_R$ is an optimization parameter, and finally $a_{i,j}$ and $b_{i,j}$ are the weights of the links in the randomized network.
The second method for detecting communities is the “LFK” method, which consists in detecting a community for each node of a network. To achieve this, we consider that a community is a subgraph which maximizes the fitness function of its nodes. The fitness function $F_{SG}$ of a subgraph is defined by [3]:

$$F_{SG} = \frac{K_{in}^{SG}}{(K_{in}^{SG} + K_{out}^{SG})^{\alpha}} \qquad (28)$$

where SG is the community, $K_{in}^{SG}$ is twice the number of internal links of the community SG, $K_{out}^{SG}$ is the number of edges connecting SG to the rest of the graph, and $\alpha$ is a positive real-valued parameter controlling the size of the community.
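For illustration, a direct transcription of Eq. (28) could look as follows (a sketch, assuming an unweighted symmetric adjacency matrix `W` given as a NumPy array; this is not the full LFK algorithm, only its fitness function):

```python
# Sketch: LFK fitness of a candidate community, Eq. (28).
import numpy as np

def lfk_fitness(W, community, alpha=1.0):
    """F_SG = K_in / (K_in + K_out)^alpha, where K_in is twice the number of
    internal edges of the community and K_out the number of edges leaving it."""
    inside = np.zeros(W.shape[0], dtype=bool)
    inside[list(community)] = True
    k_in = W[np.ix_(inside, inside)].sum()    # each internal edge counted twice
    k_out = W[np.ix_(inside, ~inside)].sum()  # edges crossing the boundary
    return k_in / (k_in + k_out) ** alpha
```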
We compare the performance of these two algorithms with that of the spectral clustering method in Figure 3. To do so, we consider random graphs with a fixed probability of linking two nodes in the same community and we vary the probability of linking two nodes belonging to different communities. The size of these graphs is 200 and the graphs are divided into two equal communities.
B. Handwritten Digit Recognition
Handwritten digit recognition is another application of graph signal processing and one of the most classic examples in the classification literature. It is an important task in semi-supervised machine learning that allows us to identify a digit based on a training database of digits [11]. In our case, we use the MNIST database of handwritten digits, which has a training set of 60,000 examples and a test set of 10,000 examples. All images in the MNIST database are of size 28 × 28.
The task is to classify an unlabeled digit into one of a fixed number of digit classes. Our first recognition test aims at identifying a digit i among two possible digits $i_1$ and $i_2$ under the conditions:

$$\begin{cases} i, i_1, i_2 \in [0..9] \\ i_1 \neq i_2 \\ i \in \{i_1, i_2\} \end{cases} \qquad (29)$$
Figure 3. Performance of three community detection algorithms on Erdős–Rényi random graphs of size 200 with a fixed probability of linking two individuals belonging to the same community (first case p = 0.3, second case p = 0.9) and a variable probability q of linking two individuals belonging to different communities. We notice that spectral clustering detects the two communities more accurately, with a lower error rate than the two other methods. The error rate also tends towards its maximum value more slowly for spectral clustering than for the Reichardt and LFK methods.
Figure 4. Representation of the individuals belonging to two communities in the basis (first eigenvector, second eigenvector) [10]. In the first case, the probability of linking two individuals belonging to the same community is p = 0.9 and the probability of linking two individuals from different communities is q = 0.1. In the second case, we consider p = q = 0.4.
We denote by $N_1$ the number of $i_1$ images and by $N_2$ the number of $i_2$ images, both taken from the MNIST training database. The graph is therefore composed of $N_1 + N_2 + 1$ vertices, corresponding to the training examples and to the digit i.
There are several ways to construct the graph; one option is to build a complete undirected graph. To represent the link between two digits (two vertices of the graph), we need a metric. In our tests we use the standard Euclidean metric.
Figure 5. Representation of the digits 2 and 5 in the basis of the first and second normalized graph Laplacian eigenvectors [10].
The weight of an edge linking two images $I$ and $I'$ (matrices of size $L \times C$) is $\frac{1}{1 + d^2(I, I')}$, with $d(I, I')$ defined as the Euclidean distance between $I$ and $I'$:

$$d(I, I') = \sqrt{\frac{1}{L \times C} \sum_{k,l} (I_{kl} - I'_{kl})^2} \qquad (30)$$

where $I_{kl}$ denotes the value of the pixel located at the kth row and lth column of the picture I, which is considered as a matrix of size L × C. In our case, we have L = C = 28.
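A possible implementation of this weighting scheme (a sketch, assuming the images are provided as a list or array of 28 × 28 NumPy matrices; names are illustrative) is:

```python
# Sketch: complete-graph weights W[i, j] = 1 / (1 + d^2(I_i, I_j)), Eq. (30).
import numpy as np

def image_distance(I, J):
    """Normalized Euclidean distance between two images of size L x C."""
    rows, cols = I.shape
    return np.sqrt(((I - J) ** 2).sum() / (rows * cols))

def complete_graph_weights(images):
    """Pairwise similarity weights; the diagonal is left at zero (no self-loops)."""
    n = len(images)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = image_distance(images[i], images[j])
            W[i, j] = W[j, i] = 1.0 / (1.0 + d ** 2)
    return W
```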
The graph carrying the digits $i_1$ and $i_2$ can be displayed in the basis composed of the first and second eigenvectors of the normalized graph Laplacian [10]. The first two eigenvectors represent the low frequencies of the signal and therefore give a clear idea of the distances between the different vertices. For instance, in the case of the two digit classes 2 and 5, the associated graph is shown in Figure 5.
As we can notice in Figure 5, it is possible to identify two communities, although some vertices belonging to different classes are very close to each other.
The second option to construct the graph carrying the digits $i_1$ and $i_2$ is to link each vertex only to its k nearest neighbors, with a weight equal to one. Here k is a user-defined constant and, for $k < N_1 + N_2$, the graph obtained is connected but not complete. This method is known as the KNN algorithm and it is interesting when handling very large graphs, as it yields a sparse matrix and therefore a better computational complexity.
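A simple version of this construction (a sketch, assuming a precomputed pairwise distance matrix `dist`; the symmetrization step is our choice to keep the adjacency matrix symmetric) is:

```python
# Sketch: binary k-nearest-neighbour graph.
import numpy as np

def knn_graph(dist, k):
    """Link every vertex to its k nearest neighbours with weight one."""
    n = dist.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist[i])
        neighbours = [j for j in order if j != i][:k]
        W[i, neighbours] = 1.0
    return np.maximum(W, W.T)   # keep the graph undirected
```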
Since there are two communities of digits, the problem is similar to that of community detection. Therefore, the first method we use to identify the digit i is an algorithm that computes the normalized graph Laplacian matrix, which is a good representation of the links between the digits in the graph. As in the two-community case discussed in the previous section, the components of the second eigenvector take two signs, and by comparing these signs with the sign of the component corresponding to the digit i, it is possible to identify whether i belongs to the $i_1$ or the $i_2$ community.
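A sketch of this sign-based classification is given below, under the assumption that `W` is the adjacency matrix of the graph built from the $N_1$ images of class $i_1$, the test digit, and the $N_2$ images of class $i_2$, in the vertex ordering of Eq. (31) (the $i_1$ block, then the test vertex, then the $i_2$ block).

```python
# Sketch: classify the test vertex by comparing signs in the second
# eigenvector of the normalized Laplacian (vertex order: i1 block, test, i2 block).
import numpy as np

def classify_by_sign(W, n1, n2):
    d = W.sum(axis=1)                        # complete graph: all degrees > 0
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_norm = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
    _, eigvecs = np.linalg.eigh(L_norm)
    v2 = eigvecs[:, 1]                       # second eigenvector
    test_sign = np.sign(v2[n1])              # component of the unlabeled digit
    sign_i1 = np.sign(np.median(v2[:n1]))    # dominant sign of the i1 block
    return "i1" if test_sign == sign_i1 else "i2"
```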
The second method to classify the digit i is to compute the smoothest signal on the graph of digits using the Dirichlet form [6]. The method consists of forming a signal x of size $N_1 + N_2 + 1$ such that:
$$x[j] = \begin{cases} 1 & \text{for } j \in [1, N_1] \\ \alpha & \text{for } j = N_1 + 1 \\ -1 & \text{for } j \in [N_1 + 2,\; N_1 + N_2 + 1] \end{cases} \qquad (31)$$

Table I. Error rate for different classification algorithms.

Algorithm      | k-means | Laplacian method | Smoothest signal
Error rate (%) | 11.96   | 9.65             | 6.36

The three algorithms are tested on the digits 2 and 5, where the cardinality of each digit group is 100. 100 tests are run for each algorithm.
$\alpha \in [-1, 1]$ is an unknown parameter corresponding to the signal component of the digit i. We aim at finding the value of α that makes the signal x as smooth as possible, i.e., that minimizes the Dirichlet form S(x). The sign of α allows us to identify i:
to identify i:
if α > 0 then i = i1
if α < 0 then i = i2
(32)
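A straightforward way to carry out this minimization (a sketch, assuming `L_norm` is the normalized Laplacian of the digit graph with the test vertex at position $N_1 + 1$; the grid search is our simplification, the optimum could also be obtained in closed form) is:

```python
# Sketch: grid search for the alpha in [-1, 1] minimizing the Dirichlet form
# of the signal x of Eq. (31).
import numpy as np

def smoothest_alpha(L_norm, n1, n2, num=201):
    x = np.concatenate([np.ones(n1), [0.0], -np.ones(n2)])   # Eq. (31), alpha unknown
    best_alpha, best_s = 0.0, np.inf
    for alpha in np.linspace(-1.0, 1.0, num):
        x[n1] = alpha
        s = float(x @ L_norm @ x) / float(x @ x)              # Dirichlet form S(x)
        if s < best_s:
            best_alpha, best_s = alpha, s
    return best_alpha    # sign(best_alpha) decides between i1 and i2, Eq. (32)
```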
The results that we obtain with this algorithm show that the smoothest signal is reached at the extreme values {−1, 1} of α; the minimum of S(x) is attained with:

$$\begin{cases} \alpha = 1 & \text{if } i = i_1 \\ \alpha = -1 & \text{otherwise} \end{cases} \qquad (33)$$
We tested the two methods (the Laplacian and “smoothest signal” methods) for the digits $i_1 = 2$ and $i_2 = 5$, with $N_1 = N_2 = 100$. Over 100 tests we obtain an error rate of 9.65% for the Laplacian method and of 6.36% for the smoothest signal method. Table I shows the error rates of the Laplacian method, the smoothest signal method and the k-means algorithm (k = 2), tested on the same MNIST database.
Our second recognition test performs the recognition of one digit i among l different possibilities $\{i_k\}_{k \in [1, l]}$ with $2 \le l \le 10$. For l = 3, we implemented an algorithm that first recognizes the digit closest to i between the two digits $i_1$ and $i_2$, which is the same as a classification with two classes. The digit chosen from $\{i_1, i_2\}$ as the closest to i is then compared with $i_3$ using the same two-class algorithm, and the result allows us to identify the digit closest to i in $\{i_1, i_2, i_3\}$. Since $i \in \{i_1, i_2, i_3\}$, this method allows us to recognize the digit i. The same principle can be applied to the case l > 3.
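This sequential elimination can be sketched as follows (the two-class routine `closest_of_two` is a hypothetical stand-in for either of the two-class methods described above):

```python
# Sketch: pairwise tournament over l candidate digit classes.
def recognize_among(test_image, candidates, closest_of_two):
    """Compare the test image against the candidate classes two at a time;
    `closest_of_two(test_image, a, b)` must return a or b."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        winner = closest_of_two(test_image, winner, challenger)
    return winner
```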
The third and last test is to identify several digits simultaneously among l different possibilities $\{i_k\}_{k \in [1, l]}$ with $2 \le l \le 10$. In this case we start by building a complete graph whose vertices include both the training and the test sets of digits. Then, for each digit to be classified, we use the algorithm of the second recognition test, which recognizes one digit among l different possibilities.
C. Results Analysis
In both applications that we have presented, a random graph is generated. In the first case, we use an Erdős–Rényi graph [9] with a probability p of linking two vertices in the same community and a probability q of linking two vertices from different communities. The condition p > q is needed in order to distinguish the two communities. The cardinality is the same in both communities and the adjacency matrix is generated in block form, where the first diagonal block represents the first community and the second diagonal block the second community.

Table II. Execution time (in seconds) of the different algorithms as a function of the graph size.

Graph size | Reichardt algorithm | LFK algorithm | Spectral clustering method
20         | 0.002               | 0.0869        | 9.98e-04
50         | 0.0026              | 0.1451        | 0.0021
100        | 0.0101              | 0.5483        | 0.0103
200        | 0.0582              | 2.5145        | 0.0540
For the second case, the digit images are chosen randomly from the MNIST handwritten digit database. To construct the graph we use two methods. The first is based on generating a complete graph whose weights are computed using the Euclidean distance. The second method involves the KNN construction, and the graph generated is binary and not complete.
In both applications, we generate the normalized Laplacian matrix and use the smoothness property of low-frequency signals. In the case of the Laplacian, we use the second eigenvector, since it is related to the second smallest eigenvalue. In the first application, the detection of the two communities using the Laplacian (spectral clustering) is more accurate than some of the classical methods such as the Reichardt or LFK algorithms: the error rate of spectral clustering is lower and increases more slowly with q compared to the other two algorithms.
On the other hand, this method is essentially designed to detect two communities. In that setting, it remains a very efficient algorithm, whereas the other two algorithms tested (Reichardt and LFK) detect in some cases more than two communities. Moreover, the spectral clustering algorithm is faster than the two other algorithms (Reichardt and LFK). Table II compares the execution times of the different algorithms tested.
IV. FUTURE WORK
Our future work consists in partitioning a set of data into more than two communities, in order to generalize the principle of spectral clustering. To this end, we are considering applying the method presented in this paper hierarchically on a data set. By applying our algorithm m times, we will be able to recover up to $2^m$ communities. The number of iterations of our algorithm will be chosen by optimizing a well-defined stability criterion.
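As a sketch of what such a hierarchical scheme could look like (this is our illustration of the idea, not an implemented result; `spectral_bipartition` refers to the routine sketched in Section III-A), the two-way split can simply be applied recursively:

```python
# Sketch: recursive two-way spectral splits yielding up to 2^m communities.
import numpy as np

def recursive_split(W, vertices, depth):
    """Split `vertices` of the graph W into at most 2**depth communities."""
    if depth == 0 or len(vertices) < 2:
        return [list(vertices)]
    sub = W[np.ix_(vertices, vertices)]
    labels = spectral_bipartition(sub)       # sign of the second eigenvector
    left = [v for v, lab in zip(vertices, labels) if lab == 0]
    right = [v for v, lab in zip(vertices, labels) if lab == 1]
    if not left or not right:                # the split failed to separate anything
        return [list(vertices)]
    return (recursive_split(W, left, depth - 1)
            + recursive_split(W, right, depth - 1))
```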
V. CONCLUSION
This paper shows the importance of processing signals on graphs and the advantages of using the normalized graph Laplacian for this processing. The low frequencies of the Laplacian indeed carry interesting information about the structure of the graph it represents. The use of a metric to characterize the distance between vertices gives us a better idea of the links between the different vertices of the graph. These properties of the Laplacian matrix are used in two classical applications of the machine learning literature: community detection and handwritten digit recognition. Spectral clustering allows us, in the first case, to detect two unlabeled communities based only on the structure of the graph and, in the second case, to classify one or several digits into labeled classes of training digits based on the similarities between the training digits and the digits to be classified. Spectral clustering achieves better efficiency than some classical algorithms, but it remains limited by the fact that it can detect only two communities at once.
Therefore, the study of the normalized graph Laplacian spectrum provides solutions to several common applications. Many other use cases can be treated using the graph Laplacian method and deserve to be considered in further studies.
VI. ACKNOWLEDGEMENTS
The authors would like to thank Vincent Gripon, associate professor at Telecom Bretagne, for giving us the opportunity to work in the domain of graph signal processing and for helping us to improve our work with his constructive comments.
REFERENCES
[1] Ameya Agaskar and Yue M Lu. A spectral graph uncertainty principle.
Information Theory, IEEE Transactions on, 59(7):4338–4356, 2013.
[2] P. Erdős and Alfréd Rényi. On the existence of a factor of degree one of a connected random graph. Acta Mathematica Hungarica, 17(3-4):359–368, 1966.
[3] Andrea Lancichinetti, Santo Fortunato, and János Kertész. Detecting the
overlapping and hierarchical community structure in complex networks.
New Journal of Physics, 11(3):033015, 2009.
[4] Erwan Le Martelot and Chris Hankin. Fast multi-scale community
detection based on local criteria within a multi-threaded algorithm. arXiv
preprint arXiv:1301.0955, 2013.
[5] Mohammad Norouzi and David J Fleet. Cartesian k-means. In Computer
Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on,
pages 3017–3024. IEEE, 2013.
[6] Michael G Rabbat and Vincent Gripon. Towards a spectral characteriza-
tion of signals supported on small-world networks. In Acoustics, Speech
and Signal Processing (ICASSP), 2014 IEEE International Conference
on, pages 4793–4797. IEEE, 2014.
[7] Jörg Reichardt and Stefan Bornholdt. Statistical mechanics of commu-
nity detection. Physical Review E, 74(1):016110, 2006.
[8] David Shuman, Sunil K Narang, Pascal Frossard, Antonio Ortega, Pierre
Vandergheynst, et al. The emerging field of signal processing on graphs:
Extending high-dimensional data analysis to networks and other irregular
domains. Signal Processing Magazine, IEEE, 30(3):83–98, 2013.
[9] David I Shuman, Pierre Vandergheynst, and Pascal Frossard. Distributed
signal processing via chebyshev polynomial approximation. arXiv
preprint arXiv:1111.5239, 2011.
[10] Daniel A Spielman. Spectral graph theory and its applications. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 29–38. IEEE, 2007.
[11] M Van Breukelen, Robert PW Duin, David MJ Tax, and JE Den Hartog.
Handwritten digit recognition by combined classifiers. Kybernetika,
34(4):381–386, 1998.
Weitere ähnliche Inhalte

Was ist angesagt?

Paper id 21201483
Paper id 21201483Paper id 21201483
Paper id 21201483
IJRAT
 
4 image segmentation through clustering
4 image segmentation through clustering4 image segmentation through clustering
4 image segmentation through clustering
IAEME Publication
 
Spme 2013 segmentation
Spme 2013 segmentationSpme 2013 segmentation
Spme 2013 segmentation
Qujiang Lei
 
Kernel methods in machine learning
Kernel methods in machine learningKernel methods in machine learning
Kernel methods in machine learning
butest
 
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:
Sean Golliher
 

Was ist angesagt? (19)

Self-organizing map
Self-organizing mapSelf-organizing map
Self-organizing map
 
Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a...
Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a...Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a...
Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a...
 
Training and Inference for Deep Gaussian Processes
Training and Inference for Deep Gaussian ProcessesTraining and Inference for Deep Gaussian Processes
Training and Inference for Deep Gaussian Processes
 
Paper id 21201483
Paper id 21201483Paper id 21201483
Paper id 21201483
 
4 image segmentation through clustering
4 image segmentation through clustering4 image segmentation through clustering
4 image segmentation through clustering
 
regions
regionsregions
regions
 
Graph Coloring using Peer-to-Peer Networks
Graph Coloring using Peer-to-Peer NetworksGraph Coloring using Peer-to-Peer Networks
Graph Coloring using Peer-to-Peer Networks
 
Spme 2013 segmentation
Spme 2013 segmentationSpme 2013 segmentation
Spme 2013 segmentation
 
PERFORMANCE AND COMPLEXITY ANALYSIS OF A REDUCED ITERATIONS LLL ALGORITHM
PERFORMANCE AND COMPLEXITY ANALYSIS OF A REDUCED ITERATIONS LLL ALGORITHMPERFORMANCE AND COMPLEXITY ANALYSIS OF A REDUCED ITERATIONS LLL ALGORITHM
PERFORMANCE AND COMPLEXITY ANALYSIS OF A REDUCED ITERATIONS LLL ALGORITHM
 
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco ScutariBayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
Bayes Nets Meetup Sept 29th 2016 - Bayesian Network Modelling by Marco Scutari
 
Convolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernelsConvolutional networks and graph networks through kernels
Convolutional networks and graph networks through kernels
 
Kernel methods in machine learning
Kernel methods in machine learningKernel methods in machine learning
Kernel methods in machine learning
 
Extract the ancient letters from decorated
Extract the ancient letters from decoratedExtract the ancient letters from decorated
Extract the ancient letters from decorated
 
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
IRJET-  	  Finding Dominant Color in the Artistic Painting using Data Mining ...IRJET-  	  Finding Dominant Color in the Artistic Painting using Data Mining ...
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
 
06 Vector Visualization
06 Vector Visualization06 Vector Visualization
06 Vector Visualization
 
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction:
 
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERINGA COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
A COMPARATIVE STUDY ON DISTANCE MEASURING APPROACHES FOR CLUSTERING
 
Paper id 26201482
Paper id 26201482Paper id 26201482
Paper id 26201482
 
Comparison of Various RCNN techniques for Classification of Object from Image
Comparison of Various RCNN techniques for Classification of Object from ImageComparison of Various RCNN techniques for Classification of Object from Image
Comparison of Various RCNN techniques for Classification of Object from Image
 

Ähnlich wie GraphSignalProcessingFinalPaper

Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
ArchiLab 7
 
On algorithmic problems concerning graphs of higher degree of symmetry
On algorithmic problems concerning graphs of higher degree of symmetryOn algorithmic problems concerning graphs of higher degree of symmetry
On algorithmic problems concerning graphs of higher degree of symmetry
graphhoc
 
Image Processing
Image ProcessingImage Processing
Image Processing
Tuyen Pham
 
Distributed coloring with O(sqrt. log n) bits
Distributed coloring with O(sqrt. log n) bitsDistributed coloring with O(sqrt. log n) bits
Distributed coloring with O(sqrt. log n) bits
Subhajit Sahu
 
20070823
2007082320070823
20070823
neostar
 

Ähnlich wie GraphSignalProcessingFinalPaper (20)

Colloquium.pptx
Colloquium.pptxColloquium.pptx
Colloquium.pptx
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Bandits
 
Learning Graph Representation for Data-Efficiency RL
Learning Graph Representation for Data-Efficiency RLLearning Graph Representation for Data-Efficiency RL
Learning Graph Representation for Data-Efficiency RL
 
Line
LineLine
Line
 
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...Drobics, m. 2001:  datamining using synergiesbetween self-organising maps and...
Drobics, m. 2001: datamining using synergiesbetween self-organising maps and...
 
On algorithmic problems concerning graphs of higher degree of symmetry
On algorithmic problems concerning graphs of higher degree of symmetryOn algorithmic problems concerning graphs of higher degree of symmetry
On algorithmic problems concerning graphs of higher degree of symmetry
 
ON ALGORITHMIC PROBLEMS CONCERNING GRAPHS OF HIGHER DEGREE OF SYMMETRY
ON ALGORITHMIC PROBLEMS CONCERNING GRAPHS OF HIGHER DEGREE OF SYMMETRYON ALGORITHMIC PROBLEMS CONCERNING GRAPHS OF HIGHER DEGREE OF SYMMETRY
ON ALGORITHMIC PROBLEMS CONCERNING GRAPHS OF HIGHER DEGREE OF SYMMETRY
 
MODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTION
MODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTIONMODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTION
MODIFIED LLL ALGORITHM WITH SHIFTED START COLUMN FOR COMPLEXITY REDUCTION
 
Linear algebra havard university
Linear algebra havard universityLinear algebra havard university
Linear algebra havard university
 
Traveling Salesman Problem in Distributed Environment
Traveling Salesman Problem in Distributed EnvironmentTraveling Salesman Problem in Distributed Environment
Traveling Salesman Problem in Distributed Environment
 
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENTTRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
 
Graceful labelings
Graceful labelingsGraceful labelings
Graceful labelings
 
Image Processing
Image ProcessingImage Processing
Image Processing
 
Distributed coloring with O(sqrt. log n) bits
Distributed coloring with O(sqrt. log n) bitsDistributed coloring with O(sqrt. log n) bits
Distributed coloring with O(sqrt. log n) bits
 
Quantum persistent k cores for community detection
Quantum persistent k cores for community detectionQuantum persistent k cores for community detection
Quantum persistent k cores for community detection
 
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKSEVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS
 
240401_JW_labseminar[LINE: Large-scale Information Network Embeddin].pptx
240401_JW_labseminar[LINE: Large-scale Information Network Embeddin].pptx240401_JW_labseminar[LINE: Large-scale Information Network Embeddin].pptx
240401_JW_labseminar[LINE: Large-scale Information Network Embeddin].pptx
 
Color Image Watermarking Application for ERTU Cloud
Color Image Watermarking Application for ERTU CloudColor Image Watermarking Application for ERTU Cloud
Color Image Watermarking Application for ERTU Cloud
 
Using spectral radius ratio for node degree
Using spectral radius ratio for node degreeUsing spectral radius ratio for node degree
Using spectral radius ratio for node degree
 
20070823
2007082320070823
20070823
 

GraphSignalProcessingFinalPaper

  • 1. 1 Graph Signal Processing: Handwritten Digits Recognition Via Community Detection Abstract—Graph signal processing is an emerging field of research. When the structure of signals can be represented as a graph, it allows to fully exploit their inherent structure. It has been shown that the normalized graph Laplacian matrix plays a major role in the characterization of a signal on a graph. In this paper we are interested in using the spectrum of this matrix to solve classical problems. More precisely, we aim to detect communities in order to recognize image digits. Indeed, we use the spectrum of the normalized graph Laplacian as a suitable method to detect two communities in a graph. We show that this method has better results than many algorithms of the state of art. Then, we use the same spectrum to recognize handwritten digit images. We compare the spectral clustering method with some other classical algorithms, emphasizing the advantages of spectral clustering in community detection and semi-supervised classification applications. Index Terms—Graph signal processing, Community detection, Digit recognition, Normalized Graph Laplacian. I. INTRODUCTION During the recent years, the analysis and processing of large-scale datasets using graphs has become very useful [8]. In fact, many kinds of data domains such as social and economic networks, electric grids, neuronal networks and images databases require a graph representation of their structure. Each of these structures usually carries out infor- mation that flow between different elements of the network. For example, in a neural network, a neuron is activated after receiving an electric excitation, and the activation of a neuron usually influences the nearby neurons. In the case of economic networks, we can consider the economic crisis as a flow that spreads from one bank to another. This need to represent these phenomena has lead to the development of a new field: the graph signal processing. Indeed, a continuous signal can be sampled according to a specific frequency and the sampled discrete signal that is obtained is usually carried out on a graph [10]. By this way, we obtain at the same time a representation of the structure of the network as well as of the information flowing through it. For instance, a sound signal can be represented on a linear or a ring graph. However, a picture is usually represented on a grid graph where each pixel is linked to its four or eigth nearest neighbors [8]. Weighted graphs are particularly used to represent the links and similarities between the different elements of a network. The advantage about signals on graphs is the fact that they can be processed in a way analogous to the classical processing [1]. One of the main applications of graph signal processing today appear in the field of artificial intelligence and especially in machine learning. Community detection and digit recogni- tion are among the most known applications in this domain [4]. For community detection many methods have been used but the graph signal processing using the spectral clustering method seems to be more efficient. Further more, for digit recognition, there are many algorithms that are used nowadays to classify handwritten digits such as the k-means algorithm [5] but the graph signal processing can also be used for the same purpose. 
In this paper, we present a method based on the graph signal processing and known as spectral clustering to resolve the problem of community detection and provide a mathematical proof of this method. The same method is then applied to recognize handwritten digits from the MNIST data base. This method takes a variant based on the smoothness properties of signals defined on graphs . The remainder of the paper is as follow. In the next section, we provide some background from the graph signal processing domain. In section III, we present the method of spectral clustering applied to both community detection and handwritten digits recognition problems. Then, we discuss our results compared to the state of art in order to identify both the advantages and the drawbacks of the spectral clustering method. Section V concludes the paper. II. GRAPH SIGNAL PROCESSING Let us introduce notations first. We consider a weighted, simple, undirected graph G = (E, V ) where E represents the set of edges and V the set of vertices. Without a loss of generality, we consider V to be the set of integers between 1 and N = |E|. We equip G with a N × N adjacency matrix W defined as follows [9]: Wi,j The weight of the edge connecting i and j 0 if no such edge exists (1) When the edge weights are not naturally defined by an application, one common way to define the weight of an edge connecting vertices i and j is via a similarity function like a distance : Wi,j = dist(i, j) (2) Where dist(i, j) may represent a physical distance between two feature vectors describing the nodes i and j. We also define the N × N diagonal degree matrix D as: Di,i = di = N k=0 Wi,k (3) For instance, a social network can be represented by a weighted, simple, undirected graph, where the vertices are the individuals and the edges represent the friendship bond between two individuals. In this case, the degree matrix gives
  • 2. 2 us an idea about how important are the friendship links of each individual. We then introduce the non-normalized graph Laplacian L D−W [9]. This matrix turns to have a major importance as it stands for a differentiation operator for a signal over a graph. We remind that a signal over a graph G is a vector x ∈ RN where the ith component of the vector x represents the function value at the ith vertex of V . The Laplacian’s ith component of such a signal is the vector: (Lx)(i) = N j=1 Wi,j[x(i) − x(j)] (4) For example in the case of the social network, a signal can represent a rumor: the individuals who received the rumor are given the value 1 and those who did not are given the value 0. We obtain therefore a binary signal on graph. When working with L2-norm, it makes sense to use instead the normalized graph Laplacian, defined as [9]: L = D− 1 2 · L · D− 1 2 (5) Since the normalized (or standard) graph Laplacian is a real valued symmetric matrix, it can be diagonalized using an orthonormal basis. We denote a corresponding set of orthonormal eigenvectors by {µl}l=1,2,...,N and the set of associated real, non-negative eigenvalues by {λl}l=1,2,...,N when those are ordered from the lowest eigenvalue to the largest one. In particular, we have: Lµl = λlµl (6) It is well-known that [6].: 0 = λ1 ≤ λ2 ≤ ... ≤ λN λmax ≤ 2 (7) The literature gives many results binding eigenvalues with properties of the graph. As an example, the number of con- nected components of the graph is given by the multiplicity of the eigenvalue zero. For instance, if the graph is connected, the multiplicity of the eingenvalue zero is one. Also the highest eigenvalue is equal to 2 if and only if the graph is bipartite. The first eigenvector µ1 has a closed-form given by the following formula [6]: µ1(i) = d(i) u∈V d(u) (8) In the case of regular graphs, all the vertices have the same degree so µ1 is a constant vector. Eigenvectors of the graph normalized Laplacian extend the principles of the Fourier transform for classical signal processing. To understand this bindings, let us recall that the classical Fourier transform of a signal f is given by: ˜f(ω) = ˆ R f(x).e−iωx dx (9) Having: d2 dx2 e−iωx = −ω2 e−iωx (10) Figure 1. A positive graph signal defined on the Peterson graph. The height of each blue bar represents the signal value at the vertex where the bar originates [8]. Figure 2. Representation of the 16 cycle graph Laplacian eigenvectors. The eigenvectors exhibit the sinusoidal characteristics of the Fourier Transform basis. Signals defined on this graph are equivalent to classical descrete, periodic signals. We can notice that e−iωx is the eigenvector of the Laplace operator d2 dx2 associated with the eigenvalue −ω2 . On the other hand, we have: Lµl = λlµl (11) so the frequencies in classical signal processing are analo- gous to the eigenvalues of the normalized Laplacian in graph signal processing. Consequently the Fourier transform ˆx of a signal x on graph G is defined as [6]: ˜x(λl) = N i=1 x(i)µ∗ l (i) (12) Where u∗ l represents the complexe conjugate of the eigen- vector ul. And the inverse graph Fourier transform is defined as [8]: x(i) = N l=1 ˜x(λl)µl(i) (13) Finally, to characterize the smoothness of a signal on graph G, one can use the Dirichlet form [6]: S(x) = xτ Lx x 2 = 1 x 2 N l=1 λl(< x, µl >)2 (14) The smaller is S(x), the smoother is the signal x.
  • 3. 3 III. SPECTRAL CLUSTERING A. Community Detection Detecting communities is a problem with many variants in mathematics and computer science [4]. Over the years, several methods have been developed for the data partitioning . One of these methods is known as “spectral clustering” which uses some properties of the normalized graph Laplacian. In this section, we will be interested in detecting communities using only the normalized graph Laplacian spectrum’s properties. Considering a population of individuals that can be par- tionned into two communities according to their properties, our aim is to detect these two communities. To achieve this, we consider a random graph characterized by two probabilities p ∈ [0 1] and q ∈ [0 1], where p is the probability to have a link between two individuals belonging to the same community and q is the probability to have a connection between two individuals belonging to different communities. The vertices of this graph are the individuals, the edges are the connections between them and the weigth of the edges are p if it is a inter-community link or q if it concerns a intra-community link. We suppose that p ≥ q. If we consider the case where (p, q) = (1, 0), the error rate of detecting the two communities should tend to zero. The other limit case is when p = q where the error rate should tend to 0.5. If we consider G to be the graph representing the popula- tion, and L its normalized graph Laplacian, the sign of L’s second eigenvector components allows us to partition the set of vertices into two different communities. Indeed, the L’s second eigenvector components with the same sign belong to the same community. We provide a proof of this principle: The graph of the population is represented by a statistic adjacency matrix. Instead of using a random graph with two different probabilities p and q, we can consider a complete simple graph where the weight of the edges linking two items belonging to the same community is equal to p and the weight of the edges linking two items belonging to different communities is equal to q. This matrix is the statistical mean of the adjacency matrices representing the Erdos Renyi random graphs generated respectively with the probabilities p and q [2]. The adjacency matrix A corresponding to this situation is given in block form as: A = p · (JN∗ − IN ) q · JN∗ q · JN∗ p · (JN∗ − IN∗ ) (15) where we denote by IN∗ the identity matrix of size N∗ , JN∗ the N∗ × N∗ matrix containing one in each component and N∗ the cardinality of each community. Then, the first N∗ vertices belong to a specific community and the others belong to a different one. The graph considered is regular, thus the degree matrix is a scalar one given by: D = ((N∗ − 1) · p + N∗ q) · IN∗ (16) The graph Laplacian denoted by L is then given by: L = N∗ (p + q) · IN∗ − p · JN∗ q · JN∗ q · JN∗ N∗ · (p + q) · IN∗ − p · JN∗ (17) Since we are interested in finding the second eigenvector of L, and given that both L and L have the same eigenspaces, we only need to have the second eigenvector of L. To achieve this, we have to compute the characteristic polynomial of L denoted by χL in order to calculate the eigen- values and then their corresponding eigenspaces (in particular, the eigenspace associated to the second eigenvalue). 
We have: χL(x) = det(L − x.IN ) = M1 M2 M2 M1 (18) where M1 = N∗ (p + q − x).IN∗ − pJN∗ and M2 = qJN∗ Since JN∗ and IN∗ commute, M1 and M2 commute, and consequently we have: χL(x) = det(M2 1 − M2 2 ) = det(M1 − M2) · det(M1 + M2) (19) We can easily verify that for p = q we have: χL(x) = (p2 −q2 )N∗ ·χJN∗ ·( N∗ (p + q − x) p − q )·χJN∗ ·( N∗ (p + q − x) p + q ) (20) And knowing that: χJN∗ (x) ∝ xN∗ −1 · (x − N∗ ) (21) we can prove that: λ2 = 2 · N∗ · q (22) . Expressing explicitly ker(L − λ2.IN∗ ), we find that the second eigenvector of L takes the form: µ2 = [x1, x1, ..., x1 N∗ , x2, x2, ..., x2] N∗ (23) The graph is regular so µ1 is a constant vector and thanks to the orthogonality of the eigenvector basis {µl}l=1,2,...,N we have: < µ1, µ2 >= 0 (24) where we denote by < •, • > the classical dot product in an euclidean space. we conclude that: x1 + x2 = 0 (25) and then: x1 = −x2. (26) Thereby, the components of the second eigenvector having the same sign represent vertices belonging to the same com- munity.
  • 4. 4 Then, we compare the performance of this method with two other algorithms from the state of art: “Reichardt” [7] and “LFK” algorithms [3]. We recall that “Reichardt” method for community detection uses a greedy optimization of a modularity function Q and the aim is to compare the original network to a randomized one. The weight of the links in the randomized network depends on the probability of linking two nodes belonging to the original network as follow [7]: ai,j = wi,j − pi,j bi,j = γRpi,j (27) where wi,j represents the weight of the edge linking the nodes i and j in the original network, pi,j represents the probability of linking two nodes in the original network, γR is a parameter for optimization and finally ai,j and bi,j are the weights of the links in the randomized network. The second method for detecting communities is the “LFK” method which consists in detecting a community for each node of a network. To achieve this, we consider that a community is a subgraph which maximizes the fitness function of its nodes. The fitness function FSG of a subgraph is defined by [3]: FSG = KSG in (KSG in + KSG out)α (28) where SG is the community, Kin is twice the number of the internal links in the community SG, Kout is the number of the edges connecting SG with the rest of the graph and α is a positive real-valued parameter, controlling the size of the community. We compare the performances of these two algorithms with the spectral clustering method in figure 3. For that, we consider random graphs with a fixed probability of linking two nodes in the same community and we vary the probability of linking two nodes belonging to different communities. The size of these graphs is equal to 200 and the graphs are divided into two equal communities. B. Handwritten Digit Recognition Hanwritten digit recognition is another application of graph signal processing and one of the most classic examples in the classification literature. It is an important task in semi- supervised machine learning that allows us to identify a digit based on a training database of digits [11]. In our case, we used the MNIST database of handwritten digits which has a training set of 60, 000 examples, and a test set of 10, 000 examples. All images in the MNIST database are of size 28 × 28. The task is to classify an unlabeled digit into one of a fixed number of digit classes. Our first recognition test is aimed at identifying a digit i amongst two possibilities of digits i1 and i2 with the conditions :    i, i1, i2 ∈ [0..9] i1 = i2 i ∈ {i1, i2} (29) 0 0.05 0.1 0.15 0.2 0.25 0.3 0 0.2 0.4 0.6 0.8 q ErrorRate 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0 0.2 0.4 0.6 0.8 q ErrorRate Reichard LFK Spectral Clustering Figure 3. Computing the performance of three community detection algo- rithms using ErdosRenyi random graphs of size 200 with a fixed probability of linking two individuals belonging to the same community (first case p=0.3, second case p=0.9) and a variable probability q of linking two individuals belonging to the different communities. We notice that the spectral clustering is more accurate to detect the two communities with a lower error rate than the two other methods. The tendency towards the maximum value of error rate is lower in the case of the spectral clustering than in the case of Reichadt and LFK methods. 
−0.11 −0.105 −0.1 −0.095 −0.09 −0.085 −0.2 −0.1 0 0.1 0.2 First eigenVector SecondeigenVector −0.12 −0.115 −0.11 −0.105 −0.1 −0.095 −0.09 −0.085 −0.4 −0.2 0 0.2 0.4 First eigenVector SecondeigenVector first community second community Figure 4. Representation of the individuals belonging to two communities in the basis (first eigenvector, second eigenvector)[10]. In the first case the prob- ability of linking two individuals belonging to the same community is p=0.9 and the probability of linking two individuals from different communities is q = 0.1. In the second case: we consider p=q=0.4. We denote N1 the number of i1 images and N2 the number of i2 images; both taken from the MNIST training database. The graph is therefore composed of N1 + N2 + 1 vertices corresponding to the training examples and to the digit i. There are several possibilities to construct the graph of which one option is to make a complete undirected graph. To represent the link between two digits (2 vertices in the graph), we need a metric. In our tests we used the standard euclidean metric. The weight of an edge linking two images
  • 5. 5 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 −0.2 −0.15 −0.1 −0.05 0 0.05 0.1 0.15 0.2 First eigenVector SecondeigenVector 2 5 Figure 5. Representation of the digits 2 and 5 in the normalized graph Laplacian eigenvectors basis (first eigenvector, second eigenvector)[10] I and I (matrices of size L × C) is 1 1+d2(I,I ) with d(I, I ) defined as the euclidean distance between I and I d(I, I ) = 1 L × C k,l (Ikl − Ikl)2. (30) Where Ikl denote the pixel’s value localted at the kth row and the lth column of the picture I wich is considered as a matrix of size L × C. In our case, we have: L = C = 28. The graph carrying out the digits i1 and i2 can be shown in the basis composed of the first and second normalized graph Laplacian eigenvectors [10]. The first two eigenvectors represent low frequencies of the signal and give therefore a clear idea of the distance between the different vertices. For instance, in the case of two classes of digits (2 and 5), the associated graph is shown in figure 5. As we can notice in figure 5, it is possible to identify two communities although there are some vertices that belong to different classes but that are very close to each other. The second option to construct the graph carrying out the digits i1 and i2 is to link each vertex only to its k-nearest neighbors with a weight equal to one. k is a user defined constant and the graph obtained is connected but not complete for k < N1+N2. This method is known as the KNN algorithm and it is interesting when handling very large size graphs as it allows to have a sparse matrix and therefore a better complexity. Since there are two communities of digits, the problem is similar to the one of community detection. Therefore the first method that we use to identify the digit i is an algorithm that produces the normalized graph Laplacian matrix which is a good representation of the links between the digits in the graph. As in the case of two communities discussed in the previous section, we obtain two signs in the second eigenvector and by comparing these signs with the sign of the component corresponding to the digit i, it is possible to identify whether i belongs to i1 or to i2 community. The second method to classify the digit i is to compute the smoothest signal on graph associated to the graph of digits using the Dirichlet form [6]. The method consists of forming a signal x of size N1 + N2 + 1 such as: Algorithm k-means Laplacian method Smoothest signal Error rate(%) 11.96 9.65 6.36 Table I ERROR RATE FOR DIFFERENT CLASSIFICATION ALGORITHMS The three algorithms are tested for the digits 2 and 5 where the cardinality of each digit group is 100. 100 tests are realized for each algorithm. x[j] =    1 for j ∈ [1 N1] α for j = N1 + 1 −1 for j ∈ [N1 + 2, N1 + N2 + 1] (31) α ∈ [−1, 1] is an unknown parameter corresponding to the signal’s component for the digit i. We aim at finding the value α that makes the signal x the smoothest possible i.e. that minimizes S(x), the Dirichlet form. The sign of α allows us to identify i: if α > 0 then i = i1 if α < 0 then i = i2 (32) The results that we obtain with this algorithm show that we get the smoothest signal particularly for the values {−1, 1} of α; we have a minimum of S(x) with : α = 1 if i = i1 α = −1 otherwise (33) We tested the two methods (Laplacian and the “smoothest signal” methods) for the digits i1 = 2 and i2 = 5 and with N1 = N2 = 100. For 100 tests we obtain an error rate of 9.65% for the Laplacian method and of 6.36% for the smoothest signal method. 
Table I shows the error rates of the Laplacian method, of the smoothest-signal method and of the k-means algorithm (k = 2), all tested on the same MNIST database.

Our second recognition test performs the recognition of one digit i among l different possibilities {i_k}, k ∈ [1, l], with 2 ≤ l ≤ 10. For l = 3, we implemented an algorithm that first recognizes the digit closest to i between the two digits i1 and i2, which is exactly the two-class classification problem described above. The digit chosen from {i1, i2} as the closest to i is then compared with i3 using the same two-class algorithm, and the result identifies the closest digit to i within {i1, i2, i3}. Since i ∈ {i1, i2, i3}, this elimination procedure recognizes the digit i. The same principle applies to the case l > 3 (a sketch of this pairwise-elimination scheme is given below).

The third and last test is to identify several digits simultaneously among l different possibilities {i_k}, k ∈ [1, l], with 2 ≤ l ≤ 10. For this case we start by building a complete graph whose vertices include both the training and the test sets of digits. Then, for each digit to be classified, we apply the algorithm of the second recognition test, which recognizes one digit among l possibilities.
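A minimal sketch of the pairwise-elimination scheme, reusing any two-class routine such as the classify_by_laplacian_sign function sketched above (the helper name recognize_among is illustrative, not from the paper):

def recognize_among(unknown, training_sets, two_class_classifier):
    # training_sets: dict mapping each candidate label i_k to its list of
    # training images. The current winner is compared with each remaining
    # candidate using the two-class classifier, as described for l = 3.
    labels = list(training_sets)
    current = labels[0]
    for challenger in labels[1:]:
        winner = two_class_classifier(training_sets[current],
                                      training_sets[challenger],
                                      unknown)   # returns 1 or 2
        if winner == 2:
            current = challenger
    return current

For the third test, the same routine is simply called once for each digit to be classified.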
C. Results Analysis

In both applications presented above, a random graph is generated. In the first case, we use an Erdős-Rényi-type graph [9] with a probability p of linking two vertices of the same community and a probability q of linking two vertices from different communities. The condition p > q is needed in order to distinguish the two communities. The two communities have the same cardinality, and the adjacency matrix is generated in block form, where the first diagonal block represents the first community and the second diagonal block represents the second community. In the second case, the digit images are chosen randomly from the MNIST handwritten database. To construct the graph we use two methods: the first generates a complete graph whose weights are computed from the Euclidean distance; the second relies on the KNN algorithm and produces a binary, non-complete graph.

In both applications, we build the normalized Laplacian matrix and exploit the smoothness of low-frequency signals. In the Laplacian method, we use the second eigenvector, since it is associated with the second smallest eigenvalue. In the first application, the detection of the two communities using the Laplacian (spectral clustering) is more accurate than some classical methods such as Reichardt's or LFK's algorithms: the error rate of spectral clustering is lower and increases more slowly with q than for the other two algorithms. On the other hand, this method is essentially designed to detect two communities; within that setting it remains a very efficient algorithm, whereas the other two algorithms tested (Reichardt and LFK) detect in some cases more than two communities. Moreover, the spectral clustering algorithm is faster than the two other algorithms. Table II compares the execution times of the different algorithms tested (a sketch of the two-community random-graph generation used here is given below).

Table II
EXECUTION TIME (IN SECONDS) FOR DIFFERENT ALGORITHMS AS A FUNCTION OF THE GRAPH SIZE

Graph size    Reichardt algorithm    LFK algorithm    Spectral clustering method
20            0.002                  0.0869           9.98e-04
50            0.0026                 0.1451           0.0021
100           0.0101                 0.5483           0.0103
200           0.0582                 2.5145           0.0540
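A minimal sketch of this two-community random-graph generation and of its spectral detection, assuming NumPy; the function names two_community_graph and detect_two_communities are illustrative, and the equal-size block construction follows the description of the adjacency matrix given above:

import numpy as np

def two_community_graph(n_per_community, p, q, rng=None):
    # Symmetric binary adjacency matrix in block form: edges appear with
    # probability p inside each community and q between communities (p > q).
    rng = np.random.default_rng() if rng is None else rng
    n = 2 * n_per_community
    probs = np.full((n, n), q)
    probs[:n_per_community, :n_per_community] = p
    probs[n_per_community:, n_per_community:] = p
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # upper triangle, no self-loops
    return (upper | upper.T).astype(float)

def detect_two_communities(A):
    # Split the vertices according to the sign of the second eigenvector of
    # the normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    deg = A.sum(axis=1)
    deg[deg == 0] = 1.0                               # guard against isolated vertices
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)                       # ascending eigenvalues
    return vecs[:, 1] >= 0                            # boolean community labels

# Example: with p = 0.9 and q = 0.1 the two blocks are recovered almost perfectly.
A = two_community_graph(50, p=0.9, q=0.1)
labels = detect_two_communities(A)

Applying the same bipartition recursively to each recovered community is one possible route towards the 2^m communities targeted in the future work below.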
IV. FUTURE WORK

Our future work consists in partitioning a data set into more than two communities, in order to generalize the principle of spectral clustering. To that end, we plan to apply the method presented in this paper hierarchically on a data set: by applying our algorithm m times, we will be able to recover 2^m communities. The number of iterations of our algorithm will be chosen by optimizing a well-defined stability criterion.

V. CONCLUSION

This paper shows the importance of processing signals on graphs and the advantages of using the normalized graph Laplacian in this processing. The low frequencies of the Laplacian indeed carry interesting information about the structure of the graph it represents. The use of a metric to characterize the distance between vertices gives a better idea of the links between the different vertices of the graph. These properties of the Laplacian matrix are used in two classical applications of the machine learning literature: community detection and handwritten digit recognition. Spectral clustering allows us, in the first case, to detect two unlabeled communities based only on the structure of the graph and, in the second case, to classify one or many digits into labeled classes of training digits based on the similarities between the training set and the digits to be classified.

Spectral clustering achieves better accuracy than some classical algorithms but remains limited by the fact that it can detect only two communities at once. The study of the normalized graph Laplacian spectrum therefore provides solutions to some frequent applications; many other use cases can be treated with the graph Laplacian method and deserve to be considered in further studies.

VI. ACKNOWLEDGEMENTS

The authors would like to thank Vincent Gripon, associate professor at Telecom Bretagne, for giving us the opportunity to work in the domain of graph signal processing and for helping us to improve our work through his constructive comments.

REFERENCES

[1] Ameya Agaskar and Yue M. Lu. A spectral graph uncertainty principle. IEEE Transactions on Information Theory, 59(7):4338–4356, 2013.
[2] P. Erdős and Alfréd Rényi. On the existence of a factor of degree one of a connected random graph. Acta Mathematica Hungarica, 17(3-4):359–368, 1966.
[3] Andrea Lancichinetti, Santo Fortunato, and János Kertész. Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics, 11(3):033015, 2009.
[4] Erwan Le Martelot and Chris Hankin. Fast multi-scale community detection based on local criteria within a multi-threaded algorithm. arXiv preprint arXiv:1301.0955, 2013.
[5] Mohammad Norouzi and David J. Fleet. Cartesian k-means. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3017–3024. IEEE, 2013.
[6] Michael G. Rabbat and Vincent Gripon. Towards a spectral characterization of signals supported on small-world networks. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4793–4797. IEEE, 2014.
[7] Jörg Reichardt and Stefan Bornholdt. Statistical mechanics of community detection. Physical Review E, 74(1):016110, 2006.
[8] David I. Shuman, Sunil K. Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83–98, 2013.
[9] David I. Shuman, Pierre Vandergheynst, and Pascal Frossard. Distributed signal processing via Chebyshev polynomial approximation. arXiv preprint arXiv:1111.5239, 2011.
[10] Daniel A. Spielman. Spectral graph theory and its applications. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 29–38. IEEE, 2007.
[11] M. van Breukelen, Robert P. W. Duin, David M. J. Tax, and J. E. den Hartog. Handwritten digit recognition by combined classifiers. Kybernetika, 34(4):381–386, 1998.