IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 9, Issue 3 (Mar. - Apr. 2013), PP 16-21
www.iosrjournals.org
A hybrid algorithm based on KFCM-HACO-FAPSO for
clustering ECG beat
J. Mercy Geraldine¹, P. Kiruthiga²
¹Head of the Department, Department of CSE, Srinivasan Engineering College, Perambalur, Tamilnadu, India
²M.E. Student, Department of CSE, Srinivasan Engineering College, Perambalur, Tamilnadu, India
Abstract: Data clustering is an essential technique for web applications and organizations. However, clustering performance has to be optimized to form usable and efficient data clusters. Many optimization methods have been suggested to improve the clustering performance of fuzzy c-means clustering, among them the FAPSO and HACO optimization techniques. However, the conventional methods suffer from various limitations, such as being trapped in local minima and lacking prior knowledge of the optimum parameters of the kernel functions. The kernelized fuzzy c-means algorithm uses kernel methods to improve the clustering performance of the well-known fuzzy c-means algorithm. It does so by mapping the given dataset non-linearly into a higher dimensional space, where the mapped data are more likely to be linearly separable. The proposed method is applied to data extracted from the MIT–BIH arrhythmia database: domain features are extracted for each beat type, and training and test sets are formed. The algorithm can be used in various applications, such as web applications and the classification of ECG records.
Keywords - Fuzzy c-means, Kernelized fuzzy c-means, Ant colony optimization, particle swarm optimization.
I. Introduction
Data clustering is the process of grouping data into classes or clusters such that the data in each cluster share a high degree of similarity while being very dissimilar to data from other clusters. ACO and PSO are among the modern evolutionary algorithms. Pattern recognition methods are useful in mining a dataset, and they can be categorized into two groups according to the learning procedure. Supervised learning requires prior labeling of the training data to create a model of the given dataset. A supervised learning algorithm analyses the training dataset and creates an output. This output is then compared with the desired output (label), and an error or feedback signal is created. The algorithm then updates itself according to this feedback signal in order to create a model of the given dataset. Once the algorithm has terminated, the obtained model should generalize the training data such that an unknown input pattern given to the model is classified correctly. Unsupervised learning, in contrast, does not need prior labeling. It creates clusters from a given dataset according to a similarity measure, which is usually a distance function. After the clustering process, similar patterns are grouped in the same cluster and dissimilar patterns are grouped in different clusters.
In ECG classification algorithms, a similarity measure is used to measure the distances between the query beat and the templates in the database. The smaller the distance, the more similar the template is to the query. However, in many cases a data point lies at a location which is almost equally distant from more than one cluster center. In such a situation, fuzzy clustering methods can prevent the misclassification of a query beat by using the membership values of the data points in each cluster. Based on these considerations, a number of studies which use fuzzy clustering algorithms for ECG beat classification have recently been proposed. One of these fuzzy clustering algorithms is the fuzzy c-means (FCM) algorithm. After the clustering process, the FCM algorithm gives two outputs, namely the cluster centers and a fuzzy partition matrix which contains the membership values of each data point in these clusters. The obtained cluster centers or the membership values are then used for classification.
A popular clustering algorithm is the hard c-means algorithm, which assigns each data point in a given dataset to exactly one cluster. Such an assignment can be inadequate because some data points lie at a location which is almost equally distant from two or more cluster centers. By forcing such a point into exactly one cluster, the similarity of this point to the other clusters is totally ignored. For this reason, fuzzy clustering methods have been proposed. In fuzzy clustering methods a data point can belong to more than one cluster with different degrees of membership, which is especially useful when the clusters overlap.
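The idea of graded membership can be illustrated with the standard FCM membership formula. The following is a minimal sketch, not the authors' implementation; the function name `fcm_memberships` and the toy cluster centers are assumptions made for illustration, and the fuzzifier is set to m = 2.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Return the fuzzy membership of `point` in each cluster (values sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # Standard FCM rule: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

centers = [(0.0, 0.0), (4.0, 0.0)]          # hypothetical cluster centers
print(fcm_memberships((2.0, 0.0), centers))  # equidistant point: equal memberships
print(fcm_memberships((1.0, 0.0), centers))  # closer to the first center
```

A hard assignment would have to pick one cluster for the equidistant point, whereas the fuzzy memberships record that it is equally similar to both.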
The fuzzy c-means algorithm has been used efficiently in several clustering problems, but it has two major drawbacks which reduce its performance and which motivate the use of optimization algorithms. Firstly, it is sensitive to the initial values of the cluster centers, and secondly, it can easily be trapped in local minima. Therefore, several extensions of the FCM algorithm have been proposed to improve its performance. One of
these algorithms is called the kernelized fuzzy c-means algorithm (KFCM) which uses kernel methods to
improve the clustering performance of the fuzzy c-means algorithm.
In KFCM a given dataset is mapped to a higher dimensional space non-linearly by a kernel function.
Thus, the newly obtained dataset is more likely to be linearly separable. However, the KFCM algorithm also has
the above-mentioned drawbacks. Additionally, there is no prior knowledge about the optimum parameters of the
kernel functions.
Paper Organization
This paper is organized as follows: Section 2 reviews related work. Section 3 introduces the proposed method. Section 4 presents clustering with hybrid ant colony optimization and fuzzy adaptive particle swarm optimization. Section 5 presents the experiments and the efficiency evaluation. Section 6 gives conclusions and directions for future work.
II. Related Work
2.1 Fuzzy c-means algorithm
Fuzzy clustering is an important problem and the subject of active research in several real-world applications. The most popular fuzzy clustering technique is fuzzy c-means, because it is efficient, straightforward, and easy to implement [4]. However, the fuzzy c-means algorithm is sensitive to initialization and can easily be trapped in local optima. It has later been combined with colony optimization techniques to obtain more efficient solutions. Particle swarm optimization (PSO) is a global optimization tool which is used in many optimization problems [10]. A hybrid fuzzy clustering method based on FCM and fuzzy PSO (FPSO) has been proposed which makes use of the qualities of both algorithms. A fuzzy method is used to determine the velocity vector of each particle, and the data are then clustered. K-means is one of the most popular hard clustering algorithms; it partitions data objects into k clusters, where the number of clusters, k, is decided in advance according to the application. This model is unsuitable for real data sets in which there are no definite boundaries between the clusters [2].
Kernelized fuzzy c-means has been applied to noisy image segmentation, where it separates the noise from the image. Segmentation divides the image into appropriate points and then clusters the corresponding data into groups.
2.2 Colony Optimization
Particle swarm optimization has been combined with c-means clustering and ant colony optimization; it is easy to implement and can be applied to various function optimization problems [9].
Fuzzy c-means clustering is an effective algorithm, but because it selects the initial center points randomly, its iterative process easily converges to a local optimum [12]. The algorithmic flow of PSO starts with a population of particles whose positions represent potential solutions to the problem and whose velocities are randomly initialized in the search space [3].
At each iteration, the search for the optimal position proceeds by updating the particle velocities and positions, and the fitness value of each particle’s position is determined using a fitness function.
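The velocity and position updates are not written out in the text. In the commonly used inertia-weight form of PSO they read as follows. The symbols are notation assumed here rather than taken from the paper: w is the inertia weight, c_1 and c_2 are the learning factors, r_1 and r_2 are uniform random numbers in [0, 1], p_i is the best position found so far by particle i, and g is the best position found in its neighborhood.

$$v_i(t+1) = w\,v_i(t) + c_1 r_1 \bigl(p_i - x_i(t)\bigr) + c_2 r_2 \bigl(g - x_i(t)\bigr)$$

$$x_i(t+1) = x_i(t) + v_i(t+1)$$

A larger w keeps particles moving along their previous direction (exploration), while larger c_1 and c_2 pull them more strongly toward known good positions (exploitation).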
Extensions of ant colony optimization (ACO) to continuous domains are used. ACO, which was initially developed as a metaheuristic for combinatorial optimization, can be adapted to continuous optimization without any major conceptual change to its structure [6]. The extended ACO has been compared with other algorithms for continuous optimization, together with an analysis of its efficiency and robustness.
Optimization algorithms inspired by the foraging behavior of ants were initially proposed for solving combinatorial optimization problems (COPs) [8]. Many of these problems, especially those of practical relevance, are NP-hard. In other words, it is strongly believed that no efficient algorithm exists to solve them optimally.
Combinatorial optimization deals with finding optimal combinations of available problem components. Hence, the problem must be partitioned into a finite set of components, and the combinatorial optimization algorithm attempts to find their optimal combination or permutation. Many real-world optimization problems can be represented as COPs in a straightforward way [12].
Continuous optimization is hardly a new research field. Numerous algorithms, including metaheuristics, have been developed for solving this type of problem [5]. In order to put the performance of ACOR (ACO extended to continuous domains) into perspective, it is compared not only with other ant-related methods, but also with other metaheuristics used for continuous optimization [7].
III. The Proposed Method
3.1 Overview of the proposed system
The datasets from the MIT–BIH arrhythmia database are encoded. The dataset is preprocessed and normalized, and features are then extracted from it. The extracted features are divided into training sets and test sets. The training sets are used to find the optimum cluster centers and membership degrees using the KFCM, HACO and FAPSO algorithms. FAPSO is used to find the fitness values from which the cluster centers are calculated, and the weight vector is evaluated. The cluster centers are then used to classify the test sets, as shown in Figure 1.
3.2 Algorithm used
3.2.1 Kernelized fuzzy c-means algorithm
The kernelized fuzzy c-means algorithm uses the kernel method for clustering. The kernel method was first implemented in the support vector machine. The kernel-based clustering concept is used to process the data sets using the Gaussian kernel function.
The search space S is a set of continuous variables Xi, i = 1, ..., n. A solution s ∈ S in which each variable has a value assigned and which satisfies all the constraints in the set is a feasible solution of the continuous optimization problem.
The Gaussian kernel function maps the dataset non-linearly. In the kernelized fuzzy c-means clustering algorithm, the kernel is used to restate the distance function of the fuzzy c-means algorithm.
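How a Gaussian kernel restates the distance can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the kernel is taken in the form exp(-||x - v||² / (2σ²)) (some KFCM papers omit the factor 2), and the function names are hypothetical.

```python
import math

def gaussian_kernel(x, v, sigma):
    """K(x, v) = exp(-||x - v||^2 / (2 * sigma^2)); the 2*sigma^2 scaling is an assumption."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_distance_sq(x, v, sigma):
    """Squared distance in the kernel-induced feature space.

    ||phi(x) - phi(v)||^2 = K(x, x) + K(v, v) - 2 K(x, v), and K(x, x) = 1
    for the Gaussian kernel, so the expression reduces to 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

beat = (0.2, 0.9, 0.4)       # hypothetical feature vector
center = (0.1, 0.8, 0.5)     # hypothetical cluster center
print(kernel_distance_sq(beat, center, sigma=2.4))
```

The kernel distance is bounded by 2, and σ controls how quickly it saturates, which is why the choice of σ affects the clustering result.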
3.2.2 Hybrid ant colony optimization
Ant colony optimization can be used for clustering data; however, the original algorithm cannot process continuous variables. To overcome this problem, hybrid versions of ant colony optimization have been proposed.
The hybrid ant colony optimization technique handles continuous variables by using the Gaussian kernel function in the search space.
The search space S is a set of continuous variables Xi, i = 1, ..., n. A solution s ∈ S in which each variable has a value assigned and which satisfies all the constraints in the set Ω is a feasible solution of the given continuous optimization problem (CnOP). A solution s* ∈ S is called a global optimum if and only if f(s*) ≤ f(s) for all s ∈ S. The set of all globally optimal solutions is denoted S* ⊆ S, and solving a CnOP requires finding at least one s* ∈ S*.
3.2.3 Fuzzy Adaptive Particle Swarm Optimization
The algorithm works as follows:
a) When the best fitness is found at the end of the run, a low inertia weight and high learning factors are often preferred.
b) When the best fitness stays at one value for a long time, the number of generations with unchanged best fitness is large. In this case the inertia weight should be increased and the learning factors should be decreased.
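The two rules above can be sketched as a crisp (non-fuzzy) parameter update. This is only a simplified stand-in for a fuzzy rule base: the function name, the thresholds `patience` and `late_fraction`, the step size and the parameter bounds are all hypothetical values chosen for illustration, not taken from the paper.

```python
def adapt_parameters(w, c, stagnant_generations, progress,
                     patience=5, late_fraction=0.8, step=0.05,
                     w_bounds=(0.4, 0.9), c_bounds=(1.0, 2.0)):
    """Return the adapted (inertia weight, learning factor).

    stagnant_generations: generations since the best fitness last changed.
    progress: fraction of the run completed, between 0 and 1.
    """
    if stagnant_generations >= patience:
        # Rule b: best fitness unchanged for long -> raise inertia, lower learning.
        w, c = w + step, c - step
    elif progress >= late_fraction:
        # Rule a: near the end of the run -> lower inertia, higher learning.
        w, c = w - step, c + step
    # Keep both parameters inside their (assumed) admissible ranges.
    w = min(max(w, w_bounds[0]), w_bounds[1])
    c = min(max(c, c_bounds[0]), c_bounds[1])
    return w, c

print(adapt_parameters(0.7, 1.5, stagnant_generations=6, progress=0.2))
print(adapt_parameters(0.7, 1.5, stagnant_generations=0, progress=0.9))
```

A fuzzy version would replace the hard thresholds with membership functions over the same inputs, so the parameters change gradually instead of in fixed steps.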
In the FAPSO algorithm each particle searches for the optimum value and moves through the search space, so it has a velocity. Each particle remembers the position where it had its best result. A particle has a neighborhood associated with it; it knows the fitness of the particles in its neighborhood and uses the position of the one with the best fitness. This position is used to adjust the particle’s velocity. In this way the cluster centers are found and the fitness values are obtained.
The parameters used in these algorithms are k (size of the solution archive), q (locality of the search process), ξ (convergence speed), N(µ; σ) (Gaussian function with mean µ and standard deviation σ), α (learning rate), and F (differential evolution coefficient, chosen randomly). The ECG datasets are encoded and then preprocessed to reduce noise. Each beat is then normalized to 128 points, and features are extracted. The extracted features are divided into two groups to form the training and test sets. Using the training set and the proposed method, the optimum cluster centers and the corresponding membership degrees are found. These cluster centers and membership degrees are then used to classify ECG beats.
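The normalization of a beat to 128 points can be sketched with linear interpolation. This is a minimal illustration under assumptions: the paper does not state the interpolation method, and the additional scaling of the amplitude to [0, 1] as well as the name `resample_beat` are choices made here.

```python
def resample_beat(samples, n_points=128):
    """Linearly interpolate a beat to `n_points` samples and scale it to [0, 1]."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    last = len(samples) - 1
    resampled = []
    for j in range(n_points):
        pos = j * last / (n_points - 1)   # fractional index into the original beat
        lo = min(int(pos), last - 1)      # left neighbor, clamped so lo + 1 is valid
        frac = pos - lo
        resampled.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    low, high = min(resampled), max(resampled)
    span = (high - low) or 1.0            # avoid division by zero on a flat beat
    return [(value - low) / span for value in resampled]

beat = [0.0, 0.1, 1.2, -0.3, 0.05, 0.0]   # hypothetical raw beat samples
fixed = resample_beat(beat)
print(len(fixed), min(fixed), max(fixed))
```

Bringing every beat to the same length lets beats recorded at different heart rates be compared sample by sample.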
Figure 1 Architecture of the proposed method.
3.3 Functional procedure
Step 1. Generate the solutions.
Step 2. For each particle, calculate the fitness value.
Step 3. If the fitness value is better than the best fitness value so far,
Step 4. set the current value as the new best value.
Step 5. Find, in the particle neighborhood, the particle with the best fitness.
Step 6. Calculate the particle velocity according to the velocity equation.
Step 7. Apply the velocity constriction.
Step 8. Update the particle position according to the position equation.
Step 9. Sort the solutions according to their fitness in descending order.
Step 10. Calculate the weight vector ω.
Step 11. Compute the σ values for the selected solutions.
Step 12. Generate new solutions from the selected solutions by using the computed σ values, and replace the old solutions with the new ones.
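Steps 1 to 8 of the procedure above correspond to a particle swarm loop, which can be sketched as follows. This is a generic PSO with a constriction factor, not the authors' FAPSO: the neighborhood is taken to be the whole swarm, fitness is minimized, and the values χ = 0.7298 and c1 = c2 = 2.05 are the commonly used constriction settings, all of which are assumptions here.

```python
import random

def pso_minimize(fitness, dim, n_particles=20, iterations=100,
                 bounds=(-5.0, 5.0), chi=0.7298, c1=2.05, c2=2.05, seed=0):
    """Minimize `fitness` with a constricted particle swarm; returns (position, value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: generate random candidate solutions with zero initial velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]              # Step 2
    for _ in range(iterations):
        # Step 5: best particle in the neighborhood (here: the whole swarm).
        g = best_pos[min(range(n_particles), key=best_fit.__getitem__)][:]
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Steps 6-7: velocity update scaled by the constriction factor chi.
                vel[i][d] = chi * (vel[i][d]
                                   + c1 * r1 * (best_pos[i][d] - pos[i][d])
                                   + c2 * r2 * (g[d] - pos[i][d]))
                # Step 8: position update, kept inside the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            fit = fitness(pos[i])                     # Step 2
            if fit < best_fit[i]:                     # Steps 3-4 (lower is better)
                best_fit[i], best_pos[i] = fit, pos[i][:]
    k = min(range(n_particles), key=best_fit.__getitem__)
    return best_pos[k], best_fit[k]

# Toy objective standing in for the clustering objective function.
print(pso_minimize(lambda p: sum(x * x for x in p), dim=2))
```

In the proposed method the position vector would hold cluster centers and σ values, and the fitness would be the KFCM objective instead of the toy function used here.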
IV. Clustering With HACO and FAPSO
In order to meaningfully restrict the number of queries that are similar to each other, one alternative is to cluster the queries in the workload based on query similarity. This can be done using a simple K-means clustering method. Using K-means, we cluster m queries into K clusters based on a predefined K and a predefined number of iterations.
In this paper the HACO-based kernelized fuzzy c-means algorithm is proposed. The solution set is initially encoded into a higher dimensional space. The solutions are encoded in Gaussian functions G1, ..., Gn, where n is the dimension of the problem. After the solutions are added to the archive, they are sorted according to their fitness values. Then the weight vector is calculated as
$$\omega_l = \frac{1}{qk\sqrt{2\pi}}\, e^{-\frac{(l-1)^2}{2q^2k^2}} \qquad (1)$$
This is the value of the Gaussian function with argument l, mean 1.0 and standard deviation qk.
To update the solution, the Gaussian functions should be chosen according to the weight vector obtained from equation (1). The chosen Gaussian functions are then used to update the solution set.
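The weight vector of equation (1) and the weight-based choice can be sketched as follows. The weights follow equation (1) directly; choosing a rank with probability proportional to its weight follows the ACO for continuous domains of [8] and is an assumption about the proposed method. The function names are hypothetical.

```python
import math
import random

def archive_weights(k, q):
    """Equation (1): Gaussian weight of the solution ranked l = 1..k (rank 1 is best)."""
    sd = q * k
    return [math.exp(-((l - 1) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))
            for l in range(1, k + 1)]

def pick_rank(weights, rng=random):
    """Choose a rank with probability proportional to its weight (roulette wheel)."""
    return rng.choices(range(1, len(weights) + 1), weights=weights, k=1)[0]

weights = archive_weights(k=5, q=0.5)   # hypothetical archive size and locality
print(weights)
print(pick_rank(weights, random.Random(0)))
```

A small q concentrates the weight on the best-ranked solutions, while a large q makes the choice almost uniform across the archive.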
The σ values should also be encoded to find the optimum values; they are encoded using equation (2). The optimum values are generated, and the fitness is computed using the weight vector. The resulting solution set is represented in Table 1.
$$\{c_{i1}, \ldots, c_{ik}\} = \{s_{i1}, \ldots, s_{ik}\} \qquad (2)$$
TABLE 1
Solution set
S11   .....   S1i   .....   S1n
S21   .....   S2i   .....   S2n
 .             .             .
 .             .             .
Si1   .....   Sii   .....   Sin
 .             .             .
 .             .             .
Sk1   .....   Ski   .....   Skn
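The text does not spell out how the σ values are computed from the solution set. In the ACO for continuous domains of Socha and Dorigo [8], on which a solution archive of this kind is based, the standard deviation for variable i of the chosen solution l is the average distance to the other archive members, scaled by the convergence speed ξ. Assuming the proposed method follows that scheme, and using the row and column indices of Table 1 (row l is a solution, column i is a variable), this would read:

$$\sigma_{li} = \xi \sum_{e=1}^{k} \frac{\lvert s_{ei} - s_{li} \rvert}{k-1}$$

A new value for variable i is then sampled from a Gaussian with mean s_{li} and standard deviation σ_{li}, so the search narrows automatically as the archive members move closer together.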
To find the fitness value the FAPSO algorithm is used. The values needed for clustering are evaluated and updated over the iterations. Finally, the outputs are compared and the best-fitting σ value is obtained.
For the given training set the HACO algorithm is initialized. At each iteration the fitness of each solution is evaluated, and after a certain number of iterations the optimum cluster centers and σ values are found.
The weight vectors are then calculated. The distances between the cluster centers and the output of the algorithm are found. This is performed using equation (3).
$$W = (U^{T}U)^{-1}\,U^{T}\,T \qquad (3)$$
where U represents the fuzzy partition matrix and T represents the target output matrix.
Using this weight matrix, the test sets are classified. The classification is performed on the fuzzy partition matrix, and its output is used by the classifier.
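Equation (3) is the least-squares solution of U W = T, which can be sketched without any external library. This is an illustration under assumptions: the rows of U are taken to be the membership vectors of the training beats, T is one-hot encoded, the predicted class is the index of the largest output, and all function names and numbers are hypothetical.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("U^T U is singular; the weights are not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_weights(u, t):
    """W = (U^T U)^{-1} U^T T, computed without forming the inverse explicitly."""
    ut = transpose(u)
    return solve(matmul(ut, u), matmul(ut, t))

def classify(u_row, w):
    """Predicted class = index of the largest output for one membership row."""
    scores = matmul([u_row], w)[0]
    return max(range(len(scores)), key=scores.__getitem__)

U = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]   # memberships of 4 training beats
T = [[1, 0], [1, 0], [0, 1], [0, 1]]                   # one-hot class targets
W = fit_weights(U, T)
print(W)
print(classify([0.7, 0.3], W), classify([0.3, 0.7], W))
```

Solving the normal equations directly is adequate for a handful of clusters; for larger or ill-conditioned U a pseudo-inverse or QR-based least-squares routine would be the more robust choice.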
V. Experimental Evaluation
We evaluated each model (HACO and FAPSO, KFCM) in isolation, and then compared these models with the combined model for efficiency. We also evaluated the efficiency of our clustering framework.
For the considered training set, the algorithm is initialized with the parameters mentioned above, and the optimum cluster centers and σ values are found. Using the obtained cluster centers and σ values, the weights for the classification stage of the proposed system are computed. The classification performance of the proposed system is then tested on the test samples. Several experiments are performed for given numbers of clusters. Classification results for the FCM and KFCM algorithms are obtained for 6, 10, 15, 20, 25, 30, 35 and 40 clusters through FAPSO. The FAPSO algorithm finds the fitness value over several iterations by calculating the particle velocity in the particle neighborhood.
Finally, after the velocities are found, the σ value is updated. For the Gaussian kernels of the KFCM algorithm, σ is kept constant within each experiment, and several experiments are performed with different σ values to obtain the optimum value. Several runs are also performed for different values of the fuzzifier exponent (m) to determine its optimum value. After experimenting with several values, m = 2 is chosen for the FCM and KFCM algorithms. All reported results are averages over ten experiments. The results show that the KFCM algorithm with 15 clusters and σ = 2.4 is superior to the FCM algorithm with a total of 20 clusters. Another set of experiments is performed with σ fixed at 1.2 and at 4.0 for all clusters, again searching for the optimum centers; in these cases the classification performance of the proposed system decreases. The classification performance of the KFCM algorithm thus depends strongly on the σ value. If there are not enough clusters, choosing small σ values decreases the classification performance of the KFCM algorithm, so in this case it is necessary to increase σ to cover enough area in the feature space. In contrast, if there are enough clusters, selecting large σ values decreases the classification performance. The obtained results confirm this evaluation. Figure 2 shows the performance analysis of the clustering. Therefore, the HACO-based KFCM together with FAPSO performs better than the traditional algorithm.
Ideally, we would have preferred to compare our approach against existing clustering schemes in databases. However, what has been addressed in the literature is clustering with the fuzzy c-means algorithm. Hence, we have compared the proposed clustering method with it to indicate the effectiveness of each method with respect to the other.
5.1 Efficiency Evaluation
The goal of this study was to determine whether our framework can be incorporated into a real-world application. The ECG beats are classified through the proposed method and the efficiency is evaluated.
Figure 2 Performance analysis of clustering.
The cluster centers and memberships are processed and efficient values are chosen. The efficiency of the conventional method and of the proposed method is compared. The proposed method gives better sensitivity of the cluster centers than the other methods. Clustering with the obtained values gives a better cluster partition. Likewise, all other datasets are clustered with the obtained cluster centers and membership values. Figure 2 presents the efficiency of fuzzy c-means compared with hybrid ant colony optimization, where the hybrid ant colony optimization is based on the kernelized fuzzy c-means algorithm.
VI. Conclusion
In this paper, an optimization method is used to improve the clustering performance of the kernelized fuzzy c-means algorithm. We proposed a combination of two different algorithms, namely hybrid ant colony optimization (HACO) and fuzzy adaptive particle swarm optimization (FAPSO). HACO optimizes both the kernel function parameter and the cluster centers. The proposed algorithm outputs the optimized set of cluster centers which minimizes the objective function of the traditional KFCM algorithm, and it can find application in areas such as web applications and the classification of ECG records. In future work, a neural network may achieve better performance than the other algorithms.
References
[1] Berat Dogan, M. Korurek, “A new ECG beat clustering method based on kernelized fuzzy c-means and hybrid ant colony optimization for continuous domain”, Applied Soft Computing, 2012, pp 3442–3451.
[2] B. Biswal, P.K. Dash, S. Mishra, “A hybrid ant colony optimization technique for power signal pattern classification”, Expert Systems with Applications, May 2011, pp 6368–6375.
[3] Dao-Qiang Zhang, Song-Can Chen, “A novel kernelized fuzzy C-means algorithm with application in medical image segmentation”, Artificial Intelligence in Medicine, September 2004, pp 37–50.
[4] Hesam Izakian, Ajith Abraham, “Fuzzy C-means and fuzzy swarm for fuzzy clustering problem”, Expert Systems with Applications 38 (March (3)) (2011), pp 1835–1838.
[5] T. Ince, S. Kiranyaz, M. Gabbouj, “A generic and robust system for automated patient-specific classification of ECG signals”, IEEE Transactions on Biomedical Engineering 56 (May (5)) (2009), pp 1415–1426.
[6] Jing Xiao, LiangPing Li, “A hybrid ant colony optimization for continuous domains”, Expert Systems with Applications, 2011, pp 11072–11077.
[7] Julia Handl, Bernd Meyer, “Ant-based and swarm-based clustering”, Swarm Intelligence 1 (2) (2007), pp 95–113.
[8] Krzysztof Socha, Marco Dorigo, “Ant colony optimization for continuous domains”, European Journal of Operational Research, 2008, pp 1155–1173.
[9] T. Niknam, B. Amiri, “An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis”, Applied Soft Computing 10 (January (1)) (2010), pp 183–197.
[10] Qiang Niu, Xinjian Huang, “An improved fuzzy C-means clustering algorithm based on PSO”, Journal of Software 6 (May (5)) (2011), pp 873–879.
[11] T.A. Runkler, C. Katz, “Fuzzy clustering by particle swarm optimization”, IEEE International Conference on Fuzzy Systems, 2006, pp 601–608.
[12] L. Wang, Y. Liu, X. Zhao, Y. Xu, “Particle swarm optimization for fuzzy c-means clustering”, The Sixth World Congress on Intelligent Control and Automation (WCICA 2006), vol. 2, 2006, pp 6055–6058.
[13] Yanfang Han, Pengfei Shi, “An improved ant colony algorithm for fuzzy clustering in image segmentation”, Neurocomputing, 2007, pp 665–671.
[14] Yun-Chi Yeh, Wen-June Wang, Che Wun Chiou, “A novel fuzzy c-means method for classifying heartbeat cases from ECG signals”, Measurement 43 (December (10)) (2010), pp 1542–1555.
[15] Taher Niknam, Bahman Bahmani Firouzi, Majid Nayeripour, “An efficient hybrid evolutionary algorithm for cluster analysis”, World Applied Sciences Journal 4 (2), 2008, pp 300–307.