International Journal of Automation and Computing 12(5), October 2015, 511-517
DOI: 10.1007/s11633-014-0859-5
An Unsupervised Feature Selection Algorithm with
Feature Ranking for Maximizing Performance of
the Classifiers
Danasingh Asir Antony Gnana Singh1
Subramanian Appavu Alias Balamurugan2
Epiphany Jebamalar Leavline3
1 Department of Computer Science and Engineering, Bharathidasan Institute of Technology, Anna University, Tiruchirappalli 620024, India
2 Department of Information Technology, K.L.N College of Information Technology, Sivagangai 630611, India
3 Department of Electronics and Communication Engineering, Bharathidasan Institute of Technology, Anna University, Tiruchirappalli 620024, India
Abstract: Prediction plays a vital role in decision making. Correct prediction leads to right decisions that save lives, energy, effort, money and time. The right decision prevents physical and material losses, and prediction is practiced in all fields including medicine, finance, environmental studies, engineering and emerging technologies. Prediction is carried out by a model called a classifier. The predictive accuracy of the classifier highly depends on the training dataset used to train it. Irrelevant and redundant features in the training dataset reduce the accuracy of the classifier; hence, they must be removed from the training dataset through the process known as feature selection. This paper proposes a feature selection algorithm, namely unsupervised learning with ranking based feature selection (FSULR). It removes redundant features by clustering and eliminates irrelevant features by statistical measures to select the most significant features from the training dataset. The performance of the proposed algorithm is compared with the other seven feature selection algorithms using the well known classifiers naive Bayes (NB), instance based (IB1) and tree based J48. Experimental results show that the proposed algorithm yields better prediction accuracy for the classifiers.
Keywords: Feature selection algorithm, classification, clustering, prediction, feature ranking, predictive model.
1 Introduction
Prediction plays a vital role in decision making. This is done by the predictive model that is known as a classifier or supervised learner[1]. This predictive model is employed to predict unseen and unknown data in various applications[2]. Improving the predictive accuracy is a major challenge among researchers. The accuracy of the classifier highly relies on the training dataset which is used to train the predictive model. The irrelevant and redundant features of the training dataset reduce the predictive accuracy[3]. Hence, irrelevant and redundant features must be eliminated from the training dataset for improving the accuracy of the predictive model. The performance improvement of the predictive model includes improving the predictive accuracy, reducing the time taken to build the predictive model and reducing the number of features present in the training dataset[3]. Selecting significant features from the training dataset for improving the performance of the supervised or unsupervised learner is known as feature selection. Feature selection methods are classified as wrapper, filter, embedded and hybrid methods[4].
Regular Paper
Manuscript received November 4, 2013; accepted April 3, 2014
Recommended by Editor-in-Chief Huo-Sheng Hu
© Institute of Automation, Chinese Academy of Science and Springer-Verlag Berlin Heidelberg 2015
The filter method selects the important features and removes the redundant and irrelevant features from the given dataset using a mathematical statistical measure. It performs well regardless of the classifier chosen; therefore it can work for any type of classification algorithm[5, 6].
The embedded method uses the training phase of the supervised learning algorithm for selecting the important features from the training dataset. Therefore, its performance is based on the classifier used[7].
The wrapper method uses the classifier to determine the important features available in the dataset[8]. The combination of the filter and wrapper methods is known as the hybrid method[4, 9].
Many feature selection algorithms have been proposed recently, which concentrate only on removing the irrelevant features. They do not deal with removing the redundant features from the training dataset. These redundant features reduce the accuracy of the classifiers. This paper proposes a filter based feature selection algorithm named unsupervised learning with ranking based feature selection (FSULR). This algorithm selects the most significant features from the dataset and removes the redundant and irrelevant features. The unsupervised learner learns the training dataset using the expectation maximization function and groups the features into various clusters, and these clustered features are ranked based on the χ2 measure within each cluster. The significant features from each cluster are chosen based on the threshold function.
The remainder of this paper is organized as follows. The relevant literature is reviewed in Section 2. Section 3 discusses the proposed feature selection algorithm. Sections 4 and 5 describe the experimental setup and discuss the experimental results. Conclusions are drawn in Section 6 along with future enhancements of this work.
2 Related work
This section discusses various types of feature selection,
cluster and classification algorithms as a part of the related
works of this proposed algorithm.
2.1 Feature selection algorithm
The feature selection algorithm selects the most significant features from the dataset using ranking based, subset based and unsupervised based techniques. In the ranking based technique, the individual features “fi” are ranked by applying one of the mathematical measures, such as information gain or gain ratio, on the training dataset “TD”. The ranked features are selected as significant features for the learning algorithm by a threshold value “TV ” calculated by the threshold function. In the subset based technique, the features of the training dataset are separated into the maximum number of possible feature subsets “S” and each subset is evaluated by an evaluation criterion to identify the significance of the features present in the subset. The subset containing the most significant features is considered as the selected candidate feature subset. In the unsupervised based technique, cluster analysis is carried out to identify the significant features from the training dataset[10−12].
2.1.1 Feature selection based on correlation (FSCor)
In this feature subset selection, the entire feature set
F = {f1, f2, · · · , fx} of a training dataset “TD” is sub di-
vided into feature subsets “FSi”. Then, two types of the
correlation measures are calculated on each feature subset
“FSi”. One is the feature-feature correlation that is the
correlation measure among the features present in a fea-
ture subset“FSi”, and the other one is feature-class cor-
relation that is the correlation measure between the indi-
vidual feature and the class value of the training dataset.
These two correlation measures are computed for all the in-
dividual feature subsets of the training dataset. The signif-
icant feature subset is identified based on the comparison
between the feature-feature correlation and feature−class
correlation. If the feature−class correlation value is higher
than the feature−feature correlation value, the correspond-
ing feature subset is selected as a significant feature subset
of the training dataset[13].
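The comparison between feature−class and feature−feature correlation can be sketched in Python as follows. This is an illustrative fragment only, not the authors' implementation and not Hall's exact CFS merit formula; the function name cfs_like_merit and the use of absolute Pearson correlation for continuous-valued features are assumptions.
import numpy as np

def cfs_like_merit(X_subset, y):
    """Compare the average feature-class correlation against the average
    feature-feature correlation for one candidate subset (illustrative only)."""
    k = X_subset.shape[1]
    # feature-class correlation: absolute Pearson correlation with the class labels
    rcf = np.mean([abs(np.corrcoef(X_subset[:, j], y)[0, 1]) for j in range(k)])
    # feature-feature correlation, averaged over all feature pairs in the subset
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    rff = np.mean([abs(np.corrcoef(X_subset[:, i], X_subset[:, j])[0, 1])
                   for i, j in pairs]) if pairs else 0.0
    return rcf, rff  # the subset is kept when rcf exceeds rff
A subset with high feature−class correlation and low feature−feature correlation would be retained under the rule described above.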
2.1.2 Feature selection based on consistency measure (FSCon)
In this feature subset selection, entire feature set “F”
of the training dataset “TD” is subdivided into maximum
possible combinations of feature subsets “FSi” and the con-
sistency measure is calculated for each subset to identify the
significant feature subset. The consistency measure ensures
that the same tuple “Ti” is not present in a different class
“Ci”[5, 14, 15].
2.1.3 Feature selection based on χ2 (FSChi)
This is a ranking based feature selection technique. The χ2 statistical measure is applied on the training dataset “TD” to identify the significance level of each feature present in the training dataset. The χ2 value “CV ” is computed as the sum of the ratios of the squared difference between the observed (oij) and expected (eij) frequencies of the features fi, fj to the expected frequencies (eij), taken over the possible instance value combinations of the features[16].
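As a hedged illustration of χ2-based ranking (not the paper's own code), scikit-learn's chi2 scorer computes such a statistic per feature against the class labels; it assumes numeric, non-negative feature values, and the iris data below is only a placeholder.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)            # non-negative features, as chi2 requires
selector = SelectKBest(score_func=chi2, k=2) # keep the 2 highest-ranked features
X_selected = selector.fit_transform(X, y)
print(selector.scores_)                      # per-feature chi-square scores (the "CV" ranking)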
2.1.4 Feature selection based on information gain (FSInfo)
In this feature selection technique, the information gain measure is applied on the training dataset to identify the significant features based on the information gain value of the individual features[10] in terms of entropy. The entropy value of each feature of the training dataset “TD” is calculated and the features are ranked based on the information gain value[17].
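For continuous data, mutual information is a close stand-in for the entropy-based information gain described above; the following scikit-learn sketch is illustrative only and is not the implementation evaluated in the paper.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif

X, y = load_iris(return_X_y=True)
gain = mutual_info_classif(X, y, random_state=0)  # information-gain-style score per feature
ranking = np.argsort(gain)[::-1]                  # most informative features first
print(ranking, gain[ranking].round(3))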
2.1.5 Feature selection based on gain ratio (FSGai-ra)
In this feature selection technique, the information gain ratio GR(f) is calculated for each feature of the training dataset “TD” to identify the significant feature based on the information present in the features of the “TD”[17].
2.1.6 ReliefF
This feature selection technique selects the significant
features from the training dataset “TD” based on the
weighted probabilistic function w(f) with nearest neighbor
principle. If the nearest neighbors of a tuple “T” belong to
the same class, it is termed as “nearest hit”. If the nearest
neighbors of a tuple “T” belong to a different class, it is
termed as “nearest miss”. The probabilistic weight func-
tion value w(f) is calculated based on “nearest hit” and
“nearest miss” values[18−20].
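A minimal sketch of the Relief-style weight update is given below, using a single nearest hit and a single nearest miss per sampled tuple. The sampling scheme, Manhattan distance and per-feature scaling are assumptions of this sketch; the full ReliefF of [18] averages over k neighbors per class.
import numpy as np

def relief_weights(X, y, n_iterations=100, seed=0):
    """Estimate a Relief-style weight w(f) per feature from nearest hits and misses."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    span = X.max(axis=0) - X.min(axis=0)      # scale per-feature differences to [0, 1]
    span[span == 0] = 1.0
    for _ in range(n_iterations):
        i = rng.integers(n_samples)
        xi, yi = X[i], y[i]
        dist = np.abs(X - xi).sum(axis=1)
        dist[i] = np.inf                      # exclude the sampled tuple itself
        hit = np.where(y == yi)[0]
        miss = np.where(y != yi)[0]
        nearest_hit = hit[np.argmin(dist[hit])]     # nearest neighbor of the same class
        nearest_miss = miss[np.argmin(dist[miss])]  # nearest neighbor of a different class
        w += (np.abs(xi - X[nearest_miss]) - np.abs(xi - X[nearest_hit])) / span
    return w / n_iterations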
2.1.7 Feature selection based on symmetric uncertainty (FSUnc)
This technique uses the correlation measure to select the significant features from the training dataset “TD”. In addition to that, the symmetric uncertainty “SU” is calculated using the entropy measure to identify the similarity between the two features fi and fj[21, 22].
2.1.8 Feature selection based on unsupervised learning algorithm
Unsupervised learning is formally known as clustering. A clustering algorithm groups similar objects with respect to a given criterion such as density or distance. The objects present in a group are more similar to one another than to the outliers. This technique is applied for selecting the significant features from the training dataset. Each unsupervised algorithm has its own advantages and disadvantages that determine its application. This paper utilizes the expectation maximization (EM) clustering technique to select the significant features by identifying the independent features in order to remove the redundant features from a training dataset “TD”[23−25].
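One possible way to realize this step with an off-the-shelf EM implementation is to fit a Gaussian mixture to the transposed data so that each feature (column) becomes a point to be clustered. The sketch below uses scikit-learn's GaussianMixture and is an assumption about how the grouping could be coded, not the authors' implementation.
import numpy as np
from sklearn.mixture import GaussianMixture

def group_features_by_em(X, n_groups, seed=0):
    """Cluster the features (columns) of X with an EM-fitted Gaussian mixture."""
    X_t = X.T                                              # rows now represent features
    labels = GaussianMixture(n_components=n_groups,
                             random_state=seed).fit_predict(X_t)
    return {g: np.where(labels == g)[0] for g in range(n_groups)}  # feature indices per cluster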
2.2 Supervised learning algorithm
The supervised learning algorithm builds the predictive
model “PM” by learning the training dataset “TD” to pre-
dict the unlabeled tuple “T”. This model can be built by
various supervised learning techniques such as tree based,
probabilistic and rule based. This paper uses the supervised learners namely naive Bayes (NB), instance based IB1 and tree based C4.5/J48 to evaluate and compare the performance of the proposed feature selection algorithm with the existing algorithms in terms of predictive accuracy and time taken to build the prediction model[26−28].
2.2.1 Naive Bayes (NB) classifier
This classifier works based on the Bayesian theory. The
probabilistic function is applied on training dataset “TD”
that contains the features F = {f1, f2, · · ·, fx}, tuples T = {t1, t2, · · ·, ty} and classes C = {c1, c2, · · ·, cz}. The unlabeled tuple “UT” is predicted to identify its class Ck with the probabilistic condition P(Ck|T) > P(Cw|T) where k ≠ w[29].
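A short, hedged example of such a probabilistic classifier uses scikit-learn's GaussianNB; the Gaussian variant and the iris placeholder data are assumptions, since the paper does not specify which naive Bayes variant was configured.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)                  # estimates P(Ck | T) via Bayes' theorem
print(nb.predict(X[:5]))                     # class with the highest posterior probability
print(nb.predict_proba(X[:5]).round(3))      # posterior P(Ck | T) for each class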
2.2.2 Decision tree based C4.5/J48 classifier
This tree based classifier constructs the predictive model
using decision tree. Basically, the statistical tool informa-
tion gain is used to learn the dataset and construct the
decision tree. Information gain is computed for each attribute present in the training dataset “TD”. The feature with the highest information gain value is considered as the root node and the dataset is divided into further levels. The information gain value is computed for all the nodes and this process is iterated until a single class value is obtained in all the nodes[30, 31].
2.2.3 IB1
This classifier utilizes the nearest neighbor principle to
measure the similarity among the objects to predict the
unlabeled tuple “UT”. The Euclidean distance measure is
used to compute the distance between various tuples present
in the training dataset[32].
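IB1 behaves like a 1-nearest-neighbor classifier with Euclidean distance, so a minimal stand-in (not the Weka IB1 code itself, and with placeholder data) is:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
ib1 = KNeighborsClassifier(n_neighbors=1, metric="euclidean").fit(X, y)
print(ib1.predict(X[:5]))    # label of the single nearest training tuple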
3 Unsupervised learning with ranking based feature selection (FSULR)
This algorithm follows a sequence of steps to select the significant features from the training dataset. Initially, the training dataset is transposed. Then the features are clustered and the features in each cluster are ranked based on their χ2 value. Then a threshold value is computed to select the highly significant features from each cluster. All the selected features from the different clusters are combined together as the candidate significant features selected from the training dataset.
Algorithm 1. FSULR
Input: Training dataset TD, tuple set T = {t1, t2, · · · , tl}, feature set F = {f1, f2, · · · , fm}, class labels C = {C1, C2, · · · , Cn}, with T, F, C ∈ TD
Output: Selected significant feature subset SSF = {f1, f2, · · · , fi} ⊆ F
Method:
1) Begin
2) Swap(TD)
3) {Return → swapped TD = SD} // Swap the tuples T (rows) into features F (columns)
4) EM_Cluster(SD)
5) {Return n grouped feature sets GF = {GF1, GF2, GF3, · · ·, GFn}} // GF ⊆ F
6) for (L = 1; L ≤ n; L++)
7) { χ2(GFL, C)
8) {Return ranked feature members of the grouped feature set RGFK, where K = 1, · · · , n, with χ2 value CV}} // Compute and rank the χ2 value for the grouped features
9) Threshold(CVMax, CVMin) // CVMax and CVMin are the maximum and minimum χ2 values, respectively
10) {Return the value ϕ} // ϕ is the break-off threshold value
11) for (M = 1; M ≤ n; M++)
12) { Break_off_Feature(RGFM, CVM, ϕ)
13) {Return BFSSK, where K = 1, · · · , n}} // BFSSK is the break-off feature subset based on CV
14) for (N = 1; N ≤ n; N++)
15) {Return SSF = Union(BFSSN)} // SSF is the selected significant feature subset
16) End.
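A compact Python sketch of the FSULR flow described by Algorithm 1 is given below. It is an interpretation under stated assumptions, not the authors' code: features are assumed numeric and non-negative (as scikit-learn's chi2 scorer requires), the number of clusters n_groups is passed in rather than chosen by cross-validated likelihood, and the break-off rule of (4) is applied within each cluster.
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.mixture import GaussianMixture

def fsulr(X, y, n_groups=3, H=1.0, seed=0):
    """Sketch of FSULR: cluster features with EM, rank each cluster by chi-square,
    keep features above the break-off threshold, and return the union."""
    labels = GaussianMixture(n_components=n_groups,
                             random_state=seed).fit_predict(X.T)   # Swap + EM_Cluster
    selected = []
    for g in range(n_groups):
        members = np.where(labels == g)[0]
        if members.size == 0:
            continue
        cv, _ = chi2(X[:, members], y)                  # chi-square value "CV" per feature
        phi = H * (cv.max() - cv.min()) / 2 + cv.min()  # break-off threshold, Eq. (4)
        selected.extend(members[cv >= phi])             # Break_off_Feature
    return sorted(int(f) for f in selected)             # Union -> selected subset SSF
For example, fsulr(X, y, n_groups=4) would return the indices of the retained columns of X.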
Phase 1. In this phase, the FSULR algorithm receives the training dataset “TD” as input with the features F = {f1, f2, · · · , fx}, tuples T = {t1, t2, · · ·, ty} and classes C = {c1, c2, · · ·, cz}. Then the class C is removed. The function Swap(·) swaps the features “F” into tuples “T” and returns the swapped dataset “SD” for grouping the features “fi”.
Phase 2. In this phase, the expectation maximization maximum likelihood function[33] and the Gaussian mixture model, as seen in (1) and (2), are used to group the features using the function EM-Cluster(·). It receives the swapped dataset “SD” and groups the “SD” into the grouped feature sets GF = {GF1, GF2, GF3, · · ·, GFn}.
η(b|D_z) = P(D_z|b) = \prod_{k=1}^{m} P(z_k|b) = \prod_{k=1}^{m} \sum_{l \in L} P(z_k|l, b)\, P(l|b)    (1)
where D denotes the dataset containing an unknown parameter b and the known product Z, and L is the unseen data. Expectation maximization is used as the distribution function and this is derived as the Gaussian mixture model multivariate distribution P(z|l, b) as expressed in (2).
P(z|l, b) = \frac{1}{(2\pi)^{p/2}\, \Delta_l}\, e^{-\frac{1}{2}(z-\lambda_l)^{\mathrm{T}} \Gamma_l^{-1} (z-\lambda_l)}.    (2)
With this scenario, a value α is randomly picked from the set “S” with instances “I” from the non-continuous set N = {1, 2, 3, · · · , n}. Let λl be the mean vector of the unseen variable and Γl (l = 1, 2, · · · , n) be the covariance matrix with the discrete distribution P(l|b). Cross-validation computes the likelihood in correspondence to the value n to determine the n Gaussian components for computing the expectation maximization. This developed Gaussian mixture model[34] is used for clustering the features based on the level of the Gaussian component possessed by the individual features “fi”[35], and it returns the n grouped feature subsets GF = {GF1, GF2, GF3, · · ·, GFn}.
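The cross-validated choice of n mentioned above could be coded as follows. This is a sketch under the assumption that the held-out average log-likelihood of a scikit-learn GaussianMixture is the selection criterion and that the transposed data X_t has enough rows (features) for 5 folds; the function name and candidate range are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold

def choose_n_components(X_t, candidates=range(1, 6), seed=0):
    """Pick the number of Gaussian components by average held-out log-likelihood."""
    best_n, best_ll = None, -np.inf
    for n in candidates:
        folds = KFold(n_splits=5, shuffle=True, random_state=seed)
        scores = []
        for train_idx, test_idx in folds.split(X_t):
            gmm = GaussianMixture(n_components=n, random_state=seed).fit(X_t[train_idx])
            scores.append(gmm.score(X_t[test_idx]))   # mean log-likelihood on held-out rows
        if np.mean(scores) > best_ll:
            best_n, best_ll = n, np.mean(scores)
    return best_n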
Phase 3. In this phase, the most significant features are selected from the clustered feature subsets by the χ2(·) function. It receives the clustered feature set GFK with the class feature C and computes the χ2 value “CV ” for all the features as shown in (3)[16]:
CV = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}.    (3)
The feature “fi” contains the distinct tuple values tv1, tv2, · · ·, tvn. Then, the χ2 value “CV ” is computed for the features fi and fj, where oij is the observed frequency and eij is the expected frequency of the possible tuple values. To identify the significant features, the features are ranked by their χ2 value “CV ” and the ranked grouped feature set is obtained. The Threshold(·) function receives the CVMin and CVMax values and returns the break-off threshold value ϕ as shown in (4). The function Break_off_Feature(·) receives the ranked grouped feature set RGF with ϕ, breaks off the top ranked features with “CV ” up to the value ϕ and returns them as the break-off feature subset BFSS.
ϕ = H \times \frac{\alpha - \beta}{2} + \beta    (4)
where ϕ is the break-off threshold value, H is the threshold constant that is equal to 1, α is the maximum χ2 value (CVMax) of the features, and β is the minimum χ2 value (CVMin) of the features.
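Equation (4) and the break-off step translate directly into a pair of helper functions; the names below are illustrative and not taken from the paper's implementation.
def break_off_threshold(cv_values, H=1.0):
    """Eq. (4): phi = H * (CVMax - CVMin) / 2 + CVMin."""
    alpha, beta = max(cv_values), min(cv_values)
    return H * (alpha - beta) / 2 + beta

def break_off_features(feature_ids, cv_values, phi):
    """Keep the features whose chi-square value CV reaches the threshold phi."""
    return [f for f, cv in zip(feature_ids, cv_values) if cv >= phi]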
Phase 4. This phase combines the break-off feature subsets BFSS obtained from the clustered features by the function Union(·) and produces the selected significant feature subset SSF as the candidate selected features.
4 Experimental setup
In total, 21 datasets are used to conduct the experiment, with the number of features ranging from 2 to 684, the number of instances from 5 to 1728 and the number of class labels from 2 to 24, as listed in Table 1. These datasets are taken from the Weka software[36] and the UCI Repository[37]. The performance of the proposed algorithm FSULR is analyzed and compared with the other feature selection algorithms FSCor, FSChi, FSCon, FSGai-ra, ReliefF, FSUnc and FSInfo with the NB, J48 and IB1 classifiers.
5 Experimental procedure, results and discussion
This experiment is conducted with the 21 well known publicly available datasets shown in Table 1 and the 7 feature selection algorithms FSCor, FSChi, FSCon, FSGai-ra, ReliefF, FSUnc and FSInfo. Initially, the datasets are fed as input to the feature selection algorithms and the selected features are obtained as their output. These selected features are given to the NB, J48 and IB1 classifiers, and the predictive accuracy and the time taken to build the predictive model are calculated with the 10-fold cross validation test mode.
Table 1 Datasets
Serial number Dataset Number of features Number of instances Number of classes
1 Breast cancer 9 286 2
2 Contact lenses 24 5 5
3 Credit–g 20 1000 2
4 Diabetes 8 768 2
5 Glass 9 214 7
6 Ionosphere 34 351 2
7 Iris 2D 2 150 3
8 Iris 4 150 3
9 Labor 16 57 2
10 Segment challenge 19 1500 7
11 Soybean 683 36 14
12 Vote 435 17 3
13 Weather nominal 4 14 2
14 Weather numeric 14 5 2
15 Audiology 69 226 24
16 Car 6 1728 4
17 Cylinder bands 39 540 2
18 Dermatology 34 366 3
19 Ecoli 7 336 8
20 Anneal 39 898 6
21 Hay-train 10 373 9
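The evaluation protocol (three classifiers, 10-fold cross-validation on the reduced dataset) can be reproduced in outline as follows; scikit-learn's DecisionTreeClassifier (CART) stands in for Weka's J48/C4.5 here, and the iris data is only a placeholder for a feature-selected dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)            # placeholder for a dataset reduced by feature selection
classifiers = {
    "NB": GaussianNB(),
    "J48-like tree": DecisionTreeClassifier(random_state=0),
    "IB1": KNeighborsClassifier(n_neighbors=1),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=10, scoring="accuracy").mean()
    print(f"{name}: {acc:.3f}")              # 10-fold predictive accuracy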
The performance of the proposed FSULR algorithm is
analyzed in terms of predictive accuracy, time to build the
model and the number of features reduced. Fig. 1 expresses
the overall performance of feature selection algorithms in
producing the predictive accuracy for supervised learners
and it is observed that the FSULR performs better than
all the other feature selection algorithms compared. The
second and third positions are retained by the FSCon and
FSInfo respectively. Fig. 2 shows that the proposed FSULR
achieves better accuracy than the other algorithms com-
pared for the NB and IB1 classifiers. The FSCon achieves better results for the J48 classifier as compared to all the other feature selection algorithms.
Fig. 1 Comparison of overall prediction accuracy in percentage
with respective feature selection algorithm
Fig. 2 Comparison of prediction accuracy in percentage with
respect to the feature selection algorithm and classifier
From Fig. 3, it is evident that FSULR takes more time to build the predictive model compared to all the other algorithms. From Fig. 4, it is observed that FSGai-ra and FSUnc require less time to build the model for the NB classifier than the other algorithms compared. The FSUnc takes less time to build the model for the J48 classifier. The FSGai-ra and ReliefF consume less time to build the predictive model for the IB1 classifier. Fig. 5 exhibits that ReliefF reduces the number of features more significantly than the other algorithms compared, while FSCon reduces the least number of features.
Fig. 3 Comparison of overall time taken to build the predictive
model (in second) with respect to the feature selection algorithm
Fig. 4 Comparison of time taken to build the predictive model
(in second) with respect to the feature selection algorithm and
classifier
Fig. 5 Comparison of number of features reduction with respect
to the feature selection algorithm
6 Conclusions and future work
This paper proposed a feature selection algorithm, namely the unsupervised learning with ranking based feature selection algorithm (FSULR). The performance of this algorithm is analyzed in terms of the accuracy produced for the classifier, the time taken to build the predictive model and the feature reduction. The performance of this algorithm is compared with other feature selection algorithms, namely the correlation based FSCor, χ2 based FSChi, consistency based FSCon, gain ratio based FSGai-ra, ReliefF, symmetric uncertainty based FSUnc and information gain based FSInfo algorithms, with the NB, J48 and IB1 classifiers. The FSULR achieves better prediction accuracy than the other feature selection algorithms and achieves higher accuracy for the IB1 and NB classifiers compared to the other feature selection algorithms. The FSULR is also considerably good in reducing the time to build the model for IB1 compared to FSGai-ra and ReliefF. FSULR is considerably good in reducing the time to build the model for the J48 classifier compared to the FSCon algorithm. The FSULR reduces the number of features compared to FSCor, FSCon and FSGai-ra. In the future, this work can be extended with other statistical measures for ranking the features within the clusters.
References
[1] S. J. Pan, Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2010.
[2] M. R. Rashedur, R. M. Fazle. Using and comparing different
decision tree classification techniques for mining ICDDR, B
Hospital Surveillance data. Expert Systems with Applica-
tions, vol. 38, no. 9, pp. 11421–11436, 2011.
[3] M. Wasikowski, X. W. Chen. Combating the small sam-
ple class imbalance problem using feature selection. IEEE
Transactions on Knowledge and Data Engineering, vol. 22,
no. 10, pp. 1388–1400, 2010.
[4] Q. B. Song, J. J. Ni, G. T. Wang. A fast clustering-based
feature subset selection algorithm for high-dimensional
data. IEEE Transactions on Knowledge and Data Engineer-
ing, vol. 25, no. 1, pp. 1–14, 2013.
[5] J. F. Artur, A. T. Mário. Efficient feature selection filters for
high-dimensional data. Pattern Recognition Letters, vol. 33,
no. 13, pp. 1794–1804, 2012.
[6] J. Wu, L. Chen, Y. P. Feng, Z. B. Zheng, M. C. Zhou,
Z. H. Wu. Predicting quality of service for selection by
neighborhood-based collaborative filtering. IEEE Transac-
tions on Systems, Man, and Cybernetics: Systems, vol. 43,
no. 2, pp. 428–439, 2012.
[7] C. P. Hou, F. P. Nie, X. Li, D. Yi, Y. Wu. Joint embedding
learning and sparse regression: A framework for unsuper-
vised feature selection. IEEE Transactions on Cybernetics,
vol. 44, no. 6, pp. 793–804, 2014.
[8] P. Bermejo, L. de la Ossa, J. A. Gámez, J. M. Puerta.
Fast wrapper feature subset selection in high-dimensional
datasets by means of filter re-ranking. Knowledge-based Systems, vol. 25, no. 1, pp. 35–44,
2012.
[9] S. Atulji, G. Shameek, V. K. Jayaraman. Hybrid biogeog-
raphy based simultaneous feature selection and MHC class
I peptide binding prediction using support vector machines
and random forests. Journal of Immunological Methods,
vol. 387, no. 1-2, pp. 284–292, 2013.
[10] H. L. Wei, S. Billings. Feature subset selection and ranking
for data dimensionality reduction. IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 29, no. 1,
pp. 162–166, 2007.
[11] X. V. Nguyen, B. James. Comments on supervised fea-
ture selection by clustering using conditional mutual
information-based distances. Pattern Recognition, vol. 46,
no. 4, pp. 1220–1225, 2013.
[12] A. G. Iffat, S. S. Leslie. Feature subset selection in large di-
mensionality domains. Pattern Recognition, vol. 43, no. 1,
pp. 5–13, 2010.
[13] M. Hall. Correlation-based Feature Selection for Machine
Learning, Ph. D. dissertation, The University of Waikato, New Zealand, 1999.
[14] Y. Lei, L. Huan. Efficient feature selection via analysis of
relevance and redundancy. Journal of Machine Learning Re-
search, vol. 5, no. 1, pp. 1205–1224, 2004.
[15] M. Dash, H. Liu, H. Motoda. Consistency based feature se-
lection. In Proceedings of the 4th Pacific Asia Conference
on Knowledge Discovery and Data Mining, Kyoto, Japan,
pp. 98–109, 2000.
[16] H. Peng, L. Fulmi, C. Ding. Feature selection based
on mutual information criteria of max-dependency, max-
relevance, and min-redundancy. IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 27, no. 8,
pp. 1226–1238, 2005.
[17] H. Uğuz. A two-stage feature selection method for text cat-
egorization by using information gain, principal component
analysis and genetic algorithm. Knowledge-based Systems,
vol. 24, no. 7, pp. 1024–1032, 2011.
[18] M. Robnik-Šikonja, I. Kononenko. Theoretical and empir-
ical analysis of ReliefF and RReliefF. Machine Learning,
vol. 53, no. 1-2, pp. 23–69, 2003.
[19] S. Yijun, T. Sinisa, G. Steve. Local-learning-based feature
selection for high-dimensional data analysis. IEEE Transac-
tions on Pattern Analysis and Machine Intelligence, vol. 32,
no. 9, pp. 1610–1626, 2010.
[20] W. Peng, S. Cesar, S. Edward. Prediction based on in-
tegration of decisional DNA and a feature selection algo-
rithm RELIEF-F. Cybernetics and Systems, vol. 44, no. 3,
pp. 173–183, 2013.
D. A. A. G. Singh et al. / An Unsupervised Feature Selection Algorithm with Feature Ranking for· · · 517
[21] H. W. Liu, J. G. Sun, L. Liu, H. J. Zhang. Feature selec-
tion with dynamic mutual information. Pattern Recogni-
tion, vol. 42, no. 7, pp. 1330–1339, 2009.
[22] M. C. Lee. Using support vector machine with a hybrid fea-
ture selection method to the stock trend prediction. Expert
Systems with Applications, vol. 36, no. 8, pp. 10896–10904,
2009.
[23] P. Mitra, C. A. Murthy, S. K. Pal. Unsupervised feature
selection using feature similarity. IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 24, no. 3,
pp. 301–312, 2002.
[24] J. Handl, J. Knowles. Feature subset selection in unsu-
pervised learning via multi objective optimization. Inter-
national Journal of Computational Intelligence Research,
vol. 2, no. 3, pp. 217–238, 2006.
[25] H. Liu, L. Yu. Toward integrating feature selection algo-
rithms for classification and clustering. IEEE Transactions
on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–
502, 2005.
[26] S. García, J. Luengo, J. A. Sáez, V. López, F. Herrera. A
survey of discretization techniques: Taxonomy and empir-
ical analysis in supervised learning. IEEE Transactions on
Knowledge and Data Engineering, vol. 25, no. 4, pp. 734–
750, 2013.
[27] S. A. A. Balamurugan, R. Rajaram. Effective and efficient
feature selection for large-scale data using Bayes theorem.
International Journal of Automation and Computing, vol. 6,
no. 1, pp. 62–71, 2009.
[28] J. A. Mangai, V. S. Kumar, S. A. alias Balamurugan.
A novel feature selection framework for automatic web
page classification. International Journal of Automation
and Computing, vol. 9, no. 4, pp. 442–448, 2012.
[29] H. J. Huang, C. N. Hsu. Bayesian classification for data
from the same unknown class. IEEE Transactions on Sys-
tems, Man, and Cybernetics, Part B: Cybernetic, vol. 32,
no. 2, pp. 137–145, 2002.
[30] S. Ruggieri. Efficient C4.5. IEEE Transactions on Knowl-
edge and Data Engineering, vol. 14, no. 2, pp. 438–444,
2002.
[31] P. Kemal, G. Salih. A novel hybrid intelligent method based
on C4.5 decision tree classifier and one-against-all approach
for multi-class classification problems. Expert Systems with
Applications, vol. 36, no. 2, pp. 1587–1592, 2009.
[32] W. W. Cheng, E. Hüllermeier. Combining instance-based
learning and logistic regression for multilabel classification.
Machine Learning, vol. 76, no. 3, pp. 211–225, 2009.
[33] J. H. Hsiu, P. Saumyadipta, I. L. Tsung. Maximum likeli-
hood inference for mixtures of skew Student-t-normal dis-
tributions through practical EM-type algorithms. Statistics
and Computing, vol. 22, no. 1, pp. 287–299, 2012.
[34] M. H. C. Law, M. A. T. Figueiredo, A. K. Jain. Simultaneous fea-
ture selection and clustering using mixture models. IEEE
Transactions on Pattern Analysis and Machine Intelligence,
vol. 26, no. 9, pp. 1154–1166, 2004.
[35] T. W. Lee, M. S. Lewicki, T. J. Sejnowski. ICA mix-
ture models for unsupervised classification of non-Gaussian
classes and automatic context switching in blind signal sep-
aration. IEEE Transactions on Pattern Analysis and Ma-
chine Intelligence, vol. 22, no. 10, pp. 1078–1089, 2002.
[36] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reute-
mann. The WEKA data mining software: An update. ACM
SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18,
2009.
[37] K. Bache, M. Lichman. UCI Machine Learning Repository,
[Online], Available: http://archive.ics.uci.edu/ml, Irvine,
CA: University of California, School of Information and
Computer Science, 2013.
Danasingh Asir Antony Gnana Singh
received the M. Eng. and B. Eng. degrees
from Anna University, India. He is cur-
rently working as a teaching fellow in the
Department of Computer Science and En-
gineering, Bharathidasan Institute of Tech-
nology, Anna University, India.
His research interests include data min-
ing, wireless networks, parallel computing,
mobile computing, computer networks, image processing, soft-
ware engineering, soft computing, teaching learning process and
engineering education.
E-mail: asirantony@gmail.com (Corresponding author)
ORCID iD: 0000-0002-4913-2886
Subramanian Appavu Alias Balamu-
rugan received the Ph. D. degree from
Anna University, India. He is currently
working as a professor and the head of the
Department of Information Technology in
K.L.N College of Information Technology,
India.
His research interests include pattern
recognition, data mining and informatics.
E-mail: app s@yahoo.com
Epiphany Jebamalar Leavline re-
ceived the M. Eng. and B. Eng. degrees
from Anna University, India, and received
the MBA degree from Alagappa Univer-
sity, India. She is currently working as
an assistant professor in the Department of
Electronics and Communication Engineer-
ing, Bharathidasan Institute of Technology,
Anna University, India.
Her research interests include image processing, signal pro-
cessing, VLSI design, data mining, teaching learning process and
engineering education.
E-mail: jebi.lee@gmail.com
Weitere ähnliche Inhalte

Was ist angesagt?

Volume 2-issue-6-2165-2172
Volume 2-issue-6-2165-2172Volume 2-issue-6-2165-2172
Volume 2-issue-6-2165-2172
Editor IJARCET
 

Was ist angesagt? (17)

Study on Relavance Feature Selection Methods
Study on Relavance Feature Selection MethodsStudy on Relavance Feature Selection Methods
Study on Relavance Feature Selection Methods
 
IRJET-Comparison between Supervised Learning and Unsupervised Learning
IRJET-Comparison between Supervised Learning and Unsupervised LearningIRJET-Comparison between Supervised Learning and Unsupervised Learning
IRJET-Comparison between Supervised Learning and Unsupervised Learning
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
T0 numtq0n tk=
T0 numtq0n tk=T0 numtq0n tk=
T0 numtq0n tk=
 
PROGRAM TEST DATA GENERATION FOR BRANCH COVERAGE WITH GENETIC ALGORITHM: COMP...
PROGRAM TEST DATA GENERATION FOR BRANCH COVERAGE WITH GENETIC ALGORITHM: COMP...PROGRAM TEST DATA GENERATION FOR BRANCH COVERAGE WITH GENETIC ALGORITHM: COMP...
PROGRAM TEST DATA GENERATION FOR BRANCH COVERAGE WITH GENETIC ALGORITHM: COMP...
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning Algorithms
 
Cw36587594
Cw36587594Cw36587594
Cw36587594
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
 
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
 
Volume 2-issue-6-2165-2172
Volume 2-issue-6-2165-2172Volume 2-issue-6-2165-2172
Volume 2-issue-6-2165-2172
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
 
C LUSTERING B ASED A TTRIBUTE S UBSET S ELECTION U SING F AST A LGORITHm
C LUSTERING  B ASED  A TTRIBUTE  S UBSET  S ELECTION  U SING  F AST  A LGORITHmC LUSTERING  B ASED  A TTRIBUTE  S UBSET  S ELECTION  U SING  F AST  A LGORITHm
C LUSTERING B ASED A TTRIBUTE S UBSET S ELECTION U SING F AST A LGORITHm
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
Ijmet 10 01_141
Ijmet 10 01_141Ijmet 10 01_141
Ijmet 10 01_141
 
Applying supervised and un supervised learning approaches for movie recommend...
Applying supervised and un supervised learning approaches for movie recommend...Applying supervised and un supervised learning approaches for movie recommend...
Applying supervised and un supervised learning approaches for movie recommend...
 
Ijciet 10 02_070
Ijciet 10 02_070Ijciet 10 02_070
Ijciet 10 02_070
 
Association rule discovery for student performance prediction using metaheuri...
Association rule discovery for student performance prediction using metaheuri...Association rule discovery for student performance prediction using metaheuri...
Association rule discovery for student performance prediction using metaheuri...
 

Andere mochten auch

How to give a powerful presentation
How to give a powerful presentationHow to give a powerful presentation
How to give a powerful presentation
Whitney Davis
 
Web 2
Web 2Web 2
Web 2
mcoym
 
聲調教學
聲調教學聲調教學
聲調教學
boo2006x
 

Andere mochten auch (17)

2011 Annual Report Chamber
2011 Annual Report Chamber2011 Annual Report Chamber
2011 Annual Report Chamber
 
2011 OA Annual report
2011 OA Annual report2011 OA Annual report
2011 OA Annual report
 
Developing Wayfinder Signage, Penn Dot
Developing Wayfinder Signage, Penn DotDeveloping Wayfinder Signage, Penn Dot
Developing Wayfinder Signage, Penn Dot
 
G200 l presentation
G200 l presentationG200 l presentation
G200 l presentation
 
Innovation Workshop
Innovation WorkshopInnovation Workshop
Innovation Workshop
 
ادارة الوقت
ادارة الوقتادارة الوقت
ادارة الوقت
 
Peninsula way
Peninsula wayPeninsula way
Peninsula way
 
Critical Thinking: From Nest to Net
Critical Thinking: From Nest to NetCritical Thinking: From Nest to Net
Critical Thinking: From Nest to Net
 
Final Presentation
Final PresentationFinal Presentation
Final Presentation
 
Ga change management
Ga change managementGa change management
Ga change management
 
Arup Urban Informatics Workshop Sep 2010
Arup Urban Informatics Workshop Sep 2010Arup Urban Informatics Workshop Sep 2010
Arup Urban Informatics Workshop Sep 2010
 
Zero Carbon Isn’t Really Zero: Why Embodied Carbon in Materials Can’t Be Ignored
Zero Carbon Isn’t Really Zero: Why Embodied Carbon in Materials Can’t Be IgnoredZero Carbon Isn’t Really Zero: Why Embodied Carbon in Materials Can’t Be Ignored
Zero Carbon Isn’t Really Zero: Why Embodied Carbon in Materials Can’t Be Ignored
 
M comp
M compM comp
M comp
 
GCE O' Level 1123 Examiner's Report Sum up
GCE O' Level 1123 Examiner's Report Sum upGCE O' Level 1123 Examiner's Report Sum up
GCE O' Level 1123 Examiner's Report Sum up
 
How to give a powerful presentation
How to give a powerful presentationHow to give a powerful presentation
How to give a powerful presentation
 
Web 2
Web 2Web 2
Web 2
 
聲調教學
聲調教學聲調教學
聲調教學
 

Ähnlich wie An unsupervised feature selection algorithm with feature ranking for maximizing performance of the classifiers

Iaetsd an efficient and large data base using subset selection algorithm
Iaetsd an efficient and large data base using subset selection algorithmIaetsd an efficient and large data base using subset selection algorithm
Iaetsd an efficient and large data base using subset selection algorithm
Iaetsd Iaetsd
 

Ähnlich wie An unsupervised feature selection algorithm with feature ranking for maximizing performance of the classifiers (20)

Network Based Intrusion Detection System using Filter Based Feature Selection...
Network Based Intrusion Detection System using Filter Based Feature Selection...Network Based Intrusion Detection System using Filter Based Feature Selection...
Network Based Intrusion Detection System using Filter Based Feature Selection...
 
On Feature Selection Algorithms and Feature Selection Stability Measures : A ...
On Feature Selection Algorithms and Feature Selection Stability Measures : A ...On Feature Selection Algorithms and Feature Selection Stability Measures : A ...
On Feature Selection Algorithms and Feature Selection Stability Measures : A ...
 
ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C...
ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C...ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C...
ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C...
 
On Feature Selection Algorithms and Feature Selection Stability Measures : A...
 On Feature Selection Algorithms and Feature Selection Stability Measures : A... On Feature Selection Algorithms and Feature Selection Stability Measures : A...
On Feature Selection Algorithms and Feature Selection Stability Measures : A...
 
A Review on Feature Selection Methods For Classification Tasks
A Review on Feature Selection Methods For Classification TasksA Review on Feature Selection Methods For Classification Tasks
A Review on Feature Selection Methods For Classification Tasks
 
An experimental study on hypothyroid using rotation forest
An experimental study on hypothyroid using rotation forestAn experimental study on hypothyroid using rotation forest
An experimental study on hypothyroid using rotation forest
 
Iaetsd an efficient and large data base using subset selection algorithm
Iaetsd an efficient and large data base using subset selection algorithmIaetsd an efficient and large data base using subset selection algorithm
Iaetsd an efficient and large data base using subset selection algorithm
 
AN EFFICIENT FEATURE SELECTION IN CLASSIFICATION OF AUDIO FILES
AN EFFICIENT FEATURE SELECTION IN CLASSIFICATION OF AUDIO FILESAN EFFICIENT FEATURE SELECTION IN CLASSIFICATION OF AUDIO FILES
AN EFFICIENT FEATURE SELECTION IN CLASSIFICATION OF AUDIO FILES
 
An efficient feature selection in
An efficient feature selection inAn efficient feature selection in
An efficient feature selection in
 
Booster in High Dimensional Data Classification
Booster in High Dimensional Data ClassificationBooster in High Dimensional Data Classification
Booster in High Dimensional Data Classification
 
IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...
IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...
IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...
 
2014 IEEE JAVA DATA MINING PROJECT A fast clustering based feature subset sel...
2014 IEEE JAVA DATA MINING PROJECT A fast clustering based feature subset sel...2014 IEEE JAVA DATA MINING PROJECT A fast clustering based feature subset sel...
2014 IEEE JAVA DATA MINING PROJECT A fast clustering based feature subset sel...
 
Unsupervised Feature Selection Based on the Distribution of Features Attribut...
Unsupervised Feature Selection Based on the Distribution of Features Attribut...Unsupervised Feature Selection Based on the Distribution of Features Attribut...
Unsupervised Feature Selection Based on the Distribution of Features Attribut...
 
06522405
0652240506522405
06522405
 
DOTNET 2013 IEEE CLOUDCOMPUTING PROJECT A fast clustering based feature subse...
DOTNET 2013 IEEE CLOUDCOMPUTING PROJECT A fast clustering based feature subse...DOTNET 2013 IEEE CLOUDCOMPUTING PROJECT A fast clustering based feature subse...
DOTNET 2013 IEEE CLOUDCOMPUTING PROJECT A fast clustering based feature subse...
 
Artificial Intelligence based Pattern Recognition
Artificial Intelligence based Pattern RecognitionArtificial Intelligence based Pattern Recognition
Artificial Intelligence based Pattern Recognition
 
Feature Selection : A Novel Approach for the Prediction of Learning Disabilit...
Feature Selection : A Novel Approach for the Prediction of Learning Disabilit...Feature Selection : A Novel Approach for the Prediction of Learning Disabilit...
Feature Selection : A Novel Approach for the Prediction of Learning Disabilit...
 
Feature selection a novel
Feature selection a novelFeature selection a novel
Feature selection a novel
 
Research proposal
Research proposalResearch proposal
Research proposal
 
JAVA 2013 IEEE PROJECT A fast clustering based feature subset selection algor...
JAVA 2013 IEEE PROJECT A fast clustering based feature subset selection algor...JAVA 2013 IEEE PROJECT A fast clustering based feature subset selection algor...
JAVA 2013 IEEE PROJECT A fast clustering based feature subset selection algor...
 

Kürzlich hochgeladen

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
Kamal Acharya
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
AldoGarca30
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
jaanualu31
 

Kürzlich hochgeladen (20)

Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 

An unsupervised feature selection algorithm with feature ranking for maximizing performance of the classifiers

  • 1. International Journal of Automation and Computing 12(5), October 2015, 511-517 DOI: 10.1007/s11633-014-0859-5 An Unsupervised Feature Selection Algorithm with Feature Ranking for Maximizing Performance of the Classifiers Danasingh Asir Antony Gnana Singh1 Subramanian Appavu Alias Balamurugan2 Epiphany Jebamalar Leavline3 1 Department of Computer Science and Engineering, Bharathidasan Institute of Technology, Anna University, Tiruchirappalli 620024, India 2 Department of Information Technology, K.L.N College of Information Technology, Sivagangai 630611, India 3 Department of Electronics and Communication Engineering, Bharathidasan Institute of Technology, Anna University, Tiruchirappalli 620024, India Abstract: Prediction plays a vital role in decision making. Correct prediction leads to right decision making to save the life, energy, efforts, money and time. The right decision prevents physical and material losses and it is practiced in all the fields including medical, finance, environmental studies, engineering and emerging technologies. Prediction is carried out by a model called classifier. The predictive accuracy of the classifier highly depends on the training datasets utilized for training the classifier. The irrelevant and redundant features of the training dataset reduce the accuracy of the classifier. Hence, the irrelevant and redundant features must be removed from the training dataset through the process known as feature selection. This paper proposes a feature selection algorithm namely unsupervised learning with ranking based feature selection (FSULR). It removes redundant features by clustering and eliminates irrelevant features by statistical measures to select the most significant features from the training dataset. The performance of this proposed algorithm is compared with the other seven feature selection algorithms by well known classifiers namely naive Bayes (NB), instance based (IB1) and tree based J48. Experimental results show that the proposed algorithm yields better prediction accuracy for classifiers. Keywords: Feature selection algorithm, classification, clustering, prediction, feature ranking, predictive model. 1 Introduction Prediction plays a vital role in decision making. This is done by the predictive model that is known as classi- fier or supervised learner[1] . This predictive model is em- ployed to predict the unseen and unknown data in vari- ous applications[2] . Improving the predictive accuracy is a major challenge among the researchers. The accuracy of the classifier highly relies on the training dataset which is used to train the predictive model. The irrelevant and redundant features of the training dataset reduce the pre- dictive accuracy[3] . Hence, irrelevant and redundant fea- tures must be eliminated from the training dataset for im- proving the accuracy of the predictive model. The per- formance improvement of the predictive model include im- proving the predictive accuracy, reducing the time taken to build the predictive model and reducing the number of fea- tures present in the training dataset[3] . Selecting significant features from the training dataset for improving the perfor- mance of the supervised or unsupervised learner is known as feature selection. This feature selection is classified as wrapper, filter, embedded and hybrid methods[4] . 
The filter method selects the important features and re- moves the redundant and irrelevant features from the given Regular Paper Manuscript received November 4, 2013; accepted April 3, 2014 Recommended by Editor-in-Chief Huo-Sheng Hu c Institute of Automation, Chinese Academy of Science and Springer-Verlag Berlin Heidelberg 2015 dataset using mathematical statistical measure. It performs well regardless of the classifier chosen, therefore it can work for any type of classification algorithm[5, 6] . The embedded method uses the training phase of the su- pervised learning algorithm for selecting the important fea- tures from the training dataset. Therefore, its performance is based on the classifiers used[7] . The wrapper method uses the classifier to determine im- portant features available in the dataset[8] . The combi- nation of filter and wrapper method is known as hybrid method[4, 9] . Many feature selection algorithms are proposed recently, which concentrate only on removing the irrelevant features. They do not deal with removing the redundant features from the training dataset. These redundant features reduce the accuracy of the classifiers. This paper proposes a filter based feature selection algorithm named as unsupervised learning with ranking based feature selection (FSULR). This algorithm selects the most significant features from the dataset and removes the redundant and irrelevant features. The unsupervised learner learns the training dataset using expectation maximization function and groups the features into various clusters and these clustered features are ranked based on the χ2 measure within the cluster. The significant features from each cluster are chosen based on the threshold function.
  • 2. 512 International Journal of Automation and Computing 12(5), October 2015 The remaining paper is organized as follows. The rele- vant literature is reviewed in Section 2. Section 3 discusses the proposed feature selection algorithm. Section 4 explores and discusses the experimental results. Conclusion is drawn in Section 5 with future enhancement of this work. 2 Related work This section discusses various types of feature selection, cluster and classification algorithms as a part of the related works of this proposed algorithm. 2.1 Feature selection algorithm The feature selection algorithm selects the most signif- icant features from the dataset using ranking based tech- niques, subset based and unsupervised based techniques. In the ranking based technique, the individual features “fi” are ranked by applying one of the mathematical measures such as information gain, gain ratio, on the training dataset “TD”. The ranked features are selected as significant fea- tures for learning algorithm by a threshold value “TV ” cal- culated by the threshold function. In subset based tech- nique, the features of the training datasets are separated into maximum number of possible feature subsets “S” and each subset is evaluated by an evaluation criteria to identify the significance of the features present in the subset. The subset containing most significant features is considered as a selected candidate feature subset. In unsupervised based technique, the cluster analysis is carried out to identify the significant features from the training dataset[10−12] . 2.1.1 Feature selection based on correlation (FS- Cor) In this feature subset selection, the entire feature set F = {f1, f2, · · · , fx} of a training dataset “TD” is sub di- vided into feature subsets “FSi”. Then, two types of the correlation measures are calculated on each feature subset “FSi”. One is the feature-feature correlation that is the correlation measure among the features present in a fea- ture subset“FSi”, and the other one is feature-class cor- relation that is the correlation measure between the indi- vidual feature and the class value of the training dataset. These two correlation measures are computed for all the in- dividual feature subsets of the training dataset. The signif- icant feature subset is identified based on the comparison between the feature-feature correlation and feature−class correlation. If the feature−class correlation value is higher than the feature−feature correlation value, the correspond- ing feature subset is selected as a significant feature subset of the training dataset[13] . 2.1.2 Feature selection based on consistency mea- sure (FSCon) In this feature subset selection, entire feature set “F” of the training dataset “TD” is subdivided into maximum possible combinations of feature subsets “FSi” and the con- sistency measure is calculated for each subset to identify the significant feature subset. The consistency measure ensures that the same tuple “Ti” is not present in a different class “Ci”[5, 14, 15] . 2.1.3 Feature selection based on χ2 (FSChi) This is a ranking based feature selection technique. The χ2 statistical measure is applied on the training dataset “TD” to identify the significance level of each feature presents in the training dataset. 
2.1.2 Feature selection based on consistency measure (FSCon)

In this feature subset selection, the entire feature set "F" of the training dataset "TD" is subdivided into the maximum possible combinations of feature subsets "FSi" and a consistency measure is calculated for each subset to identify the significant feature subset. The consistency measure ensures that the same tuple "Ti" is not present in different classes "Ci"[5, 14, 15].

2.1.3 Feature selection based on χ2 (FSChi)

This is a ranking based feature selection technique. The χ2 statistical measure is applied on the training dataset "TD" to identify the significance level of each feature present in the training dataset. The χ2 value "CV" is computed as the sum of the ratios of the squared differences between the observed (oij) and expected (eij) frequencies to the expected frequencies (eij), taken over the possible instance value combinations of the features fi and fj[16].

2.1.4 Feature selection based on information gain (FSInfo)

In this feature selection technique, the information gain measure is applied on the training dataset to identify the significant features based on the information gain value of the individual features[10], expressed in terms of entropy. The entropy value of each feature of the training dataset "TD" is calculated and the features are ranked by their information gain values[17].

2.1.5 Feature selection based on gain ratio (FSGai-ra)

In this feature selection technique, the information gain ratio GR(f) is calculated for each feature of the training dataset "TD" to identify the significant features based on the information present in the features of "TD"[17].

2.1.6 ReliefF

This feature selection technique selects the significant features from the training dataset "TD" based on a weighted probabilistic function w(f) with the nearest neighbor principle. If the nearest neighbors of a tuple "T" belong to the same class, they are termed "nearest hits"; if they belong to a different class, they are termed "nearest misses". The probabilistic weight function value w(f) is calculated from the "nearest hit" and "nearest miss" values[18−20].

2.1.7 Feature selection based on symmetric uncertainty (FSUnc)

This technique uses a correlation measure to select the significant features from the training dataset "TD". In addition, the symmetric uncertainty "SU" is calculated using the entropy measure to identify the similarity between two features fi and fj[21, 22].
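The entropy-based scores behind FSInfo and FSUnc can be written in a few lines. The following is a minimal sketch, not taken from the paper or from Weka, and it assumes discrete-valued features and class labels given as Python lists; continuous features would first have to be discretized:

```python
import math
from collections import Counter

def entropy(values):
    # H(X) = -sum p(x) log2 p(x) over the observed value frequencies.
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(feature, labels):
    # IG(C; f) = H(C) - H(C | f), the quantity ranked by FSInfo.
    n = len(labels)
    h_cond = 0.0
    for v in set(feature):
        subset = [c for f, c in zip(feature, labels) if f == v]
        h_cond += (len(subset) / n) * entropy(subset)
    return entropy(labels) - h_cond

def symmetric_uncertainty(feature, labels):
    # SU(f, C) = 2 * IG(C; f) / (H(f) + H(C)), the measure behind FSUnc.
    denom = entropy(feature) + entropy(labels)
    return 2.0 * info_gain(feature, labels) / denom if denom else 0.0

# Toy example: "outlook" separates the classes perfectly, "windy" barely at all.
labels  = ["yes", "yes", "no", "no", "yes", "no"]
outlook = ["sun", "sun", "rain", "rain", "sun", "rain"]
windy   = ["t", "f", "t", "f", "t", "f"]
print(info_gain(outlook, labels), info_gain(windy, labels))
print(symmetric_uncertainty(outlook, labels), symmetric_uncertainty(windy, labels))
```

Both scores are classifier-independent, so they slot directly into the ranking-plus-threshold scheme described at the beginning of Section 2.1.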
2.1.8 Feature selection based on unsupervised learning

Unsupervised learning is formally known as clustering. A clustering algorithm groups similar objects with respect to a given criterion such as density or distance; the objects within a group are more similar to one another than to the outliers. This technique can be applied to select the significant features from the training dataset. Each unsupervised algorithm has its own advantages and disadvantages that determine its applications. This paper utilizes the expectation maximization (EM) clustering technique to select the significant features by identifying the independent features, in order to remove the redundant features from a training dataset "TD"[23−25].

2.2 Supervised learning algorithms

The supervised learning algorithm builds the predictive model "PM" by learning the training dataset "TD" to predict the unlabeled tuple "T". The model can be built by various supervised learning techniques such as tree based, probabilistic and rule based methods. This paper uses the supervised learners naive Bayes (NB), instance based IB1 and C4.5/J48 to evaluate and compare the performance of the proposed feature selection algorithm with the existing algorithms in terms of predictive accuracy and the time taken to build the prediction model[26−28].

2.2.1 Naive Bayes (NB) classifier

This classifier works based on Bayesian theory. A probabilistic function is applied on the training dataset "TD" that contains the features F = {f1, f2, · · · , fx}, tuples T = {t1, t2, · · · , ty} and classes C = {c1, c2, · · · , cz}. The class Ci of an unlabeled tuple "UT" is predicted using the probabilistic condition P(Ck|T) > P(Cw|T), where k ≠ w[29].

2.2.2 Decision tree based C4.5/J48 classifier

This tree based classifier constructs the predictive model as a decision tree. The statistical measure information gain is used to learn the dataset and construct the tree. Information gain is computed for each attribute present in the training dataset "TD"; the feature with the highest information gain value is taken as the root node and the dataset is divided into further levels. The information gain value is computed for all the nodes, and this process is iterated until a single class value is obtained in each node[30, 31].

2.2.3 IB1

This classifier utilizes the nearest neighbor principle to measure the similarity among objects in order to predict the unlabeled tuple "UT". The Euclidean distance measure is used to compute the distance between the tuples present in the training dataset[32].

3 Unsupervised learning with ranking based feature selection (FSULR)

This algorithm follows a sequence of steps to select the significant features from the training dataset. Initially, the training dataset is transposed. The features are then clustered, and the features in each cluster are ranked based on their χ2 values. A threshold value is then computed to select the highly significant features from each cluster of features. All the features selected from the different clusters are combined as the candidate significant features selected from the training dataset.
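Before the formal listing in Algorithm 1, the overall flow can be pictured with a short, hedged sketch. It is not the authors' implementation: scikit-learn's GaussianMixture stands in for the EM clustering step, its chi2 scorer for the χ2 ranking, and a numeric, non-negative feature matrix X with class vector y is assumed:

```python
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.mixture import GaussianMixture

def fsulr_sketch(X, y, n_clusters=3, H=1.0):
    """Illustrative FSULR-style pipeline: cluster the (transposed) features
    with a Gaussian mixture fitted by EM, rank each cluster's members by
    chi-square, and keep the members above the break-off threshold."""
    swapped = X.T                                    # Phase 1: tuples -> columns
    groups = GaussianMixture(n_components=n_clusters,
                             random_state=0).fit_predict(swapped)  # Phase 2
    cv, _ = chi2(X, y)                               # Phase 3: chi-square value CV
    selected = []
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        alpha, beta = cv[idx].max(), cv[idx].min()
        phi = H * (alpha - beta) / 2.0 + beta        # break-off threshold, eq. (4)
        selected.extend(idx[cv[idx] >= phi].tolist())
    return sorted(set(selected))                     # Phase 4: union of survivors
```

Columns of X whose χ2 score clears their own cluster's threshold survive; everything else is treated as redundant or irrelevant.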
Algorithm 1. FSULR

Input: Training dataset TD, tuple set T = {t1, t2, · · · , tl}, feature set F = {f1, f2, · · · , fm}, class labels C = {C1, C2, · · · , Cn}, and T, F, C ∈ TD
Output: Selected significant feature subset SSF = {f1, f2, · · · , fi} ∈ F
Method:
1) Begin
2) Swap(TD)
3) {Return → swapped TD = SD} // Swap the tuples – T (rows) into features – F (columns)
4) EM-Cluster(SD)
5) {Return n grouped feature sets GF = {GF1, GF2, GF3, · · · , GFn}} // GF ∈ F
6) for (L = 1, L ≤ n, L++)
7) {χ2(GFL, C)
8) {Return ranked feature members of grouped feature set RGFK, where K = 1, · · · , n, with χ2 value CV}} // Compute and rank the χ2 value for the grouped features
9) Threshold(CVMax, CVMin) // CVMax, CVMin are the maximum and minimum χ2 values, respectively
10) {Return the value "ϕ"} // ϕ is the break-off threshold value
11) for (M = 1, M ≤ n, M++)
12) {Break off Feature(RGFM, CVM, ϕ)
13) {Return BFSSK, where K = 1, · · · , n}} // BFSSK is the break-off feature subset based on CV
14) for (N = 1, N ≤ n, N++)
15) {Return SSF = Union(BFSSN)} // SSF is the selected significant feature subset
16) End.

Phase 1. In this phase, the FSULR algorithm receives the training dataset "TD" as input with the features F = {f1, f2, · · · , fx}, tuples T = {t1, t2, · · · , ty} and classes C = {c1, c2, · · · , cz}. Then the class C is removed. The function Swap(·) swaps the features "F" into tuples "T" and returns the swapped dataset "SD" for grouping the features "fi".

Phase 2. In this phase, the expectation maximization maximal likelihood function[33] and the Gaussian mixture model given in (1) and (2) are used to group the features by the function EM-Cluster(·). It receives the swapped dataset "SD" and groups it into the grouped feature sets GF = {GF1, GF2, GF3, · · · , GFn}.

$$\eta(b \mid D_z) = P(D_z \mid b) = \prod_{k=1}^{m} P(z_k \mid b) = \prod_{k=1}^{m} \sum_{l \in L} P(z_k \mid l, b)\, P(l \mid b) \tag{1}$$

where D denotes the dataset containing an unknown parameter b and the known product Z, and L is the unseen data. Expectation maximization is used as the distribution function, and it is derived as the Gaussian mixture model multivariate distribution P(z|l, b) expressed in (2):

$$P(z \mid l, b) = \frac{1}{(2\pi)^{\frac{p}{2}}\, |\Delta_l|^{\frac{1}{2}}}\; e^{-\frac{1}{2}(z-\lambda_l)^{\mathrm{T}} \Gamma_l^{-1} (z-\lambda_l)}. \tag{2}$$

In this scenario, a value α is randomly picked from the set "S" with instances "I" from the non-continuous set N = {1, 2, 3, · · · , n}. Let λl be the mean vector of the unseen variable and Γl (l = 1, 2, · · · , n) the covariance matrix with the discrete distribution P(l|b). Cross-validation computes the likelihood for each candidate value of n to determine the n Gaussian components used in the expectation maximization. The resulting Gaussian mixture model[34] is used for clustering the features based on the level of the Gaussian component possessed by the individual features "fi"[35], and it returns the n grouped feature subsets GF = {GF1, GF2, GF3, · · · , GFn}.
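As a concrete stand-in for the EM-Cluster(·) step, the following sketch uses scikit-learn's GaussianMixture, which is fitted by EM, in place of the paper's Weka-based clustering; the toy "swapped dataset" below is invented purely for illustration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy swapped dataset SD: 20 feature profiles (rows), each observed over 4 tuples.
rng = np.random.default_rng(1)
SD = np.vstack([rng.normal(0.0, 1.0, size=(10, 4)),
                rng.normal(5.0, 1.0, size=(10, 4))])

# Fit a 2-component Gaussian mixture by EM and read off the grouped feature sets.
gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(SD)
print("cluster of each feature:", gm.predict(SD))        # memberships of GF1, GF2
print("log mixture density    :", gm.score_samples(SD))  # log of the density in (1)-(2)
```

Features that end up in the same component have similar profiles and are therefore candidates for mutual redundancy, which is why FSULR keeps only the strongest members of each group.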
Phase 3. In this phase, the most significant features are selected from each clustered feature subset by the χ2(·) function. It receives the clustered feature set GFK together with the class feature C and computes the χ2 value "CV" for all the features as shown in (3)[16]:

$$CV = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(o_{ij} - e_{ij})^{2}}{e_{ij}} \tag{3}$$

where the feature "fi" contains the distinct tuple values tv1, tv2, · · · , tvn, the χ2 value "CV" is computed for the features fi and fj, oij is the observed frequency, and eij is the expected frequency of the possible tuple values. To identify the significant features, the features are ranked by their χ2 values "CV" and the ranked grouped feature set is obtained. The Threshold(·) function receives the CVMin and CVMax values and returns the break-off threshold value ϕ as shown in (4). The function Break off Feature(·) receives the ranked grouped feature set RGF together with ϕ, breaks off the top ranked features whose "CV" reaches the value ϕ, and returns them as the break-off feature subset BFSS.

$$\phi = H \times \frac{\alpha - \beta}{2} + \beta \tag{4}$$

where ϕ is the break-off threshold value, H is a threshold constant equal to 1, α is the maximum χ2 value (CVMax) of the features, and β is the minimum χ2 value (CVMin) of the features.

Phase 4. This phase combines the break-off feature subsets BFSS obtained from the clustered features by the function Union(·) and produces the selected significant feature subset SSF as the candidate selected features.

4 Experimental setup

In total, 21 datasets are used to conduct the experiments, with the number of features ranging from 2 to 684, the number of instances from 5 to 1728 and the number of class labels from 2 to 24, as listed in Table 1. These datasets are taken from the Weka software[36] and the UCI repository[37]. The performance of the proposed FSULR algorithm is analyzed and compared with the other feature selection algorithms FSCor, FSChi, FSCon, FSGai-ra, ReliefF, FSUnc and FSInfo using the NB, J48 and IB1 classifiers.

Table 1  Datasets

Serial number  Dataset             Number of features  Number of instances  Number of classes
 1             Breast cancer         9                   286                  2
 2             Contact lenses       24                     5                  5
 3             Credit-g             20                  1000                  2
 4             Diabetes              8                   768                  2
 5             Glass                 9                   214                  7
 6             Ionosphere           34                   351                  2
 7             Iris 2D               2                   150                  3
 8             Iris                  4                   150                  3
 9             Labor                16                    57                  2
10             Segment challenge    19                  1500                  7
11             Soybean             683                    36                 14
12             Vote                435                    17                  3
13             Weather nominal       4                    14                  2
14             Weather numeric      14                     5                  2
15             Audiology            69                   226                 24
16             Car                   6                  1728                  4
17             Cylinder bands       39                   540                  2
18             Dermatology          34                   366                  3
19             Ecoli                 7                   336                  8
20             Anneal               39                   898                  6
21             Hay-train            10                   373                  9

5 Experimental procedure, results and discussion

The experiment is conducted on the 21 well known publicly available datasets listed in Table 1 with the 7 feature selection algorithms FSCor, FSChi, FSCon, FSGai-ra, ReliefF, FSUnc and FSInfo. Initially, the datasets are fed as input to the feature selection algorithms and the selected features are obtained as their output. These selected features are then given to the NB, J48 and IB1 classifiers, and the predictive accuracy and the time taken to build the predictive model are computed under the 10-fold cross-validation test mode.
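The evaluation loop can be pictured with the following sketch, which is illustrative rather than the authors' Weka workflow: GaussianNB, DecisionTreeClassifier and a 1-nearest-neighbor classifier are rough stand-ins for NB, J48 and IB1, and `selected` is assumed to be the column indices returned by a feature selection step such as the fsulr_sketch function shown earlier:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
selected = [2, 3]                      # e.g., the output of a feature selection step

learners = {
    "NB":  GaussianNB(),
    "J48": DecisionTreeClassifier(random_state=0),
    "IB1": KNeighborsClassifier(n_neighbors=1),
}
for name, clf in learners.items():
    # 10-fold cross-validated accuracy on the reduced feature set.
    acc = cross_val_score(clf, X[:, selected], y, cv=10).mean()
    print(f"{name}: {acc:.3f}")
```

Accuracy, model-building time and the number of retained features are the three quantities compared below.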
The performance of the proposed FSULR algorithm is analyzed in terms of predictive accuracy, time taken to build the model and the number of features reduced. Fig. 1 shows the overall performance of the feature selection algorithms in terms of the predictive accuracy produced for the supervised learners; it is observed that FSULR performs better than all the other feature selection algorithms compared, with FSCon and FSInfo in the second and third positions, respectively. Fig. 2 shows that the proposed FSULR achieves better accuracy than the other algorithms compared for the NB and IB1 classifiers, while FSCon achieves the best result for the J48 classifier.

Fig. 1  Comparison of overall prediction accuracy in percentage with respect to the feature selection algorithm

Fig. 2  Comparison of prediction accuracy in percentage with respect to the feature selection algorithm and classifier

From Fig. 3, it is evident that FSULR takes more time to build the predictive model than all the other algorithms. From Fig. 4, it is observed that FSGai-ra and FSUnc require less time than the other algorithms to build the model for the NB classifier, FSUnc takes the least time to build the model for the J48 classifier, and FSGai-ra and ReliefF consume the least time to build the predictive model for the IB1 classifier. Fig. 5 shows that ReliefF reduces the number of features more significantly than the other algorithms compared, while FSCon reduces the least number of features.

Fig. 3  Comparison of overall time taken to build the predictive model (in seconds) with respect to the feature selection algorithm

Fig. 4  Comparison of time taken to build the predictive model (in seconds) with respect to the feature selection algorithm and classifier

Fig. 5  Comparison of the number of features reduced with respect to the feature selection algorithm

6 Conclusions and future work

This paper proposed a feature selection algorithm, namely the unsupervised learning with ranking based feature selection algorithm (FSULR). The performance of this algorithm is analyzed in terms of the accuracy produced for the classifiers, the time taken to build the predictive model and the feature reduction.
The performance of this algorithm is compared with the other feature selection algorithms, namely the correlation based FSCor, χ2 based FSChi, consistency based FSCon, gain ratio based FSGai-ra, ReliefF, symmetric uncertainty based FSUnc and information gain based FSInfo algorithms, with the NB, J48 and IB1 classifiers. FSULR achieves better prediction accuracy than the other feature selection algorithms, and in particular higher accuracy for the IB1 and NB classifiers. FSULR is also considerably good in reducing the time to build the model for IB1 compared to FSGai-ra and ReliefF, and in reducing the time to build the model for the J48 classifier compared to the FSCon algorithm. FSULR reduces the number of features compared to FSCor, FSCon and FSGai-ra. In future, this work can be extended with other statistical measures for ranking the features within the clusters.

References

[1] S. J. Pan, Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2010.

[2] M. R. Rashedur, R. M. Fazle. Using and comparing different decision tree classification techniques for mining ICDDR,B hospital surveillance data. Expert Systems with Applications, vol. 38, no. 9, pp. 11421–11436, 2011.

[3] M. Wasikowski, X. W. Chen. Combating the small sample class imbalance problem using feature selection. IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1388–1400, 2010.

[4] Q. B. Song, J. J. Ni, G. T. Wang. A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 1, pp. 1–14, 2013.

[5] J. F. Artur, A. T. Mário. Efficient feature selection filters for high-dimensional data. Pattern Recognition Letters, vol. 33, no. 13, pp. 1794–1804, 2012.

[6] J. Wu, L. Chen, Y. P. Feng, Z. B. Zheng, M. C. Zhou, Z. H. Wu. Predicting quality of service for selection by neighborhood-based collaborative filtering. IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 43, no. 2, pp. 428–439, 2012.

[7] C. P. Hou, F. P. Nie, X. Li, D. Yi, Y. Wu. Joint embedding learning and sparse regression: A framework for unsupervised feature selection. IEEE Transactions on Cybernetics, vol. 44, no. 6, pp. 793–804, 2014.

[8] P. Bermejo, L. de la Ossa, J. A. Gámez, J. M. Puerta. Fast wrapper feature subset selection in high-dimensional datasets by means of filter re-ranking. Knowledge-Based Systems, vol. 25, no. 1, pp. 35–44, 2012.

[9] S. Atulji, G. Shameek, V. K. Jayaraman. Hybrid biogeography based simultaneous feature selection and MHC class I peptide binding prediction using support vector machines and random forests. Journal of Immunological Methods, vol. 387, no. 1–2, pp. 284–292, 2013.

[10] H. L. Wei, S. Billings. Feature subset selection and ranking for data dimensionality reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 1, pp. 162–166, 2007.

[11] X. V. Nguyen, B. James. Comments on supervised feature selection by clustering using conditional mutual information-based distances. Pattern Recognition, vol. 46, no. 4, pp. 1220–1225, 2013.

[12] A. G. Iffat, S. S. Leslie. Feature subset selection in large dimensionality domains. Pattern Recognition, vol. 43, no. 1, pp. 5–13, 2010.
[13] M. Hall. Correlation-based Feature Selection for Machine Learning, Ph.D. dissertation, The University of Waikato, New Zealand, 1999.

[14] L. Yu, H. Liu. Efficient feature selection via analysis of relevance and redundancy. Journal of Machine Learning Research, vol. 5, pp. 1205–1224, 2004.

[15] M. Dash, H. Liu, H. Motoda. Consistency based feature selection. In Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, pp. 98–109, 2000.

[16] H. Peng, F. Long, C. Ding. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226–1238, 2005.

[17] H. Uğuz. A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm. Knowledge-Based Systems, vol. 24, no. 7, pp. 1024–1032, 2011.

[18] M. Robnik-Šikonja, I. Kononenko. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning, vol. 53, no. 1–2, pp. 23–69, 2003.

[19] Y. Sun, S. Todorovic, S. Goodison. Local-learning-based feature selection for high-dimensional data analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1610–1626, 2010.

[20] W. Peng, S. Cesar, S. Edward. Prediction based on integration of decisional DNA and a feature selection algorithm RELIEF-F. Cybernetics and Systems, vol. 44, no. 3, pp. 173–183, 2013.
[21] H. W. Liu, J. G. Sun, L. Liu, H. J. Zhang. Feature selection with dynamic mutual information. Pattern Recognition, vol. 42, no. 7, pp. 1330–1339, 2009.

[22] M. C. Lee. Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Systems with Applications, vol. 36, no. 8, pp. 10896–10904, 2009.

[23] P. Mitra, C. A. Murthy, S. K. Pal. Unsupervised feature selection using feature similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 301–312, 2002.

[24] J. Handl, J. Knowles. Feature subset selection in unsupervised learning via multiobjective optimization. International Journal of Computational Intelligence Research, vol. 2, no. 3, pp. 217–238, 2006.

[25] H. Liu, L. Yu. Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[26] S. García, J. Luengo, J. A. Sáez, V. López, F. Herrera. A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning. IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 4, pp. 734–750, 2013.

[27] S. A. A. Balamurugan, R. Rajaram. Effective and efficient feature selection for large-scale data using Bayes theorem. International Journal of Automation and Computing, vol. 6, no. 1, pp. 62–71, 2009.

[28] J. A. Mangai, V. S. Kumar, S. A. alias Balamurugan. A novel feature selection framework for automatic web page classification. International Journal of Automation and Computing, vol. 9, no. 4, pp. 442–448, 2012.

[29] H. J. Huang, C. N. Hsu. Bayesian classification for data from the same unknown class. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 32, no. 2, pp. 137–145, 2002.

[30] S. Ruggieri. Efficient C4.5. IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 2, pp. 438–444, 2002.

[31] P. Kemal, G. Salih. A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems. Expert Systems with Applications, vol. 36, no. 2, pp. 1587–1592, 2009.

[32] W. W. Cheng, E. Hüllermeier. Combining instance-based learning and logistic regression for multilabel classification. Machine Learning, vol. 76, no. 3, pp. 211–225, 2009.

[33] J. H. Hsiu, P. Saumyadipta, I. L. Tsung. Maximum likelihood inference for mixtures of skew Student-t-normal distributions through practical EM-type algorithms. Statistics and Computing, vol. 22, no. 1, pp. 287–299, 2012.

[34] M. H. C. Law, M. A. T. Figueiredo, A. K. Jain. Simultaneous feature selection and clustering using mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1154–1166, 2004.

[35] T. W. Lee, M. S. Lewicki, T. J. Sejnowski. ICA mixture models for unsupervised classification of non-Gaussian classes and automatic context switching in blind signal separation. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1078–1089, 2002.

[36] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann. The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.

[37] K. Bache, M. Lichman. UCI Machine Learning Repository, [Online], Available: http://archive.ics.uci.edu/ml, Irvine, CA: University of California, School of Information and Computer Science, 2013.
Danasingh Asir Antony Gnana Singh received the M. Eng. and B. Eng. degrees from Anna University, India. He is currently working as a teaching fellow in the Department of Computer Science and Engineering, Bharathidasan Institute of Technology, Anna University, India. His research interests include data mining, wireless networks, parallel computing, mobile computing, computer networks, image processing, software engineering, soft computing, teaching-learning process and engineering education.
E-mail: asirantony@gmail.com (Corresponding author)
ORCID iD: 0000-0002-4913-2886

Subramanian Appavu Alias Balamurugan received the Ph.D. degree from Anna University, India. He is currently working as a professor and the head of the Department of Information Technology, K.L.N College of Information Technology, India. His research interests include pattern recognition, data mining and informatics.
E-mail: app s@yahoo.com

Epiphany Jebamalar Leavline received the M. Eng. and B. Eng. degrees from Anna University, India, and the MBA degree from Alagappa University, India. She is currently working as an assistant professor in the Department of Electronics and Communication Engineering, Bharathidasan Institute of Technology, Anna University, India. Her research interests include image processing, signal processing, VLSI design, data mining, teaching-learning process and engineering education.
E-mail: jebi.lee@gmail.com