International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print),
ISSN 0976 - 6375(Online), Volume 5, Issue 4, April (2014), pp. 138-149 © IAEME
PERFORMANCE ANALYSIS OF LEARNING AND CLASSIFICATION ALGORITHMS

Shravan Vishwanathan¹, Thirunavukkarasu K²
¹Student, M.Tech (CSE), Galgotias University, Gautam Buddh Nagar, Uttar Pradesh, India
²Asst. Professor, Galgotias University, Gautam Buddh Nagar, Uttar Pradesh, India
ABSTRACT
There are different learning and classification algorithms that learn patterns and categorize data according to the frequency of and relationships between attributes in a given dataset. The desired result has always been higher accuracy in predicting future values or events from the given dataset. These algorithms are crucial in the field of knowledge discovery and data mining and have been extensively researched and improved for accuracy.
In this paper we first survey the literature on common learning algorithms. We then analyse and compare popular learning algorithms such as Neural Net and SVM, along with classification algorithms such as Naïve Bayes, BFT and Decision Stump. The analysis is based on precision and accuracy in the prediction and classification of data. The comparison shows how these algorithms perform under certain conditions.
Keywords: BFT, Classification algorithms, Data mining, Decision Stump, Learning algorithms,
Naïve Bayes, Neural Net, SVM.
1. INTRODUCTION
The branch of study concerned with systems that can learn from data is commonly termed machine learning. It is a subdivision of artificial intelligence and plays a vital role in data mining. Machine learning involves two concerns, representation and generalization. Representation refers to defining the instances and the functions evaluated on them. Generalization, on the other hand, is the accuracy with which a machine performs a function on unseen data after having learned from a training data set. For a machine to learn from a given data set, it has to be trained for the task. A learning model is developed based on one or more algorithms; the model is given a training dataset to learn from and an original data set on which to perform the actual function. Depending on the probability distribution and frequency of the data, it can then be classified and predicted.
Databases, on the other hand, have become more complex with the rise of XML trees, spatial data and GPS temporal data. Adaptive techniques should be used to effectively process and learn from these types of data sets. However, the question arises of the precision and accuracy of the prediction and classification: the performance of learning algorithms varies even on the same dataset and under similar configurations.
In this paper we use a single dataset as input to compare the performance of learning algorithms in the same category. The results contrast to a certain extent, which gives insight into the effectiveness of the learning techniques under a preferred configuration. Learning techniques often use the concept of a data window, which limits the amount of data to be processed based on different characteristics. The sensitive part is knowing how much data to process in order to produce optimal results. Data windows come in four types: fixed sliding window, adaptive window, landmark window and damped window. The preferred data window is the adaptive window, which resizes dynamically on incoming input data.
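As a rough illustration of the idea (a simplified sketch, not any specific published algorithm), an adaptive window can be modelled as a buffer that shrinks when the older and newer halves of the buffered data start to disagree; the threshold and values below are made-up:

```python
# Simplified sketch of an adaptive data window: the window grows as data
# arrives and drops its older half when the older and newer halves of the
# window disagree by more than a threshold (i.e. the distribution changed).

class AdaptiveWindow:
    def __init__(self, threshold=1.0, max_size=64):
        self.threshold = threshold   # allowed gap between half-window means
        self.max_size = max_size
        self.items = []

    def add(self, value):
        self.items.append(value)
        if len(self.items) > self.max_size:
            self.items.pop(0)
        self._shrink_if_drifting()

    def _shrink_if_drifting(self):
        n = len(self.items)
        if n < 4:
            return
        old, new = self.items[: n // 2], self.items[n // 2:]
        mean_old = sum(old) / len(old)
        mean_new = sum(new) / len(new)
        if abs(mean_old - mean_new) > self.threshold:
            # Distribution changed: keep only the recent half.
            self.items = new

w = AdaptiveWindow(threshold=0.5)
for x in [1.0] * 8 + [5.0] * 8:   # abrupt shift from 1.0 to 5.0
    w.add(x)
print(len(w.items), w.items)      # window shrank and holds only recent data
```

After the abrupt shift the window discards the pre-shift values, so later processing sees only data from the current distribution.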
A neural network is a powerful model for processing and representing complex input-output relations. Its major ability is to represent both linear and non-linear relationships in the modelled data. The neural net model is first trained on a set of data, after which it is capable of performing classification, prediction and analysis.
The Artificial Neural Network (ANN) [1] is a field of study that gains knowledge by mapping and learning data in a manner similar to the human brain. The network is interconnected through nodes known as neurons. The performance of an ANN depends on the type of network and the number of neurons present in it; the fewer the neurons, the higher the performance of the system.
The Support Vector Machine (SVM) [2] has also gained major importance in learning theory because of its ability to represent non-linear relationships efficiently. It is grounded in statistical learning theory. SVM maps the data Y into a feature space M of high dimensionality. The technique can be applied to both classification and regression: in classification, an efficient hyperplane is found that separates the data into two classes, whereas in regression a hyperplane is constructed that is close to the maximum number of data points.
We have taken the problem of weather forecasting to compare Neural Net and SVM. The challenge behind weather forecasting [3] is the unpredictability and dynamic change of atmospheric temperature: there is no particular pattern that can be modelled to predict the temperature from historic data. However, multiple experiments have shown that neural nets and SVMs [4], [5] are capable of modelling and capturing the complex non-linear relationships that contribute to a particular temperature.
The Naïve Bayes classifier [6] is based on Bayes' theorem. It simplifies the learning task by assuming that the features in a particular class are independent. Although this is generally not an optimal assumption for classification, Naïve Bayes proves to be a competent algorithm compared to more sophisticated classification techniques. It has been successful in applications such as text classification, system performance management and medical diagnosis.
The Best-First decision Tree (BFT) [7] expands nodes in best-first order rather than in a fixed order.
A Decision Stump [8] is a one-level decision tree with only a single split: there is only one internal node, which is connected directly to the leaf nodes. A prediction is therefore made based on a single feature of the data set.
In this paper we compare and analyse classification and learning algorithms in Rapidminer 5. Rapidminer [9] is a powerful open-source tool for data mining, analysis and simulation. It provides the user with a rapid application development environment and appropriate data visualizations.
2. ANN FRAMEWORK
A feed-forward neural network trained with the back-propagation algorithm [10] is generally used for prediction and classification. The back-propagation algorithm is used for simple pattern recognition and mapping. Consider a hidden-layer neuron X, an output neuron Y and the connection weight WXY between them; X is also connected to a second output neuron Z through weight WXZ. The algorithm functions as follows:
1. With the initial weights set to random values, provide the system with appropriate inputs and compute the output.
2. Calculate the error for output neuron Y:
   Error Y = Output Y (1 - Output Y) (Target Y - Output Y)
3. Adjust the weight. Let W+XY be the trained weight and WXY the initial weight:
   W+XY = WXY + (Error Y × Output X)
4. Calculate the errors for the hidden-layer neurons by back-propagating from the output layer: take the errors of the output neurons and run them back through the weights to obtain the hidden-layer errors:
   Error X = Output X (1 - Output X) (Error Y WXY + Error Z WXZ)
5. After calculating the errors of the hidden-layer neurons, go to step 3 and repeat the process in order to train the network.
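The update steps above can be sketched in Python for the tiny network in the text (hidden neuron X feeding output neurons Y and Z); the activations, targets and initial weights are made-up illustration numbers:

```python
# One back-propagation step for a hidden neuron X feeding outputs Y and Z,
# following the formulas in the text.  All numbers are illustrative.

def output_error(out, target):
    # Error_Y = Output_Y * (1 - Output_Y) * (Target_Y - Output_Y)
    return out * (1 - out) * (target - out)

def hidden_error(out, downstream):
    # Error_X = Output_X * (1 - Output_X) * sum(Error_k * W_Xk)
    return out * (1 - out) * sum(err * w for err, w in downstream)

out_X, out_Y, out_Z = 0.6, 0.7, 0.4     # activations from a forward pass
target_Y, target_Z = 1.0, 0.0
W_XY, W_XZ = 0.5, -0.3                  # random initial weights (step 1)

err_Y = output_error(out_Y, target_Y)   # step 2
err_Z = output_error(out_Z, target_Z)

W_XY_new = W_XY + err_Y * out_X         # step 3: W+_XY = W_XY + Error_Y * Output_X
W_XZ_new = W_XZ + err_Z * out_X

# step 4: back-propagate through the *original* weights
err_X = hidden_error(out_X, [(err_Y, W_XY), (err_Z, W_XZ)])

print(err_Y, W_XY_new, err_X)
```

In a full training loop, steps 3 and 4 would be repeated over many examples and cycles until the output errors fall below a tolerance.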
3. SUPPORT VECTOR MACHINE (SVM)
A standard SVM [11] takes input data and predicts which of two classes the input belongs to. Support vectors are the essential training tuples, and the margins are defined by these support vectors; this makes the SVM a non-probabilistic binary linear classifier. It uses non-linear mapping to transform the training data points into a higher-dimensional feature space, in which it constructs hyperplanes that can be used for classification. Classification is optimal when the distance between the hyperplane and the closest training data points is largest, because this gives the classifier a low generalization error.
Once the support vector machine has been trained, data are classified on the basis of the Lagrangian formulation. The maximum-margin hyperplane boundary is given by

d(AT) = Σ(i=1..l) yi αi Ai · AT + c0

where yi is the class label of the support vector Ai, AT is the test tuple, αi (the Lagrangian multiplier) and c0 are numeric parameters determined by the SVM algorithm, and l is the number of support vectors.
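The decision function above can be evaluated directly once the support vectors, labels, multipliers and c0 are known; the numbers below are made-up illustration values rather than the result of actual training:

```python
# Evaluates d(A_T) = sum_i y_i * alpha_i * (A_i . A_T) + c0 for a test tuple.
# Support vectors, labels, alphas and c0 here are illustrative; in practice
# they are produced by the SVM training procedure.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def svm_decision(test_tuple, support_vectors, labels, alphas, c0):
    return sum(y * a * dot(sv, test_tuple)
               for sv, y, a in zip(support_vectors, labels, alphas)) + c0

support_vectors = [(1.0, 2.0), (2.0, 1.0), (-1.0, -1.0)]
labels = [1, 1, -1]          # class labels y_i
alphas = [0.4, 0.2, 0.6]     # Lagrangian multipliers alpha_i
c0 = -0.1

score = svm_decision((1.0, 1.0), support_vectors, labels, alphas, c0)
predicted_class = 1 if score >= 0 else -1
print(score, predicted_class)
```

The sign of the score decides the class; its magnitude reflects how far the test tuple lies from the separating hyperplane.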
4. NAÏVE BAYES CLASSIFIER
Naïve Bayes [12] is a simple probabilistic classifier based on Bayes' theorem. It assumes independent features of the data set: each feature in a class is independent of the presence or absence of any other feature. The main advantage of this classifier is that it requires only a small amount of training data to estimate the means and variances of the variables needed for classification. Because the features of a class are assumed independent, only the variances per label need to be determined rather than the entire covariance matrix.
Let A be a data tuple and S be the hypothesis that tuple A belongs to a class C. To classify, P(S|A) has to be determined:

P(S|A) = P(A|S) P(S) / P(A)
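A minimal numeric sketch of this rule, with made-up probabilities:

```python
# Bayes' rule: P(S|A) = P(A|S) * P(S) / P(A).
# The probabilities below are illustrative placeholders.

def posterior(p_a_given_s, p_s, p_a):
    return p_a_given_s * p_s / p_a

p_s = 9 / 14          # prior: fraction of training tuples in the class
p_a_given_s = 0.2     # likelihood of observing tuple A within the class
p_a = 0.25            # overall probability of observing tuple A

p_s_given_a = posterior(p_a_given_s, p_s, p_a)
print(round(p_s_given_a, 4))
```

Under the naïve independence assumption, P(A|S) itself factors into a product of per-feature likelihoods, which is what makes the classifier cheap to train.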
5. DECISION STUMP AND BEST-FIRST TREE
A decision tree is a flowchart-like branched structure. Each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label.
Given a tuple T with an unknown class label, its attribute values are tested against the decision tree; the path traced from the root node to a leaf node yields the prediction for the tuple. The popularity of decision trees rests on the fact that tree classifiers can be constructed without any domain knowledge or significant parameter setting.
A decision stump is a one-level decision tree with a single split: one internal node and two leaf nodes, so the prediction is based on a single input feature. Although it is a weak learning technique, it proves to be a fast classification technique when used with boosting [13].
BFT works like a standard decision tree under depth-first expansion, but the technique introduces new tree-pruning methods using cross-validation. A best-first order is used for tree construction rather than a predefined fixed order: the best node is the one that maximally reduces impurity among the nodes available for splitting (i.e. non-labelled leaf nodes).
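A decision stump of this kind can be sketched as an exhaustive search for the single best threshold split on one feature; the toy labelled dataset below is made-up:

```python
# Decision stump sketch: one split on one numeric feature, chosen to
# minimise misclassifications on a toy labelled dataset.

def train_stump(xs, ys):
    """Return (threshold, left_label, right_label) for the best single split."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            preds = [left if x < t else right for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]

xs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]   # single feature
ys = [0, 0, 0, 1, 1, 1]                # class labels
threshold, left_label, right_label = train_stump(xs, ys)

def predict(x):
    return left_label if x < threshold else right_label

print(threshold, predict(2.5), predict(8.5))
```

Each stump on its own is a weak learner; boosting combines many such stumps, each trained on a reweighted view of the data, into a strong classifier.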
6. EXPERIMENT AND RESULT
We performed the analysis, simulation and forecast performance evaluation of the learning algorithms in Rapidminer 5.0 [14]. It is an efficient open-source tool for modelling and validating various classification, learning and rule-based algorithms. Its rapid application development environment and graphical flow-design interface make modelling and simulation easier; the internal processes and functions are also described in XML.
We first compare the performance of Neural Net and SVM. The dataset considered consists of the average annual and monthly temperatures of India from 1901 to 2012, downloaded from the Delhi Meteorological Department website [15].
There are six attributes in the dataset: YEAR, ANNUAL, JAN-FEB, MAR-MAY, JUN-SEP and OCT-DEC. ANNUAL contains the average annual temperature; JAN-FEB, MAR-MAY, JUN-SEP and OCT-DEC contain the average temperatures for those months.
6.1 Neural Net Learning model
The following is the neural net model designed in Rapidminer
Fig. 1: Model design for Neural Net training in Rapidminer

The input to the model is given in XLS format, and a training dataset is supplied to the network so that the system can learn from the given data.
The Set Role operator changes the role of one or more attributes; in this case we assigned the YEAR attribute as an ID and ANNUAL as the label. The Windowing operator transforms a set of examples into a new single-valued example set. Its parameters are the series representation, window size and step size. The series representation is set to the default value (encode series by example). The window size gives the width of the window, i.e. the number of examples considered at a time, and is set to 2. The step size is the gap between the first values of consecutive windows and is set to 1 (the default).
Next, the sliding window validation operator encapsulates the training and test window examples in order to estimate the performance of a prediction operator. Its parameters are the training window width, training window step size, test window width and horizon. The training window width and test window width are set to 7, and the training window step size is set to 1. The horizon gives the increment from the last training example to the predicted example; a horizon of 1 means the prediction of the next example, i.e. the prediction of the next average temperature.
The validation operator consists of two phases, training and testing. The training phase contains the Neural Net operator, which learns from the data examples; the testing phase applies the model and reports the average performance of the system.
The learning rate of the neural net is tuned to 0.8 and the system was trained for 500 cycles. The following graph compares the predicted and the observed annual temperature.

Fig. 2: Comparison graph of the temperature predicted by the Neural Net operator and the original annual temperature
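The windowing scheme described above (window size 2, step size 1, horizon 1) can be sketched in Python; the temperature values here are made-up illustration numbers, not the actual dataset:

```python
# Sketch of what the Windowing operator does: each generated example holds
# the last `window_size` series values as attributes, and the value
# `horizon` steps ahead as the label.

def window_examples(series, window_size=2, step=1, horizon=1):
    examples = []
    for start in range(0, len(series) - window_size - horizon + 1, step):
        attributes = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        examples.append((attributes, label))
    return examples

annual_temps = [24.2, 24.5, 24.1, 24.8, 25.0]   # made-up annual temperatures
for attrs, label in window_examples(annual_temps):
    print(attrs, "->", label)
```

Each windowed example turns the forecasting task into ordinary supervised learning: predict the label (next temperature) from the lagged attribute values.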
There were some changes in the forecasting performance of the neural net model when we tuned the learning rate of the Neural Net operator.

Learning rate    Forecasting performance (%)
0.3 (default)    67.9
0.4              67.7
0.5              67.5
0.6              68.4
0.7              68.2
0.8              68.7

Table 1: Learning rate and the corresponding forecasting performance of the Neural Net model

The observed forecasting performance followed a roughly linear trend on changing the learning rate: it decreased minutely from 0.3 to 0.5 but increased by a certain factor thereafter. The graph below shows the performance variation with respect to the learning rate.

Fig. 3: Learning rate graph for the Neural Net model

6.2 SVM learning model
SVM, initially introduced by Vapnik, is partitioned into two categories: Support Vector Classification (SVC) and Support Vector Regression (SVR). To obtain prediction and pattern-recognition functions from the support vectors we use Support Vector Regression. The main aim is to minimize the generalization error bound in order to achieve generalized performance. To understand the learning mechanism in detail and its origin, one can refer to [16].
The overall model of the SVM learner is similar to the one depicted in Fig. 1, but the validation phase holds the SVM regression operator instead of the Neural Net operator. The setup of the general model is the same as discussed earlier in Section 6. The SVM operator is, however, tuned on certain parameters. The kernel function of the operator, which maps the data into a higher-dimensional space, is set to the dot type; the dot kernel is defined by the inner product of the support vectors, i.e. k(x, y) = x·y. The SVM complexity constant (C constant) sets the tolerance level for misclassification: higher values lead to softer boundaries and lower values to harder boundaries. The parameter has to be set carefully, since values above the optimum level can result in over-generalization of the data points.
The C constant has been set to a value of 0.1 from the default value of 0.0. We observed that on increasing the C constant the forecasting performance slightly reduced due to over-generalization; for values of C above 0.2 the forecasting performance declined slightly but followed a linear trend.

C constant    Forecasting performance (%)
0.1           71.5
0.2           69.1
0.3           69.1
0.4           68.9
0.5           68.7
0.6           68.4

Table 2: C constant and the forecasting performance of the SVM model

Fig. 4: Forecasting performance graph with respect to the C constant in the SVM model

On execution of the model with the C constant tuned to 0.1, the forecasting performance was observed to be 71.5%, which is higher than the highest performance of the Neural Net operator. The following trend was observed for the SVR prediction function.

Fig. 5: Comparison graph between the temperature predicted by SVM and the actual annual temperature
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976
ISSN 0976 - 6375(Online), Volume 5, Issue 4, April (2014), pp.
However, the prediction may not be very accurate in determining the average temperature
due to lack of a proper pattern and unpredictability of temperature change. But the mapping of non
linear data into a higher dimensional space and the ability to determine patterns is more proficient in
SVM than Neural Net.
6.2 Naïve Bayes model
For the comparison of classification model we have selected the default dataset available in
Rapidminer i.e Golf dataset. The dataset is a sample dataset which can be used for Boolean
classification.
The dataset consists of 5 attributes Play, Outlook, Temp
attributes Play and Wind consist of Boolean value (True and Flase).Outlook contains 3 values
overcast, rain and sunny. Temperature and Humidity attributes consist of real values. According to
the dataset, if the day is windy there will be no game that particular day (Play=No if Wind=True).
The attributes which are to be classified are first selected from the dataset. In this case the attributes
Wind and Outlook are selected in order to identify and classify the trend. The algori
Bayes processes and calculates the probability for all labels but it selects only maximum valued label
from the dataset. For instance the calculation for label= yes would be
probability of label = yes (i.e. 9/14) value fr
yes (i.e. 0.331) value from distribution table when Wind = true and label = yes (i.e. 0.333) Thus the
answer = 9/14*0.331*0.333 = 0.071.
The Naïve Bayes model design is depicted below
Fig. 6: Naïve
The Select Attribute operator selects just Outlook and Wind attributes. The I Bayes operator
processes it and the resulting model is applied on the ‘Golf
after the I Bayes operator. In the given dataset 9 out of 14 examples of the training set have label =
yes, thus the posterior probability of the label = yes is 9/14. Similarly the posterior probability of the
label = no is 5/14. However, in the testing set, the attributes of t
and Wind = false. The following is the performance vector for the Naïve Bayes operator. Here the
confusion matrix [17] depicts the actual and predicted classification done by the classification
operator.
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976
6375(Online), Volume 5, Issue 4, April (2014), pp. 138-149 © IAEME
145
However, the prediction may not be very accurate in determining the average temperature
lack of a proper pattern and unpredictability of temperature change. But the mapping of non
linear data into a higher dimensional space and the ability to determine patterns is more proficient in
For the comparison of classification model we have selected the default dataset available in
Rapidminer i.e Golf dataset. The dataset is a sample dataset which can be used for Boolean
The dataset consists of 5 attributes Play, Outlook, Temperature, Humidity, Wind. The
attributes Play and Wind consist of Boolean value (True and Flase).Outlook contains 3 values
overcast, rain and sunny. Temperature and Humidity attributes consist of real values. According to
ere will be no game that particular day (Play=No if Wind=True).
The attributes which are to be classified are first selected from the dataset. In this case the attributes
Wind and Outlook are selected in order to identify and classify the trend. The algori
Bayes processes and calculates the probability for all labels but it selects only maximum valued label
from the dataset. For instance the calculation for label= yes would be following: posterior
probability of label = yes (i.e. 9/14) value from distribution table when Outlook = rain and label =
yes (i.e. 0.331) value from distribution table when Wind = true and label = yes (i.e. 0.333) Thus the
answer = 9/14*0.331*0.333 = 0.071.
The Naïve Bayes model design is depicted below
Naïve Bayes process model in Rapidminer.
The Select Attribute operator selects just Outlook and Wind attributes. The I Bayes operator
processes it and the resulting model is applied on the ‘Golf-testset’ data set. A breakpoint is inserted
ator. In the given dataset 9 out of 14 examples of the training set have label =
yes, thus the posterior probability of the label = yes is 9/14. Similarly the posterior probability of the
label = no is 5/14. However, in the testing set, the attributes of the first example are Outlook = sunny
and Wind = false. The following is the performance vector for the Naïve Bayes operator. Here the
confusion matrix [17] depicts the actual and predicted classification done by the classification
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print),
Fig. 8: Performance vector statistics for Naïve Bayes process model.
The calculation process for the accuracy, recall, confusion matrix and AUC (optimistic and pessimistic) can be referred from [17]. Overall, Naïve Bayes is an easy algorithm to implement. It performs surprisingly well despite its assumption that the attributes are independent for a given class. Moreover, the performance of the algorithm is appreciable even when the training data is small.
6.3 Best First decision Tree model (BFT)
The design of the BFT model is similar to the Naïve Bayes model. The input dataset is the earlier used Golf dataset.
Fig. 7: BFT process model in RapidMiner.
The BFT operator is a class for building a best-first decision tree classifier. This class uses binary split for both nominal and numeric attributes. For missing values, the fractional instance method is used.
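The "best-first" order can be illustrated briefly: at each step the leaf whose best binary split yields the largest impurity reduction is expanded first, in contrast to the fixed depth-first order of classical decision-tree learners. A simplified sketch for a single numeric attribute with Gini impurity (an illustration of the idea, not the exact BFT operator):

```python
import heapq

def gini(labels):
    """Gini impurity of a list of binary labels (0/1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(xs, ys):
    """Best threshold on one numeric feature: returns (gain, threshold)."""
    best_gain, best_t = 0.0, None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if gini(ys) - child > best_gain:
            best_gain, best_t = gini(ys) - child, t
    return best_gain, best_t

def best_first_splits(xs, ys, max_splits):
    """Expand leaves in order of decreasing gain; return the chosen thresholds."""
    heap, tie, chosen = [(-best_split(xs, ys)[0], 0, xs, ys)], 1, []
    while heap and len(chosen) < max_splits:
        _, _, fx, fy = heapq.heappop(heap)       # leaf with the highest gain
        gain, t = best_split(fx, fy)
        if t is None or gain <= 0:
            continue
        chosen.append(t)
        for keep in (lambda x: x <= t, lambda x: x > t):
            px = [x for x in fx if keep(x)]
            py = [y for x, y in zip(fx, fy) if keep(x)]
            heapq.heappush(heap, (-best_split(px, py)[0], tie, px, py))
            tie += 1
    return chosen

print(best_first_splits([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1], max_splits=2))
```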
Fig. 9: Performance vector statistics for the Best-First decision tree process model.
6.4 Decision stump model
Our next experiment is to compare the decision stump. A decision stump is basically a one-level decision tree. It consists of a single node and two leaves. The prediction is made on the basis of just a single input feature.
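Such a one-level tree can be sketched in a few lines; the feature values and the threshold below are invented for illustration:

```python
# A decision stump: one threshold on a single feature, majority label per side.
def fit_stump(xs, ys, threshold):
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    majority = lambda side: max(set(side), key=side.count)
    return majority(left), majority(right)

def predict(stump, threshold, x):
    left_label, right_label = stump
    return left_label if x <= threshold else right_label

# e.g. separating 'bad'/'good' contracts on a single wage feature (values invented)
stump = fit_stump([2.0, 2.5, 4.0, 4.5], ["bad", "bad", "good", "good"], threshold=3.0)
print(predict(stump, 3.0, 4.2))  # good
```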
In this experiment we take a sample dataset from RapidMiner called Labour-Negotiations. The dataset consists of 17 attributes with some missing values. The decision tree splits the table into two classes, good and bad, according to the wage, long-term disability assistance, working hours and statutory holidays. The minimal gain, i.e. the parameter that controls the size of the tree, is set to a value of 0.1. The gain of a node is calculated before splitting a tree. A very high value of this parameter can lead to very few splits but can reduce the performance of the classification. When we split the dataset in the form of a decision tree with a limit of 4 splits and 2 minimal leaves, the following decision tree is formed:
Fig. 10: Decision tree structure for the Labour-Negotiations dataset.
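The minimal-gain criterion described above can be sketched with an entropy-based information gain; a split is kept only when its gain exceeds the threshold (0.1 here, mirroring the criterion rather than RapidMiner's exact implementation):

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def gain(parent, left, right):
    """Information gain of splitting `parent` into `left` and `right`."""
    n = len(parent)
    return (entropy(parent)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

parent = ["good"] * 5 + ["bad"] * 5
left = ["good"] * 4 + ["bad"]
right = ["good"] + ["bad"] * 4

g = gain(parent, left, right)   # ~0.278 for this toy split
print(g > 0.1)                  # True: the split passes a minimal gain of 0.1
```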
Now, feeding the same dataset to the Decision Stump operator results in the following decision tree:
Fig. 11: Decision stump structure for the Labour-Negotiations dataset. The two classes defined are 'good' and 'bad'.
In the first decision tree, the wage-inc-1st attribute is split into various levels of long-term disability assistance, working hours and statutory holidays. The labels are classified according to whether the values are greater than, equal to or smaller than certain limits given in the dataset.
The decision stump clearly generalizes: it gives the prediction and classifies the data points into predefined classes. The classification is done on the basis of a single input feature, which can be selected by the user or suggested by other learning techniques. This classification technique can be used in association with other learning techniques to analyse, predict and classify time series and patterns. It is much faster, more generalized and easier to implement than other types of decision trees.
7. CONCLUSION
We have performed experiments and compared learning algorithms including Neural Net and SVM. The results portrayed that both Neural Net and SVM are capable of modelling both linear and non-linear datasets. The Neural Net is more proficient when it comes to modelling complex datasets, whereas SVM is more successful in classifying and modelling non-linear relationships. The mapping of features into a higher-dimensional space is efficient and useful while predicting and recognizing patterns in SVM. The performance of SVM was found to be better than that of Neural Net while predicting. Naïve Bayes is a simple classifier based on Bayes' theorem. It is simple to implement and proves to be successful and competent against various classification algorithms. The accuracy of Naïve Bayes was slightly lower than that of BFT. Though Naïve Bayes is easy to implement and efficient, it can sometimes deviate from optimum performance when the size of the dataset increases exceptionally. The Decision Stump is a generalizing tree structure that uses a single node and two leaves to represent and classify data points. It proves to be quite efficient if the feature selection by the user is apt for optimum classification. These algorithms, if allowed to perform in the right combination, can be more efficient in learning, feature selection and classification.
In the future we plan to analyse and optimize learning algorithms for complex and fuzzy
datasets. We also plan to provide a framework for learning and decision making from complex
datasets.
REFERENCES
1. Lippmann, Richard. "An introduction to computing with neural nets." ASSP Magazine,
IEEE 4.2 (1987): 4-22.
2. Hearst, Marti A., et al. "Support vector machines." Intelligent Systems and their
Applications, IEEE 13.4 (1998): 18-28.
3. Firth, W. J. "Chaos--predicting the unpredictable." BMJ: British Medical Journal 303.6817
(1991): 1565.
4. Baboo, S. Santhosh, and I. Kadar Shereef. "An efficient weather forecasting system using
artificial neural network." International journal of environmental science and development
1.4 (2010).
5. Radhika, Y., and M. Shashi. "Atmospheric temperature prediction using support vector
machines." International Journal of Computer Theory and Engineering 1.1 (2009).
6. Rish, Irina. "An empirical study of the naive Bayes classifier." IJCAI 2001 workshop on
empirical methods in artificial intelligence. Vol. 3. No. 22. 2001.
7. Shi, Haijian. Best-first decision tree learning. Diss. The University of Waikato, 2007.
8. Witten, Ian H., et al. "Weka: Practical machine learning tools and techniques with Java
implementations." (1999).
9. RapidMiner, RapidMiner. "Open Source Data Mining." (2009).
10. White, Halbert. "Learning in artificial neural networks: A statistical
perspective."Neural computation 1.4 (1989): 425-464.
11. Hall, Mark, et al. "The WEKA data mining software: an update." ACM SIGKDD
Explorations Newsletter 11.1 (2009): 10-18.
12. Bayes, Naïve. "Naïve Bayes Classifier."
13. Zhao, Yongheng, and Yanxia Zhang. "Comparison of decision tree methods for finding
active objects." Advances in Space Research 41.12 (2008): 1955-1959.
14. Land, Sebastian, and Simon Fischer. "RapidMiner 5."
15. http://data.gov.in/dataset/annual-and-seasonal-maximum-temperature-india.
16. Cortes, Corinna, and Vladimir Vapnik. "Support vector machine." Machine learning 20.3
(1995): 273-297.
17. "Confusion Matrix." 16 Oct. 2013.
<http://www2.cs.uregina.ca/~dbd/cs831/notes/confusion_matrix/confusion_matrix.html>.
18. Dr. Rajeshwari S. Mathad, “Supervised Learning in Artificial Neural Networks”,
International Journal of Advanced Research in Engineering & Technology (IJARET),
Volume 5, Issue 3, 2014, pp. 208 - 215, ISSN Print: 0976-6480, ISSN Online: 0976-6499.
19. Sandip S. Patil and Asha P. Chaudhari, “Classification of Emotions from Text using SVM
Based Opinion Mining”, International Journal of Computer Engineering & Technology
(IJCET), Volume 3, Issue 1, 2012, pp. 330 - 338, ISSN Print: 0976 – 6367, ISSN Online:
0976 – 6375.
20. V.Anandhi and Dr.R.Manicka Chezian, “Comparison of the Forecasting Techniques –
ARIMA, ANN and SVM - A Review”, International Journal of Computer Engineering &
Technology (IJCET), Volume 4, Issue 3, 2013, pp. 370 - 376, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.
21. Tarun Dhar Diwan and Upasana Sinha, “The Machine Learning Method Regarding Efficient
Soft Computing and ICT using SVM”, International Journal of Computer Engineering &
Technology (IJCET), Volume 4, Issue 1, 2013, pp. 124 - 130, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTIAEME Publication
 

Mehr von IAEME Publication (20)

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
 

Kürzlich hochgeladen

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Kürzlich hochgeladen (20)

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

50120140504015

Generalization, on the other hand, defines how accurately a machine performs a function on unseen data after learning from a training data set. For a machine to learn from a given data set, it must first be trained on the task. A learning model is built from one or more algorithms: the model is given a training dataset to learn from and an original dataset on which to perform the actual function. Depending on the probability distribution and frequency of the data, the data can be classified and predicted.
Databases, meanwhile, have become more complex with the rise of XML trees, spatial data and GPS temporal data. Adaptive techniques should be used to process and learn from these types of data sets effectively. This raises the question of the precision and accuracy of prediction and classification: the performance of learning algorithms varies even on the same input dataset under similar configurations. In this paper we use a single dataset as input to compare the performance of learning algorithms within the same category. The results contrast to a certain extent, which gives insight into the effectiveness of learning techniques under a preferable configuration.

Learning techniques often use the concept of a data window, which limits the amount of data to be processed based on different characteristics. The sensitive part is knowing how much data to process in order to produce optimal results. Data windows come in four types: fixed sliding window, adaptive window, landmark window and damped window. The preferred type is the adaptive window, which resizes dynamically as input data arrives.

To process and represent complex input-output relations, the neural network proves to be a powerful model. Its major strength is its ability to represent both linear and non-linear relationships in the modelled data. The neural net model is first trained on a set of data, after which it is capable of performing classification, prediction and analysis. The Artificial Neural Network (ANN) [1] is a field of study that gains knowledge by mapping and learning data in a manner similar to the human brain. The network is interconnected through nodes known as neurons.
The performance of the ANN, however, depends on the type of network and the number of neurons in the network: the fewer the neurons, the higher the performance of the system. The Support Vector Machine (SVM) [2] has also gained major importance in learning theory because of its ability to represent non-linear relationships efficiently. It is grounded in statistical learning theory. An SVM maps the data Y into a feature space M of high dimensionality. The technique can be applied to both classification and regression: for classification, an efficient hyperplane is found that separates the data into two classes, whereas for regression a hyperplane is constructed that lies close to the maximum number of data points.

We have taken the problem of weather forecasting to compare Neural Net and SVM. The challenge in weather forecasting [3] is the unpredictability and dynamic change of atmospheric temperature; there is no particular pattern that can be modelled to predict temperature from historical data. However, multiple experiments have shown that Neural Net and SVM [4], [5] are capable of modelling and capturing the complex non-linear relationships that contribute to a particular temperature.

The Naïve Bayes classifier [6] is based on Bayes' theorem. It simplifies the learning technique by assuming that the features within a class are independent. Although this is generally not an optimal assumption for classification, Naïve Bayes proves to be a competent algorithm compared with more sophisticated classification techniques, and it has been successful in applications such as text classification, system performance management and medical diagnosis. The Best-First decision Tree (BFT) [7] works with the concept of expanding nodes in best-first order rather than a fixed order. The Decision Stump [8] is a one-level decision tree with only a single split, meaning there is a single internal node connected directly to the leaf nodes.
A prediction is therefore made on the basis of a single feature of the data set. In this paper we have compared and analysed classification and learning algorithms in Rapidminer 5. Rapidminer [9] is a powerful open-source tool for data mining, analysis and simulation. The tool provides the user with a rapid application development environment and appropriate data visualizations.
2. ANN FRAMEWORK

A feed-forward neural network trained with the back-propagation algorithm [10] is generally used for prediction and classification. Back-propagation is used for simple pattern recognition and mapping. Consider a hidden-layer neuron X connected to output neurons Y and Z, with connection weights WXY and WXZ. The algorithm functions as follows:

1. With the initial weights set to random values, provide the system with appropriate inputs and compute the output.
2. Calculate the error for output neuron Y:
   Error Y = Output Y (1 - Output Y) (Target Y - Output Y)
3. Adjust the weight. Let W+XY be the trained weight and WXY the initial weight:
   W+XY = WXY + (Error Y * Output X)
4. Calculate the errors for the hidden-layer neurons by back-propagating the output-layer errors: take the errors from the output neurons and run them back through the weights to obtain the hidden-layer errors:
   Error X = Output X (1 - Output X) (Error Y WXY + Error Z WXZ)
5. Having calculated the errors of the hidden-layer neurons, go to step 3 and repeat the process in order to train the network.

3. SUPPORT VECTOR MACHINE (SVM)

A standard SVM [11] takes input data and predicts which of two classes the input belongs to. Support vectors are the essential training tuples, and the margin is defined by these support vectors; this makes the SVM a non-probabilistic binary linear classifier. It uses non-linear mapping to transform the training data points into a higher-dimensional feature space, in which it constructs hyperplanes that can be used for classification. Classification is optimal when the distance between the hyperplane and the closest training data points is largest.
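The back-propagation steps of Section 2 can be sketched as follows. This is a minimal illustration for a single hidden neuron feeding a single output neuron, assuming logistic activations (which is what the Output(1 - Output) error terms imply); all names are illustrative, not from the paper.

```python
import math

def sigmoid(z):
    # Logistic activation, whose derivative is Output * (1 - Output).
    return 1.0 / (1.0 + math.exp(-z))

def train_step(inputs, target, w_hidden, w_out, lr=0.8):
    # Step 1: forward pass with the current weights.
    out_x = sigmoid(sum(w * i for w, i in zip(w_hidden, inputs)))
    out_y = sigmoid(w_out * out_x)

    # Step 2: output-layer error  Error_Y = Out_Y (1 - Out_Y)(Target - Out_Y)
    err_y = out_y * (1 - out_y) * (target - out_y)

    # Step 4: hidden-layer error, back-propagated through the weight W_XY.
    err_x = out_x * (1 - out_x) * (err_y * w_out)

    # Step 3: weight updates  W+ = W + lr * Error * Output(on the input side)
    w_out = w_out + lr * err_y * out_x
    w_hidden = [w + lr * err_x * i for w, i in zip(w_hidden, inputs)]
    return w_hidden, w_out, out_y
```

Repeating `train_step` over the training examples (step 5) drives the output error down; the learning rate of 0.8 matches the setting used in the experiment section.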
The reason is the low generalization error of the classifier. Once the support vector machine has been trained, data are classified on the basis of the Lagrangian formulation. The maximum-margin hyperplane boundary is given by

d(A^T) = SUM(i=1 to l) yi * ai * (Ai . A^T) + c0

where yi is the class label of support vector Ai, A^T is the test tuple, ai (the Lagrangian multiplier) and c0 are numeric parameters determined by the SVM algorithm, and l is the number of support vectors.
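The decision function above is a plain weighted sum of dot products, so it can be evaluated directly once the support vectors, their labels and multipliers are known. A minimal sketch (linear kernel only; the parameter names mirror the formula and are illustrative):

```python
def svm_decision(test_point, support_vectors, labels, alphas, c0):
    """d(A^T) = sum_i y_i * alpha_i * (A_i . A^T) + c0 -- the signed
    distance of the test tuple from the maximum-margin hyperplane."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return sum(y * a * dot(sv, test_point)
               for sv, y, a in zip(support_vectors, labels, alphas)) + c0

def svm_classify(test_point, support_vectors, labels, alphas, c0):
    # The sign of d(A^T) picks the class on either side of the hyperplane.
    d = svm_decision(test_point, support_vectors, labels, alphas, c0)
    return 1 if d >= 0 else -1
```

With a non-linear kernel, the dot product `A_i . A^T` would simply be replaced by a kernel evaluation `K(A_i, A^T)`.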
4. NAÏVE BAYES CLASSIFIER

Naïve Bayes [12] is a simple probabilistic classifier based on Bayes' theorem. It assumes independent features of the data set: each feature in a class is independent of the presence or absence of any other feature. The main advantage of this classifier is that it requires only a small amount of training data to estimate the mean and variance of the variables needed for classification. Because the features of a class are assumed independent, only the per-label variances need be determined rather than the entire covariance matrix. Let A be a data tuple and S the hypothesis that tuple A belongs to a class C. In order to classify A, P(S|A) has to be determined:

P(S|A) = P(A|S) P(S) / P(A)

5. DECISION STUMP AND BEST-FIRST TREE

A decision tree is a flowchart-like branched structure. Each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label. Given a tuple T with an unknown class label, the attribute values are tested against the decision tree; the path traced from the root node to a leaf node holds the prediction for the tuple. The popularity of decision trees rests on the fact that tree classifiers can be constructed without any domain knowledge or significant parameter setting.

A decision stump is a one-level decision tree with a single split: there is only one internal node and two leaf nodes, so prediction is based on a single input feature. Although a weak learning technique, it proves to be a fast classification technique when used with boosting [13]. The BFT works similarly to a standard decision tree in its depth-first expansion, but the technique also enables new tree-pruning methods using cross-validation.
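The Bayes rule and independence assumption of Section 4 can be sketched as a Gaussian Naïve Bayes posterior computation. This assumes continuous features with class-conditional normal distributions (only per-feature mean and variance are stored, as the section notes); all names are illustrative.

```python
import math

def gaussian_pdf(x, mean, var):
    # Per-feature likelihood under the class-conditional normal assumption.
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def naive_bayes_posteriors(x, classes):
    """classes maps label -> (prior, [(mean, var) per feature]).
    Features are treated as independent, so P(A|S) is a product of
    one-dimensional Gaussians; P(A) is just the normalising constant."""
    joint = {}
    for label, (prior, params) in classes.items():
        likelihood = 1.0
        for xi, (mean, var) in zip(x, params):
            likelihood *= gaussian_pdf(xi, mean, var)
        joint[label] = prior * likelihood          # P(A|S) * P(S)
    evidence = sum(joint.values())                 # P(A)
    return {label: j / evidence for label, j in joint.items()}
```

The predicted class is simply the label with the largest posterior P(S|A).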
A best-first ordering is used for tree construction rather than a predefined fixed order: the best node is the one that maximally reduces impurity among the nodes available for splitting (i.e. the non-labelled leaf nodes).

6. EXPERIMENT AND RESULT

We performed the analysis, simulation and forecast-performance evaluation of the learning algorithms in Rapidminer 5.0 [14], an efficient open-source software package for modelling and validating classification, learning and rule-based algorithms. Its rapid application development environment and graphical user interface with flow design make modelling and simulation in Rapidminer easier; the internal processes and functions are also described in XML.

We first compare the performance of Neural Net and SVM. The dataset considered consists of the average annual and monthly temperatures of India from the year 1901 to the year 2012, downloaded from the Delhi Meteorological Department website [15]. There are six attributes in the dataset: YEAR, ANNUAL, JAN-FEB, MAR-MAY, JUN-SEP and OCT-DEC. The ANNUAL attribute holds the average annual temperature; JAN-FEB, MAR-MAY, JUN-SEP and OCT-DEC hold the average temperatures for those month ranges.

6.1 Neural Net Learning Model
The following is the neural net model designed in Rapidminer.
Fig. 1: Model design for Neural Net training in RapidMiner

The input to the model is given in xls format, and a training dataset is supplied so that the network can learn from the given data. The Set Role operator changes the role of one or more attributes; here the YEAR attribute is assigned as an ID and ANNUAL as the label. The Windowing operator transforms a set of examples into a new single-valued example set. Its parameters are the series representation, the window size and the step size. The series representation is left at the default value (encode series by example). The window size gives the width of the window, i.e. the number of examples considered at a time, and is set to 2. The step size is the gap between the first values of consecutive windows and is set to 1 (the default).

Next, the Sliding Window Validation operator is used to encapsulate the training and test window examples in order to estimate the performance of a prediction operator. Its parameters are the training window width, the training window step size, the test window width and the horizon. The training window width and test window width are set to 7, and the training window step size is set to 1. The horizon gives the distance between the training window and the predicted example; a horizon of 1 means prediction of the next example, i.e. the prediction of the next average annual temperature.

The validation operator consists of two phases, training and testing. The training phase contains the Neural Net operator, which learns from the data examples; the testing phase applies the model and reports the average performance of the system. The learning rate of the neural net was tuned to 0.8 and the system was trained for 500 cycles. The following analysis graph compares the predicted and the observed annual temperature.

Fig. 2: Comparison graph of the temperature predicted by the Neural Net operator and the original annual temperature
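The windowing step described above can be illustrated with a short sketch. This is not RapidMiner's implementation; it is a minimal Python illustration of the same idea, with a made-up temperature series and our own function and variable names, assuming window size 2, step size 1 and horizon 1 as set in the model.

```python
# Sketch of RapidMiner-style windowing: window size 2, step size 1, horizon 1.
# Each example becomes (previous `window` values -> value `horizon` steps ahead).

def windowing(series, window=2, step=1, horizon=1):
    examples = []
    for start in range(0, len(series) - window - horizon + 1, step):
        features = series[start:start + window]
        label = series[start + window + horizon - 1]
        examples.append((features, label))
    return examples

# Toy annual-temperature series (illustrative values, not the IMD data).
temps = [24.2, 24.5, 24.1, 24.8, 25.0, 24.7, 25.1]
for features, label in windowing(temps):
    print(features, "->", label)
```

Each windowed example pairs two consecutive annual values with the value that follows them, which is exactly what the learner is trained to predict.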
Some changes in the forecasting performance of the model were observed when the learning rate of the Neural Net operator was tuned.

Learning rate    Forecasting performance (%)
0.3 (default)    67.9
0.4              67.7
0.5              67.5
0.6              68.4
0.7              68.2
0.8              68.7

Table 1: Learning rate and the corresponding forecasting performance of the Neural Net model

The observed forecasting performance followed a roughly linear trend as the learning rate changed: it decreased minutely from 0.3 to 0.5 but increased thereafter. The graph below shows the performance variation with respect to the learning rate.

Fig. 3: Learning rate graph for the Neural Net model

6.2 SVM Learning Model

SVM, initially introduced by Vapnik, is partitioned into two categories: Support Vector Classification (SVC) and Support Vector Regression (SVR). To obtain prediction and pattern-recognition functions from the support vectors we use Support Vector Regression. The main aim is to minimize the generalization error bound in order to achieve good generalized performance; the learning mechanism and its origins are described in detail in [16].

The overall model of the SVM learner is the same as the one depicted in Fig. 1, but the validation phase holds the SVM (regression) operator instead of the Neural Net operator. The setup of the general model is the same as discussed earlier in Section 6. The SVM operator is, however, tuned on certain parameters. The kernel function of the operator is set to the dot type. The kernel function maps the data into a higher-dimensional space; the dot kernel is defined by the inner product of the support vectors, i.e. k(x, y) = x * y.
The SVM complexity constant (C) sets the tolerance level for misclassification: higher values lead to softer boundaries and lower values to harder boundaries. The parameter has to be set carefully, since values above the optimum level can result in over-generalization of the data points. The C constant was set to 0.1 in place of the default value of 0.0. We observed that increasing the C constant slightly reduced the forecasting performance due to over-generalization; above 0.2 the forecasting performance declined slightly but followed a linear trend.

C constant    Forecasting performance (%)
0.1           71.5
0.2           69.1
0.3           69.1
0.4           68.9
0.5           68.7
0.6           68.4

Table 2: C constant and the corresponding forecasting performance of the SVM model

Fig. 4: Forecasting performance graph with respect to the C constant in the SVM model

On execution of the model with the C constant tuned to 0.1, the forecasting performance was observed to be 71.5%, which is higher than the best performance of the Neural Net operator. The following trend was observed for the SVR prediction function.

Fig. 5: Comparison graph between the temperature predicted by SVM and the actual annual temperature
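The effect of the C constant can be reproduced in a small sketch. Here scikit-learn's SVR with a linear kernel stands in for RapidMiner's SVM (regression) operator — the linear kernel corresponds to the dot kernel k(x, y) = x * y — and the data is synthetic rather than the temperature series, so this illustrates the parameter's role, not the experiment itself.

```python
# Sketch: linear-kernel SVR with varying complexity constant C.
# scikit-learn's SVR is a stand-in for the RapidMiner SVM (regression) operator;
# the data below is synthetic, not the IMD temperature series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(23.0, 26.0, size=(100, 2))        # two lagged "temperatures"
y = X.mean(axis=1) + rng.normal(0.0, 0.1, 100)    # next value ~ mean of the lags

X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

for C in (0.1, 0.2, 0.5):
    model = SVR(kernel="linear", C=C).fit(X_train, y_train)
    print(f"C={C}: R^2={model.score(X_test, y_test):.3f}")
```

Sweeping C this way mirrors Table 2: larger C tolerates less slack and can over-fit the training window, while very small C over-regularizes.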
However, the prediction may not be very accurate in determining the average temperature, owing to the lack of a proper pattern and the unpredictability of temperature change. Nevertheless, the mapping of non-linear data into a higher-dimensional space and the ability to determine patterns is more proficient in SVM than in Neural Net.

6.3 Naïve Bayes Model

For the comparison of the classification models we selected the default Golf dataset available in RapidMiner, a sample dataset suitable for Boolean classification. It consists of five attributes: Play, Outlook, Temperature, Humidity and Wind. Play and Wind take Boolean values (true and false); Outlook takes three values (overcast, rain and sunny); Temperature and Humidity are real-valued. According to the dataset, if the day is windy there will be no game that day (Play = no if Wind = true).

The attributes to be classified are first selected from the dataset; here Wind and Outlook are selected in order to identify and classify the trend. The Naïve Bayes algorithm computes the probability for every label but selects only the maximum-valued label. For instance, the calculation for label = yes would be: the prior probability of label = yes (i.e. 9/14), multiplied by the value from the distribution table for Outlook = rain and label = yes (i.e. 0.331), multiplied by the value from the distribution table for Wind = true and label = yes (i.e. 0.333). Thus the result is 9/14 * 0.331 * 0.333 = 0.071. The Naïve Bayes model design is depicted below.

Fig. 6: Naïve Bayes process model in RapidMiner

The Select Attributes operator selects just the Outlook and Wind attributes. The Naïve Bayes operator processes them, and the resulting model is applied to the 'Golf-testset' data set; a breakpoint is inserted after the Naïve Bayes operator. In the given dataset 9 of the 14 examples in the training set have label = yes, so the prior probability of label = yes is 9/14; similarly, the prior probability of label = no is 5/14. In the testing set, the attributes of the first example are Outlook = sunny and Wind = false. The following is the performance vector for the Naïve Bayes operator; here the confusion matrix [17] depicts the actual and predicted classifications made by the classification operator.
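The label score computed by hand above can be reproduced in a few lines. The distribution-table values (0.331 and 0.333) are the ones quoted from RapidMiner; the helper function itself is our own illustration, not a RapidMiner API.

```python
# Naive Bayes score for a label: prior * product of per-feature likelihoods.
# The likelihood values below are the ones RapidMiner reports for the Golf
# training set (quoted in the text); the helper itself is illustrative.

def nb_score(prior, likelihoods):
    score = prior
    for p in likelihoods:
        score *= p
    return score

prior_yes = 9 / 14               # 9 of 14 training examples have label = yes
p_outlook_rain_yes = 0.331       # P(Outlook = rain | yes) from the table
p_wind_true_yes = 0.333          # P(Wind = true | yes) from the table

score = nb_score(prior_yes, [p_outlook_rain_yes, p_wind_true_yes])
print(round(score, 3))           # 0.071, matching the hand calculation
```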
Fig. 8: Performance vector statistics for the Naïve Bayes process model

The calculation of the accuracy, recall, confusion matrix and AUC (optimistic and pessimistic) can be referred to in [17]. Overall, Naïve Bayes is an easy algorithm to implement and performs surprisingly well under its assumption that the class features are independent. The performance of the algorithm is particularly appreciable when the training data is small.

6.4 Best-First Decision Tree Model (BFT)

The design of the BFT model is similar to that of the Naïve Bayes model; the input dataset is the Golf dataset used earlier.

Fig. 7: BFT process model in RapidMiner

The BFT operator is a class for building a best-first decision tree classifier. It uses binary splits for both nominal and numeric attributes; for missing values, the fractional-instance method is used.
Fig. 9: Performance vector statistics for the best-first decision tree process model

6.5 Decision Stump Model

Our next experiment examines the decision stump. A decision stump is basically a one-level decision tree: it consists of a single node and two leaves, and the prediction is made on the basis of just a single input feature.

In this experiment we take a sample dataset from RapidMiner called Labour-Negotiations. It consists of 17 attributes with some missing values. The decision tree splits the table into two classes, good and bad, according to the wage, long-term disability assistance, working hours and statutory holidays. The minimal gain, i.e. the parameter that controls the size of the tree, is set to 0.1; the gain of a node is calculated before splitting the tree. A very high value for this parameter leads to very few splits and can reduce the performance of the classification. When we split the dataset in the form of a decision tree with a limit of 4 splits and 2 minimal leaves, the following decision tree is formed.

Fig. 10: Decision tree structure for the Labour-Negotiations dataset
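A stump of the kind the Decision Stump operator produces can be sketched directly: one feature, one threshold, two leaves. The feature values and labels below are made up for illustration; they are not taken from the Labour-Negotiations data.

```python
# Minimal decision stump: split one numeric feature at the threshold that
# best separates the two classes, then predict the majority class per side.

def fit_stump(values, labels):
    best = None
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue
        # majority class on each side of the split
        left_maj = max(set(left), key=left.count)
        right_maj = max(set(right), key=right.count)
        errors = sum(l != left_maj for l in left) + sum(l != right_maj for l in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_maj, right_maj)
    _, t, left_maj, right_maj = best
    return lambda v: left_maj if v <= t else right_maj

# Toy wage-increase values labelled good/bad (illustrative, not the real dataset).
wage_inc = [2.0, 2.5, 3.0, 4.0, 4.5, 5.0]
labels = ["bad", "bad", "bad", "good", "good", "good"]
stump = fit_stump(wage_inc, labels)
print(stump(2.2), stump(4.8))    # bad good
```

The stump here picks a single threshold on one attribute, which is why it is fast to fit but weak on its own and is usually combined with boosting.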
Now, feeding the same dataset to the Decision Stump operator results in the following decision tree.

Fig. 11: Decision stump structure for the Labour-Negotiations dataset. The two classes defined are 'good' and 'bad'.

In the first decision tree the wage-inc-1st attribute is classified into various levels of long-term disability assistance, working hours and statutory holidays. The classification of labels is performed according to values that are greater than, equal to or smaller than certain limits given in the dataset. The decision stump generalizes: it gives the prediction and classifies the data points into the predefined classes on the basis of one input feature, which can be selected by the user or suggested by other learning techniques. This classification technique can be used in association with other learning techniques to analyse, predict and classify time series and patterns. It is much faster, more general and easier to implement than other types of decision trees.

7. CONCLUSION

We have performed experiments and compared learning algorithms including Neural Net and SVM. The results show that both Neural Net and SVM are capable of modelling linear as well as non-linear datasets. Neural Net is more proficient at modelling complex datasets, whereas SVM is more successful at classifying and modelling non-linear relationships. The mapping of features into a higher-dimensional space makes SVM efficient at predicting and recognizing patterns, and the forecasting performance of SVM was found to be better than that of Neural Net.

Naïve Bayes is a simple classifier based on Bayes' theorem. It is simple to implement and proves to be competitive against various classification algorithms. The accuracy of Naïve Bayes was slightly lower than that of BFT. Although Naïve Bayes is easy to implement and efficient, it can deviate from optimum performance when the size of the dataset increases exceptionally. Decision Stump is a generalizing tree structure that uses a single node and two leaves to represent and classify data points; it proves to be quite efficient if the feature selected by the user is apt for the classification task. These algorithms, used in the right combination, can be more efficient in learning, feature selection and classification. In the future we plan to analyse and optimize learning algorithms for complex and fuzzy datasets, and to provide a framework for learning and decision making from such datasets.
REFERENCES

1. Lippmann, Richard. "An introduction to computing with neural nets." ASSP Magazine, IEEE 4.2 (1987): 4-22.
2. Hearst, Marti A., et al. "Support vector machines." Intelligent Systems and their Applications, IEEE 13.4 (1998): 18-28.
3. Firth, W. J. "Chaos - predicting the unpredictable." BMJ: British Medical Journal 303.6817 (1991): 1565.
4. Baboo, S. Santhosh, and I. Kadar Shereef. "An efficient weather forecasting system using artificial neural network." International Journal of Environmental Science and Development 1.4 (2010).
5. Radhika, Y., and M. Shashi. "Atmospheric temperature prediction using support vector machines." International Journal of Computer Theory and Engineering 1.1 (2009).
6. Rish, Irina. "An empirical study of the naive Bayes classifier." IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence. Vol. 3. No. 22. 2001.
7. Shi, Haijian. Best-first decision tree learning. Diss. The University of Waikato, 2007.
8. Witten, Ian H., et al. "Weka: Practical machine learning tools and techniques with Java implementations." (1999).
9. RapidMiner, RapidMiner. "Open Source Data Mining." (2009).
10. White, Halbert. "Learning in artificial neural networks: A statistical perspective." Neural Computation 1.4 (1989): 425-464.
11. Hall, Mark, et al. "The WEKA data mining software: an update." ACM SIGKDD Explorations Newsletter 11.1 (2009): 10-18.
12. Bayes, Naïve. "Naïve Bayes Classifier."
13. Zhao, Yongheng, and Yanxia Zhang. "Comparison of decision tree methods for finding active objects." Advances in Space Research 41.12 (2008): 1955-1959.
14. Land, Sebastian, and Simon Fischer. "RapidMiner 5."
15. http://data.gov.in/dataset/annual-and-seasonal-maximum-temperature-india.
16. Cortes, Corinna, and Vladimir Vapnik. "Support-vector networks." Machine Learning 20.3 (1995): 273-297.
17. "Confusion Matrix." 16 Oct. 2013. <http://www2.cs.uregina.ca/~dbd/cs831/notes/confusion_matrix/confusion_matrix.html>.
18. Dr. Rajeshwari S. Mathad, "Supervised Learning in Artificial Neural Networks", International Journal of Advanced Research in Engineering & Technology (IJARET), Volume 5, Issue 3, 2014, pp. 208-215, ISSN Print: 0976-6480, ISSN Online: 0976-6499.
19. Sandip S. Patil and Asha P. Chaudhari, "Classification of Emotions from Text using SVM Based Opinion Mining", International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 330-338, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
20. V. Anandhi and Dr. R. Manicka Chezian, "Comparison of the Forecasting Techniques - ARIMA, ANN and SVM - A Review", International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 3, 2013, pp. 370-376, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
21. Tarun Dhar Diwan and Upasana Sinha, "The Machine Learning Method Regarding Efficient Soft Computing and ICT using SVM", International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 1, 2013, pp. 124-130, ISSN Print: 0976-6367, ISSN Online: 0976-6375.