Shraddha Deshmukh Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 5, Issue 12, (Part - 3) December 2015, pp.86-91
www.ijera.com 86 | P a g e
Hypothesis on Different Data Mining Algorithms
Shraddha Deshmukh*, Swati Shinde**
*(Department of Information Technology, University of Pune, Pune - 05)
**(Department of Information Technology, University of Pune, Pune - 05)
ABSTRACT
In this paper, different classification algorithms for data mining are discussed. Data mining is about
explaining the past & predicting the future by means of data analysis. Classification is a data mining task
that categorizes data based on numerical or categorical variables. Many algorithms have been proposed for
classifying data; five of them are studied comparatively here. There are four different classification
approaches, namely Frequency Table, Covariance Matrix, Similarity Functions & Others. The algorithms studied,
Naive Bayesian, K Nearest Neighbors, Decision Tree, Artificial Neural Network & Support Vector Machine, are
examined using benchmark datasets such as Iris & Lung Cancer.
Keywords - Artificial Neural Network, Classification, Data Mining, Decision Tree, K-Nearest Neighbors, Naive
Bayesian & Support Vector Machine.
I. INTRODUCTION
Nowadays large amounts of data are being
gathered and stored in databases across the globe,
and the volume is increasing continuously. Many
organizations & research centers hold data
measured in terabytes, in some cases over 1000
terabytes. So, we need to mine those databases to
make better use of them. Data mining is about
explaining the past & predicting the future. It is
an interdisciplinary field which combines
statistics, machine learning, artificial
intelligence & database technology. The importance
of data mining applications is predicted to be
huge. Many organizations have collected tremendous
amounts of data over years of operation, & data
mining is the process of extracting knowledge from
that data. The organizations can then use the
extracted knowledge to gain more clients, more
sales & greater profits. This is also true in the
engineering & medical fields.
1.1 DATA MINING
Data mining is the process of organising available
data into a useful format. Fig. 1 shows the basic
concept of data mining. Basic terms in data mining are:
• Statistics: The science of collecting, classifying,
summarizing, organizing, analysing &
interpreting data.
• Artificial Intelligence: The study of computer
algorithms which simulate intelligent
behaviour in order to carry out specific activities.
• Machine Learning: The study of computer
algorithms that learn from experience and use
what they learn to automate tasks.
• Database: The science & technology of
collecting, storing & managing data so users can
retrieve, insert, modify or delete such data.
• Data warehousing: The science & technology of
collecting, storing & managing data with
advanced multi-dimensional reporting services in
support of decision-making processes.
• Predicting the Future: Data mining predicts the
future by means of modelling.
• Modelling: The process in which a
classification model is created to predict an
outcome.
Fig. 1. Concept of data mining
II. CLASSIFICATION
Classification is a data mining task of predicting
the value of a categorical variable (target or class) by
building a model based on one or more numerical
and/or categorical variables (predictors or attributes).
It classifies data based on the training set & class
labels. Examples:
• Classifying patients by their symptoms,
• Classifying goods by their properties, etc.
There are some common terms used in
classification process. Table 1 illustrates basic terms
used in classification process like pattern (records,
rows), attributes (dimensions, columns), class
(output column) and class label (tag of class):
RESEARCH ARTICLE OPEN ACCESS
TABLE -1: Terms Used in Classification
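As a minimal sketch of the terms in Table 1, the following Python fragment lays out a small made-up dataset. The attribute names, the values and the `diagnosis` column are hypothetical and chosen only for illustration; they do not come from the datasets used in this paper.

```python
# Hypothetical toy data, invented purely to illustrate the terminology.
attributes = ["fever", "cough"]          # attributes (dimensions, columns)

patterns = [                             # patterns (records, rows)
    {"fever": "yes", "cough": "yes", "diagnosis": "flu"},
    {"fever": "no", "cough": "yes", "diagnosis": "cold"},
    {"fever": "no", "cough": "no", "diagnosis": "healthy"},
]

class_column = "diagnosis"               # class (output column)

# Class labels are the distinct tags that appear in the class column.
class_labels = sorted({row[class_column] for row in patterns})
print(class_labels)
```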
Classification is a data mining method for
predicting the value of data instances from
previous experience. When the goal is to predict
either a positive or a negative response, a
binary classification model is built. Classification is
important because it helps scientists to clearly
diagnose problems, study & observe them &
organize concentrated conservation efforts. It also
serves as a way of remembering & differentiating
the types of symptoms, making predictions about
diseases of the same type, classifying the
relationship between different defects & providing
precise names for diseases.
2.1 Applications of Classification
Classification has several applications, such as
medical diagnosis, breast cancer diagnosis,
market targeting, image processing, wine
classification, soil classification for selection of
fertilizer, etc.
III. CLASSIFICATION ALGORITHMS
There is quite a lot of research on algorithms that
classify data. Several approaches have been
developed for classification in data mining. Fig. 2
shows the hierarchy of classification algorithms:
Fig. 2. Hierarchy of classification algorithms
IV. NAÏVE BAYESIAN
4.1 Introduction to Naïve Bayesian
The Naive Bayesian (NB) method is a simple
probabilistic classifier based on Bayes' Theorem
(from Bayesian statistics) with strong (naive)
independence assumptions: it assumes that all the
features are independent of one another. An NB model
is easy to build, with no complicated iterative
parameter estimation, which makes it particularly
useful for very large datasets [1].
In NB classifiers, every feature helps to
determine which topic should be assigned to a
given input value. To choose a topic for an input
value, the naive Bayes classifier begins by
evaluating the prior probability of each topic, which
is determined by checking the frequency of each
topic in the training set. The contribution of each
feature is then combined with this prior probability
to arrive at a probability estimate for each topic. The
topic with the highest estimated probability is then
assigned to the test input [5].
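The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: features are categorical, the function names and the toy data are hypothetical, and add-one (Laplace) smoothing is added so that an unseen feature value does not force a probability of zero.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, topics):
    """Count topic frequencies and per-topic feature-value frequencies."""
    topic_counts = Counter(topics)
    # value_counts[topic][feature_index][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature
    for row, topic in zip(rows, topics):
        for i, value in enumerate(row):
            value_counts[topic][i][value] += 1
            vocab[i].add(value)
    return topic_counts, value_counts, vocab


def predict_naive_bayes(model, row):
    """Return the topic with the highest estimated score, plus all scores."""
    topic_counts, value_counts, vocab = model
    total = sum(topic_counts.values())
    scores = {}
    for topic, count in topic_counts.items():
        score = count / total  # prior: frequency of the topic in training
        for i, value in enumerate(row):
            # Assumption: add-one smoothing, not mentioned in the text.
            seen = value_counts[topic][i][value]
            score *= (seen + 1) / (count + len(vocab[i]))
        scores[topic] = score
    return max(scores, key=scores.get), scores


# Hypothetical toy training set: (outlook, temperature) -> topic
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
topics = ["no", "no", "yes", "yes"]

model = train_naive_bayes(rows, topics)
best_topic, scores = predict_naive_bayes(model, ("rainy", "cool"))
print(best_topic, scores)
```

The scores are unnormalized; only their ranking matters for choosing the topic.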
A supervised classifier is built on training
corpora containing the correct topic for each input.
The framework used by Bayesian classification is
shown in Fig.3
Fig. 3. Bayesian classification
(a) During the training, a feature extractor is used to
convert each input value to a feature set. These
feature sets, which capture the basic information
about each input that should be used to classify it,
are discussed in the next section. Pairs of feature sets
& topics are fed into the machine learning algorithm
to generate a model. (b) During the prediction, the
same feature extractor is used to convert unseen
inputs to feature sets. These feature sets are then fed
into the model, which generates predicted topics [5].
4.2 Advantages to Naive Bayesian
1. Fast to train & classify
2. Not sensitive to irrelevant features
3. Handles real & discrete data
4. Handles streaming data well
4.3 Disadvantages to Naive Bayesian
1. Assumes independence of features
2. Dependencies that exist among variables (e.g.
Hospital: Patients, Diseases: Diabetes, Cancer)
are not modelled by NB.
V. K NEAREST NEIGHBOR
5.1 Introduction to K-Nearest Neighbor
K nearest neighbor (KNN) is a simple method
that stores all available cases & classifies new cases
based on a similarity measure (e.g. Euclidean distance). KNN
has been used in statistical estimation & pattern
recognition as a non-parametric technique since the
beginning of the 1970s. The nearest neighbor rule
determines the class of an unknown data
point on the basis of its nearest neighbor, whose class
is already known [4].
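A minimal sketch of this rule, assuming numeric attributes, Euclidean distance and a simple majority vote among the k closest stored cases, could look as follows. The function names and the toy points are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify(points, labels, (1.1, 1.0), k=3))
print(knn_classify(points, labels, (5.0, 5.1), k=1))
```

With k=1 the function reduces to the nearest neighbor rule described above.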
The training points are assigned weights
according to their distances from the sample data point.
At the same time, computational complexity
& memory requirements remain the main concerns.
To overcome the memory limitation, the size of the
dataset is reduced: repeated patterns,
which add no information, are
excluded from the training data set. To reduce it
further, data points which do not
influence the result are also eliminated from the
training data set [4]. The NN training dataset can be
structured using different techniques to overcome the
memory limitation of KNN. KNN can be
implemented using a ball tree, k-d tree,
nearest feature line (NFL), principal axis search tree
or orthogonal search tree.
Tree-structured training data is divided
into nodes, whereas techniques like NFL & tunable metric
divide the training data set according to planes. These
algorithms can increase the speed of the basic KNN
algorithm. Consider that an object is
sampled with a set of different attributes.
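The distance weighting and the removal of repeated patterns described above can be sketched as follows. This is only an illustration under assumptions: inverse-distance weights are one common choice and are not specified in the text, and the function names and data are hypothetical.

```python
import math
from collections import defaultdict


def drop_repeated_patterns(points, labels):
    """Keep only the first occurrence of each (point, label) pair."""
    seen = set()
    kept_points, kept_labels = [], []
    for point, label in zip(points, labels):
        key = (tuple(point), label)
        if key not in seen:
            seen.add(key)
            kept_points.append(point)
            kept_labels.append(label)
    return kept_points, kept_labels


def weighted_knn(points, labels, query, k=3, eps=1e-9):
    """Vote among the k nearest points, weighting each by 1 / distance."""
    ranked = sorted(
        (math.dist(point, query), label) for point, label in zip(points, labels)
    )
    weights = defaultdict(float)
    for distance, label in ranked[:k]:
        # Assumption: inverse-distance weight; eps avoids division by zero.
        weights[label] += 1.0 / (distance + eps)
    return max(weights, key=weights.get)


# Hypothetical toy data containing one repeated pattern.
raw_points = [(0.0, 0.0), (0.0, 0.0), (3.0, 0.0), (3.2, 0.0)]
raw_labels = ["a", "a", "b", "b"]

points, labels = drop_repeated_patterns(raw_points, raw_labels)
print(len(points))
print(weighted_knn(points, labels, (1.0, 0.0), k=3))
```

In this toy case two of the three neighbors belong to class "b", yet the single closer neighbor of class "a" carries more total weight, which is the effect distance weighting is meant to have.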
5.2 Advantages to K-Nearest Neighbors
1. No assumptions about the characteristics of the
concepts to be learned need to be made
2. Complex concepts can be learned by local
approximation using simple procedures
3. Very simple classifier that works well on basic
recognition problems.
5.3 Disadvantages to K-Nearest Neighbors
1. The model cannot be interpreted
2. It is computationally expensive to find the KNN
when the dataset is very large
3. It is a lazy learner; i.e. it does not learn anything
from the training data & simply uses the training
data itself for classification.
VI. DECISION TREE
6.1 Introduction to Decision Tree
A decision tree (DT) is a decision support tool
that uses a tree-like graph or model of decisions &
their possible consequences, including chance event
outcomes, resource costs & utility. It is one way to
display an algorithm.
The decision tree is a method for representing
information that was developed in the 1960s. It
determines the class label of a test pattern from the
values of its attributes. A DT is an acyclic graph whose
nodes are attributes that support decisions. A tree branch
represents a precedence connection between the
nodes [6]. The value of a branch is an element of the
attribute value set of the branch's parent node. The
attributes are nodes with at least two children,
because an attribute has as many branches as the
cardinality of the value set of that attribute.
The root of the tree is the common ancestor attribute,
from which classification starts. The
leaves represent the class nodes of the tree. In every
relation the class is only a child, so it is a leaf of the
tree in every case [6].
A DT builds a classification model in the form of a
tree. It breaks a dataset into smaller subsets while an
associated decision tree is incrementally developed.
The final result is a tree with decision nodes & leaf
nodes. A decision node has two or more branches
and a leaf node represents a decision. The topmost
node, also called the root node, corresponds to
the best predictor. A DT can handle both categorical &
numerical data [4]. A DT helps formalize the
brainstorming process so we can identify more
potential solutions.
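The text does not say how the "best predictor" for the root node is chosen. One common criterion is information gain, as used by ID3; the sketch below assumes that criterion and categorical attributes, and all names and data in it are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the data on `attribute`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    total = len(labels)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_predictor(rows, labels):
    """Attribute with the highest information gain: a candidate root node."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: "outlook" fully determines the class, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_predictor(rows, labels))
```

Applying the same selection recursively to each subset produced by the split would grow the rest of the tree.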
6.2 Advantages to Decision Tree
1. Easy to interpret.
2. Helps determine worst, best & expected values for
different scenarios.
6.3 Disadvantages to Decision Tree
1. Calculations can get very complex if many
values are uncertain.
VII. ARTIFICIAL NEURAL NETWORK
7.1 Introduction to Artificial Neural Network
Artificial neural networks (ANNs) are types of
computer architecture inspired by nervous systems
of the brain & are used to approximate functions that
can depend on a large number of inputs & are
generally unknown. ANNs are presented as systems
of interconnected “neurons” which can compute
values from inputs & are capable of machine
learning as well as pattern recognition due to their
adaptive nature [4]. The brain basically learns from
experience. It is natural proof that some problems
that are beyond the scope of current computers are
indeed solvable by small energy efficient packages.
This brain modeling also promises a less technical
way to develop machine solutions. This new
approach to computing also provides a more
graceful degradation during system overload than its
more traditional counterparts [8].
A neural network is a massively parallel-distributed
processor made up of simple processing
units, which has a natural propensity for storing
experiential knowledge & making it available for
use. Neural networks are also referred to in the literature
as neurocomputers, connectionist networks,
parallel-distributed processors, etc. A typical neural
network is shown in Fig. 4, where the input, hidden
& output layers are arranged in a feed-forward
manner [8]. The neurons are strongly interconnected
& organized into different layers. The input layer
receives the input & the output layer produces the
final output. In general, one or more hidden layers
are sandwiched in between the two [4]. This
structure makes it difficult to forecast or know the
exact flow of data.
Fig. 4. A simple neural network (input layer, hidden layer, output layer)
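The feed-forward arrangement of input, hidden and output layers can be sketched as below. This is a minimal illustration, assuming fully connected layers and a sigmoid activation; the layer sizes, weights and function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation, squashing any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def feed_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer, with no feedback loops."""
    hidden = layer_forward(inputs, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.1, -0.2]
output_w = [[1.0, -1.0]]
output_b = [0.0]

print(feed_forward([1.0, 2.0], hidden_w, hidden_b, output_w, output_b))
```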
ANNs typically start out with randomized
weights for all their neurons. This means that
initially they must be trained to solve the particular
problem for which they are proposed. During the
training period, we can evaluate whether the ANN's
output is correct by observing the pattern. If it is correct,
the neural weightings that produced that output are
reinforced; if the output is incorrect, the
weightings responsible can be diminished [4]. An
ANN is useful in a variety of real-world applications
that deal with complex, often incomplete data, such as
visual pattern recognition, speech recognition and
programs for text-to-speech.
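As a rough illustration of training from randomized weights, the sketch below uses the classic perceptron learning rule on a single neuron. This rule is swapped in as the simplest error-driven update and differs slightly from the description above: it leaves the weights unchanged when the output is correct and adjusts them only when it is wrong. The function names, learning rate, seed and the AND-gate data are assumptions made for the example.

```python
import random


def step(x):
    """Threshold activation: fire (1) only for a positive weighted sum."""
    return 1 if x > 0 else 0


def train_perceptron(samples, targets, learning_rate=0.1, max_epochs=1000, seed=0):
    """Train one neuron, starting from randomized weights."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0]]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(samples, targets):
            output = step(sum(w * xi for w, xi in zip(weights, x)) + bias)
            delta = target - output  # 0 when the output is correct
            if delta != 0:
                errors += 1
                # Move the weights towards (or away from) the input.
                weights = [w + learning_rate * delta * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * delta
        if errors == 0:  # every training pattern is classified correctly
            break
    return weights, bias


def predict(weights, bias, x):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical training task: the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
print([predict(weights, bias, x) for x in samples])
```

A single neuron of this kind can only learn linearly separable problems; multi-layer networks are usually trained with backpropagation instead.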
7.2 Advantages to Artificial Neural Network
1. It is easy to use, with few parameters to adjust.
2. A neural network learns & reprogramming is not
needed.
3. Applicable to a wide range of problems in real
life.
7.3 Disadvantages to Artificial Neural Network
1. Requires high processing time if neural network
is large.
2. Learning can be slow.
VIII. SUPPORT VECTOR MACHINE
8.1 Introduction to Support Vector Machine
A Support Vector Machine (SVM) performs
classification by finding the hyperplane that
maximizes the margin between the two classes. The
vectors (cases) that define the hyperplane are the
support vectors.
The beauty of SVM is that if the data is linearly
separable, there is a unique global minimum value.
An ideal SVM analysis should produce a hyperplane
that completely separates the vectors (cases) into
two non-overlapping classes. However, perfect
separation may not be possible, or it may result in a
model with so many cases that the model does not
classify correctly. In this situation SVM finds the
hyperplane that maximizes the margin & minimizes
the misclassifications [4]. The algorithm tries to
keep the slack variables at zero while maximizing the
margin. However, it does not minimize the number
of misclassifications (an NP-complete problem) but the
sum of distances from the margin hyperplanes [9].
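The trade-off between a wide margin and small slack can be made concrete with the standard soft-margin objective, 0.5·||w||² + C·Σ slack, where each slack value is the hinge loss max(0, 1 − y(w·x + b)). The sketch below only evaluates this objective for a given hyperplane; it does not train an SVM. The penalty parameter `C`, the function name and the data are assumptions for illustration.

```python
def soft_margin_objective(w, b, points, labels, C=1.0):
    """Return the soft-margin objective and the per-point slack values.

    Labels must be +1 or -1. A slack of 0 means the point lies on the
    correct side of its margin hyperplane.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)  # small ||w|| = wide margin
    slack = [
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in zip(points, labels)
    ]
    return margin_term + C * sum(slack), slack


# Hypothetical hyperplane x1 = 0 and three labelled points.
w, b = (1.0, 0.0), 0.0
points = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
labels = [1, -1, 1]

objective, slack = soft_margin_objective(w, b, points, labels)
print(objective, slack)
```

The third point is classified correctly but falls inside the margin, so it contributes a non-zero slack even though it is not a misclassification.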
The simplest way to separate two groups of data
is with a straight line (1 dimension), flat plane (2
dimensions) or an N-dimensional hyperplane as
shown in Fig. 5.
Fig. 5. Hyperplane in SVM
The main reasons to use an SVM
instead of logistic regression are that the problem
might not be linearly separable and that the data may
lie in a high-dimensional space [9].
8.2 Advantages to Support Vector Machine
1. Most robust & accurate classification technique
for binary problems.
2. Memory-intensive.
8.3 Disadvantages to Support Vector Machine
1. It can be painfully inefficient to train
2. High complexity & extensive memory
requirements for classification in many cases.
3. Natively supports only binary classification; multi-class
problems must be decomposed into several binary problems
(e.g. one-vs-rest).
IX. COMPARATIVE STUDY OF ALGORITHMS
As per the comparative analysis in Table 2, we can
say that ANN is better suited for data classification. Neural
networks allow learning from experience &
support decision making, classification, pattern
recognition, etc. Neural networks often exhibit
patterns similar to those exhibited by humans.
However, this is of more interest in the cognitive
sciences than for practical examples.
TABLE -2: Comparative Study of Algorithms

Algo.: NB
Approach: Frequency Table
Features:
• Simple to implement.
• Great computational efficiency & classification rate.
• It predicts accurate results for most of the classification & prediction problems.
Flaws:
• The precision of the algorithm decreases if the amount of data is less.
• For obtaining good results it requires a very large number of records.

Algo.: KNN
Approach: Similarity Function
Features:
• Classes need not be linearly separable.
• Zero cost of the learning process.
• Well suited for multimodal classes.
Flaws:
• Time to find the nearest neighbors in a large training data set can be excessive.
• Performance of the algorithm depends on the number of dimensions used.

Algo.: DT
Approach: Frequency Table
Features:
• It produces more accurate results than the C4.5 algorithm.
• Detection rate is increased & space consumption is reduced.
Flaws:
• Requires large searching time.
• Sometimes it may generate very long rules which are very hard to prune.
• Requires a large amount of memory to store the tree.

Algo.: ANN
Approach: Others
Features:
• It is easy to use & implement, with few parameters to adjust.
• A neural network learns & reprogramming is not needed.
• Applicable to a wide range of problems in real life.
Flaws:
• Requires high processing time if the neural network is large.
• Learning can be slow.

Algo.: SVM
Approach: Others
Features:
• High accuracy.
• Works well even if data is not linearly separable in the base feature space.
Flaws:
• Speed & size requirement both in training & testing is more.
• High complexity & extensive memory requirements for classification in many cases.
X. EXPERIMENTAL ANALYSIS
To examine all the studied methods, the Lung Cancer &
Iris benchmark datasets are used. Results are for a
scenario in which 100% of the data is used for training
& 100% for testing. Result analysis on the basis of
accuracy is given in Table 3.
Accuracy is calculated as:
$$\text{Accuracy} = \frac{\text{Total number of correctly classified data}}{\text{Total number of data}}$$
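The accuracy measure can be computed with a short helper like the one below. It is a sketch only: the function name and the example labels are hypothetical, and the numbers are not taken from the experiments in this paper.

```python
def accuracy(predicted, actual):
    """Fraction of instances whose predicted label equals the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty dataset")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical example: three of four instances are classified correctly.
predicted = ["setosa", "versicolor", "setosa", "virginica"]
actual = ["setosa", "versicolor", "versicolor", "virginica"]

print(f"{accuracy(predicted, actual):.1%}")
```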
TABLE -3: Result Analysis of Methods Using WEKA Tool
Method   Iris     Lung Cancer
NB       94.7%    97.4%
KNN      94.0%    97.0%
DT       95.0%    98.0%
SVM      65.0%    98.5%
ANN      98.3%    100%
Fig 6: Chart of Result Analysis (bar chart, scale 0 to 100, of NB, KNN, DT, SVM & ANN on the Iris & Lung Cancer datasets)
XI. CONCLUSION
In this paper, we have studied five different
classification methods based on approaches such as
frequency tables, covariance matrix, similarity
functions & others. Those algorithms are NB, KNN,
DT, ANN & SVM. From the comparative study of
all the algorithms, we conclude that
ANN is the most suitable & efficient technique for
classification. ANNs are considered simplified
mathematical models of the human brain & they
function as parallel distributed computing networks.
ANNs are universal function approximators, &
they usually deliver good performance in
applications. ANN has generalization ability as well
as learnability. It is easy to use & implement, &
applicable to real-world problems.
There is huge scope in this area of
classification for using different ANN-based methods
such as Fuzzy with ANN, Neuro-fuzzy, Genetic
Approach, etc.
REFERENCES
Journal Papers:
[1] M Ozaki, Y. Adachi, Y. Iwahori, and N. Ishii, Application
of fuzzy theory to writer recognition of Chinese characters,
International Journal of Modeling and Simulation, 18(2),
1998, 112-116.
[2] R. Andrews, J. Diederich & A. B., Tickle, “Survey &
critique of techniques for extracting rules from trained
artificial neural networks,” Knowledge Based System, vol. 8,
no. 6, pp. 373-389, 1995.
[3] Rashedur M. Rahman, Farhana Afroz, “Comparison of
Various Classification Techniques Using Different Data
Mining Tools for Diabetes Diagnosis”, Journal of Software
Engineering & Applications, 6, 85-97, 2013.
[4] Wang Xin, Yu Hongliang, Zhang Lin, Huang Chaoming,
Duan Jing, “Improved Naive Bayesian Classifier Method &
the Application in Diesel Engine Valve Fault Diagnostic”,
Third International Conference On Measuring Technology
& Mechatronics Automation, 2011.
[5] Sagar S. Nikam, “A Comparative Study Of Classification
Techniques In Data Mining Algorithms”, Oriental Journal
Of Computer Science & Technology, April 2015.
[6] Zolboo Damiran, Khuder Altangerelt, “Text Classification
Experiments On Mongolian Language”, IEEE Conference,
Jul 2013.
[7] Zolboo Damiran, Khuder Altangerel, “Author Identification:
An Experiment based on Mongolian Literature using
Decision Tree”, IEEE Conference, 2013.
[8] Essaid el Haji, Abdellah Azmani, Mohm el Harzli, “A
pairing individual-trades system , using KNN method”,
IEEE Conference, 2014.
[9] Kumar Abhishek, Abhay Kumar, Rajeev Ranjan, Sarthak
K., “A Rainfall Prediction Model using Artificial Neural
Network”, IEEE Conference, 2012.
[10] Erlin, Unang Rio, Rahmiati, “Text Message Categorization
of Collaborative Learning Skills in Online Discussion Using
SVM”, IEEE Conference, 2013.

Phase Transformations and Thermodynamics in the System Fe2О3– V2О5 – MnО – Si...IJERA Editor
 
from Ruby to Objective-C
from Ruby to Objective-Cfrom Ruby to Objective-C
from Ruby to Objective-CEddie Kao
 
Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...
Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...
Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...IJERA Editor
 
Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...
Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...
Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...IJERA Editor
 
How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...
How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...
How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...IJERA Editor
 

Andere mochten auch (20)

A Wind driven PV- FC Hybrid System and its Power Management Strategies in a Grid
A Wind driven PV- FC Hybrid System and its Power Management Strategies in a GridA Wind driven PV- FC Hybrid System and its Power Management Strategies in a Grid
A Wind driven PV- FC Hybrid System and its Power Management Strategies in a Grid
 
Improved Reliability Memory’s Module Structure for Critical Application Systems
Improved Reliability Memory’s Module Structure for Critical Application SystemsImproved Reliability Memory’s Module Structure for Critical Application Systems
Improved Reliability Memory’s Module Structure for Critical Application Systems
 
Analysis of soil arching effect with different cross-section anti-slide pile
Analysis of soil arching effect with different cross-section anti-slide pileAnalysis of soil arching effect with different cross-section anti-slide pile
Analysis of soil arching effect with different cross-section anti-slide pile
 
Evaluation of Energy Contribution for Additional Installed Turbine Flow in Hy...
Evaluation of Energy Contribution for Additional Installed Turbine Flow in Hy...Evaluation of Energy Contribution for Additional Installed Turbine Flow in Hy...
Evaluation of Energy Contribution for Additional Installed Turbine Flow in Hy...
 
Laboratory Performance Of Evaporative Cooler Using Jute Fiber Ropes As Coolin...
Laboratory Performance Of Evaporative Cooler Using Jute Fiber Ropes As Coolin...Laboratory Performance Of Evaporative Cooler Using Jute Fiber Ropes As Coolin...
Laboratory Performance Of Evaporative Cooler Using Jute Fiber Ropes As Coolin...
 
A survey on RBF Neural Network for Intrusion Detection System
A survey on RBF Neural Network for Intrusion Detection SystemA survey on RBF Neural Network for Intrusion Detection System
A survey on RBF Neural Network for Intrusion Detection System
 
Shortest Tree Routing With Security In Wireless Sensor Networks
Shortest Tree Routing With Security In Wireless Sensor NetworksShortest Tree Routing With Security In Wireless Sensor Networks
Shortest Tree Routing With Security In Wireless Sensor Networks
 
Identifying Structures in Social Conversations in NSCLC Patients through the ...
Identifying Structures in Social Conversations in NSCLC Patients through the ...Identifying Structures in Social Conversations in NSCLC Patients through the ...
Identifying Structures in Social Conversations in NSCLC Patients through the ...
 
An Approach for Object and Scene Detection for Blind Peoples Using Vocal Vision.
An Approach for Object and Scene Detection for Blind Peoples Using Vocal Vision.An Approach for Object and Scene Detection for Blind Peoples Using Vocal Vision.
An Approach for Object and Scene Detection for Blind Peoples Using Vocal Vision.
 
Identification and Investigation of Solid Waste Dump in Salem District
Identification and Investigation of Solid Waste Dump in Salem DistrictIdentification and Investigation of Solid Waste Dump in Salem District
Identification and Investigation of Solid Waste Dump in Salem District
 
Study of Effecting Factors on Housing Price by Hedonic Model A Case Study of ...
Study of Effecting Factors on Housing Price by Hedonic Model A Case Study of ...Study of Effecting Factors on Housing Price by Hedonic Model A Case Study of ...
Study of Effecting Factors on Housing Price by Hedonic Model A Case Study of ...
 
Examen fffinal
Examen fffinalExamen fffinal
Examen fffinal
 
Bullyngsara
BullyngsaraBullyngsara
Bullyngsara
 
Simulation based approach for Fixing Optimum number of Stages for a MMC
Simulation based approach for Fixing Optimum number of Stages for a MMCSimulation based approach for Fixing Optimum number of Stages for a MMC
Simulation based approach for Fixing Optimum number of Stages for a MMC
 
Åpen dag 2013
Åpen dag 2013Åpen dag 2013
Åpen dag 2013
 
Phase Transformations and Thermodynamics in the System Fe2О3– V2О5 – MnО – Si...
Phase Transformations and Thermodynamics in the System Fe2О3– V2О5 – MnО – Si...Phase Transformations and Thermodynamics in the System Fe2О3– V2О5 – MnО – Si...
Phase Transformations and Thermodynamics in the System Fe2О3– V2О5 – MnО – Si...
 
from Ruby to Objective-C
from Ruby to Objective-Cfrom Ruby to Objective-C
from Ruby to Objective-C
 
Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...
Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...
Implementation of RTOS on STM32F4 Microcontroller to Control Parallel Boost f...
 
Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...
Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...
Synthesis, Characterization and Electrical Properties of Polyaniline Doped wi...
 
How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...
How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...
How was Mathematics taught in the Arab-Islamic Civilization? – Part 1: The Pe...
 

Ähnlich wie Hypothesis on Different Data Mining Algorithms

Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionIRJET Journal
 
Classification Techniques: A Review
Classification Techniques: A ReviewClassification Techniques: A Review
Classification Techniques: A ReviewIOSRjournaljce
 
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...IRJET Journal
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesIRJET Journal
 
Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...IJSRD
 
Singular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptxSingular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptxrajalakshmi5921
 
EDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptxEDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptxrajalakshmi5921
 
Effective data mining for proper
Effective data mining for properEffective data mining for proper
Effective data mining for properIJDKP
 
Comparative study of classification algorithm for text based categorization
Comparative study of classification algorithm for text based categorizationComparative study of classification algorithm for text based categorization
Comparative study of classification algorithm for text based categorizationeSAT Journals
 
Comparison of Text Classifiers on News Articles
Comparison of Text Classifiers on News ArticlesComparison of Text Classifiers on News Articles
Comparison of Text Classifiers on News ArticlesIRJET Journal
 
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...cscpconf
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerIJERA Editor
 
Analysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry SystemAnalysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry SystemIJSRD
 
Analysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry SystemAnalysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry SystemIJSRD
 
A SURVEY ON DATA MINING IN STEEL INDUSTRIES
A SURVEY ON DATA MINING IN STEEL INDUSTRIESA SURVEY ON DATA MINING IN STEEL INDUSTRIES
A SURVEY ON DATA MINING IN STEEL INDUSTRIESIJCSES Journal
 
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSA HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSEditor IJCATR
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESVikash Kumar
 

Ähnlich wie Hypothesis on Different Data Mining Algorithms (20)

Review of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & PredictionReview of Algorithms for Crime Analysis & Prediction
Review of Algorithms for Crime Analysis & Prediction
 
Classification Techniques: A Review
Classification Techniques: A ReviewClassification Techniques: A Review
Classification Techniques: A Review
 
IJET-V2I6P32
IJET-V2I6P32IJET-V2I6P32
IJET-V2I6P32
 
Ijetcas14 338
Ijetcas14 338Ijetcas14 338
Ijetcas14 338
 
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
E-Healthcare monitoring System for diagnosis of Heart Disease using Machine L...
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering Techniques
 
Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...
 
Singular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptxSingular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptx
 
EDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptxEDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptx
 
Effective data mining for proper
Effective data mining for properEffective data mining for proper
Effective data mining for proper
 
Comparative study of classification algorithm for text based categorization
Comparative study of classification algorithm for text based categorizationComparative study of classification algorithm for text based categorization
Comparative study of classification algorithm for text based categorization
 
Comparison of Text Classifiers on News Articles
Comparison of Text Classifiers on News ArticlesComparison of Text Classifiers on News Articles
Comparison of Text Classifiers on News Articles
 
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
 
Analysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry SystemAnalysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry System
 
Analysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry SystemAnalysis on Student Admission Enquiry System
Analysis on Student Admission Enquiry System
 
A SURVEY ON DATA MINING IN STEEL INDUSTRIES
A SURVEY ON DATA MINING IN STEEL INDUSTRIESA SURVEY ON DATA MINING IN STEEL INDUSTRIES
A SURVEY ON DATA MINING IN STEEL INDUSTRIES
 
Hx3115011506
Hx3115011506Hx3115011506
Hx3115011506
 
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSA HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
 

Kürzlich hochgeladen

data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf203318pmpc
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...soginsider
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoorTop Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoordharasingh5698
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptNANDHAKUMARA10
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringmulugeta48
 

Kürzlich hochgeladen (20)

data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoorTop Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 

Hypothesis on Different Data Mining Algorithms

Shraddha Deshmukh Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 5, Issue 12, (Part - 3) December 2015, pp.86-91

Hypothesis on Different Data Mining Algorithms
Shraddha Deshmukh*, Swati Shinde**
*(Department of Information Technology, University of Pune, Pune - 05)
**(Department of Information Technology, University of Pune, Pune - 05)

ABSTRACT
In this paper, different classification algorithms for data mining are discussed. Data mining is about explaining the past & predicting the future by means of data analysis. Classification is a data mining task which categorizes data based on numerical or categorical variables. Many algorithms have been proposed to classify data; five of them are studied comparatively here. There are four different classification approaches, namely Frequency Table, Covariance Matrix, Similarity Functions & Others. As research work on classification methods, the Naive Bayesian, K Nearest Neighbors, Decision Tree, Artificial Neural Network & Support Vector Machine algorithms are studied & examined using benchmark datasets like Iris & Lung Cancer.

Keywords - Artificial Neural Network, Classification, Data Mining, Decision Tree, K-Nearest Neighbors, Naive Bayesian & Support Vector Machine.

I. INTRODUCTION
Nowadays a large amount of data is being gathered and stored in databases across the globe, and it is increasing continuously. Different organizations & research centers hold data in terabytes, in some cases over 1000 terabytes. So, we need to mine those databases for better use. Data mining is about explaining the past & predicting the future. It is a collaborative field which combines technologies like statistics, machine learning, artificial intelligence & databases. The importance of data mining applications is predicted to be huge.
Many organizations have collected tremendous amounts of data over years of operation, & data mining is the process of extracting knowledge from the gathered data. The organizations are then able to use the extracted knowledge to gain more clients, more sales & greater profits. This is also true in the engineering & medical fields.

1.1 DATA MINING
Data mining is the process of organising available data into a useful format. Fig. 1 shows the basic concept of data mining. Basic terms in data mining are:
• Statistics: The science of collecting, classifying, summarizing, organizing, analysing & interpreting data.
• Artificial Intelligence: The study of computer algorithms which simulate intelligent behaviour for the execution of special activities.
• Machine Learning: The study of computer algorithms that learn from experience and use it for automation.
• Database: The science & technology of collecting, storing & managing data so users can retrieve, insert, modify or delete such data.
• Data warehousing: The science & technology of collecting, storing & managing data with advanced multi-dimensional reporting services in support of decision making processes.
• Predicting the Future: Data mining predicts the future by means of modelling.
• Modelling: The process in which a classification model is created to predict an outcome.

Fig. 1. Concept of data mining

II. CLASSIFICATION
Classification is a data mining task of predicting the value of a categorical variable (target or class) by building a model based on one or more numerical and/or categorical variables (predictors or attributes). It classifies data based on the training set & class labels. Examples:
• Classifying patients by their symptoms,
• Classifying goods by their properties, etc.
There are some common terms used in the classification process.
Table 1 illustrates the basic terms used in the classification process: pattern (records, rows), attributes (dimensions, columns), class (output column) and class label (tag of class).

RESEARCH ARTICLE OPEN ACCESS
TABLE 1: Terms Used in Classification

Classification is a data mining method for predicting the value of data instances by using previous experience. Since we want to predict either a positive or a negative response, we will build a binary classification model. Classification is important because it helps scientists to clearly diagnose problems, study & observe them & organize concentrated conservation efforts. It also serves as a way of remembering & differentiating types of symptoms, making predictions about diseases of the same type, classifying the relationship between different defects & providing precise names for diseases.

2.1 Applications of Classification
Classification has several applications, such as Medical Diagnosis, Breast Cancer Diagnosis, Market Targeting, Image Processing, Wine Classification, Soil Classification for selection of fertilizer, etc.

III. CLASSIFICATION ALGORITHMS
There is quite a lot of research on algorithms that classify data. Several approaches have been developed for classification in data mining. Fig. 2 shows the hierarchy of classification algorithms.

Fig. 2. Hierarchy of classification algorithms

IV. NAÏVE BAYESIAN
4.1 Introduction to Naïve Bayesian
The Naive Bayesian (NB) method is a simple probabilistic classifier based on Bayes' Theorem (from Bayesian statistics) with strong (naive) independence assumptions, i.e. it assumes that all features are independent of one another. The NB model is easy to build, with no complicated iterative parameter estimation, which makes it particularly useful for very large datasets [1]. In NB classifiers, every feature contributes to determining which topic should be assigned to a given input value.
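The independence assumption means the score of a class is its prior probability multiplied by one conditional probability per feature. The following is a minimal sketch of that idea for categorical features, not the authors' implementation: the function names `train_nb` and `predict_nb`, the toy symptom data and the add-one (Laplace) smoothing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    vocab = defaultdict(set)             # feature index -> set of observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab

def predict_nb(model, row):
    """Return the class with the highest prior * product of likelihoods."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class) from training frequencies
        for i, value in enumerate(row):
            # Add-one smoothed estimate of P(value | class); smoothing is an
            # assumption added here so unseen values do not zero the score.
            seen = value_counts[(i, label)][value]
            score *= (seen + 1) / (count + len(vocab[i]))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data: (temperature symptom, respiratory symptom) -> diagnosis
rows = [("fever", "cough"), ("fever", "none"), ("none", "cough"), ("none", "none")]
labels = ["flu", "flu", "cold", "healthy"]

model = train_nb(rows, labels)
prediction = predict_nb(model, ("fever", "cough"))
print(prediction)
```

Because each feature contributes an independent factor, training reduces to counting, which is why no iterative parameter estimation is needed.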
To choose a topic for an input value, the naive Bayes classifier begins by evaluating the prior probability of each topic, which is determined by checking the frequency of each topic in the training set. The contribution from each feature is then combined with this prior probability to arrive at a probability estimate for each topic. The topic with the highest estimated probability is then assigned to the test input [5]. A supervised classifier is built on training corpora containing the correct topic for each input. The framework used by Bayesian classification is shown in Fig. 3.
Fig. 3. Bayesian classification
(a) During training, a feature extractor is used to convert each input value to a feature set. These feature sets capture the basic information about each input that should be used to classify it. Pairs of feature sets & topics are fed into the machine learning algorithm to generate a model.
(b) During prediction, the same feature extractor is used to convert unseen inputs to feature sets. These feature sets are then fed into the model, which generates predicted topics [5].
4.2 Advantages to Naive Bayesian
1. Fast to train & classify
2. Not sensitive to irrelevant features
3. Handles real & discrete data
4. Handles streaming data well
4.3 Disadvantages to Naive Bayesian
1. Assumes independence of features
2. Dependencies that exist among variables (e.g. Hospital: Patients; Diseases: Diabetes, Cancer) cannot be modelled by NB.
V. K NEAREST NEIGHBOR
5.1 Introduction to K-Nearest Neighbor
K nearest neighbor (KNN) is a simple method that stores all available cases & classifies new cases based on a similarity measure (e.g. Euclidean distance). KNN has been used in statistical estimation & pattern recognition as a non-parametric technique since the beginning of the 1970s. The nearest neighbor rule determines the class of an unknown data point on the basis of its closest neighbor whose class is already known [4]. The training points are assigned weights according to their distances from the sample data point. Computational complexity & memory requirements nevertheless remain the main concerns. To overcome the memory limitation, the size of the dataset is reduced: repeated patterns that add no information are excluded from the training data set, & data points that do not influence the result are also eliminated [4]. The training dataset can also be organized using different structures to overcome the memory limitation of KNN. KNN can be implemented using a ball tree, k-d tree, nearest feature line (NFL), principal axis search tree or orthogonal search tree. The tree-structured training data is divided into nodes, whereas techniques like NFL & tunable metric divide the training data set according to planes. These structures can increase the speed of the basic KNN algorithm. Consider that an object is sampled with a set of different attributes.
5.2 Advantages to K-Nearest Neighbors
1. No assumptions have to be made about the characteristics of the concepts to be learned
2. Complex concepts can be learned by local approximation using simple procedures
3. Very simple classifier that works well on basic recognition problems.
5.3 Disadvantages to K-Nearest Neighbors
1. The model cannot be interpreted
2. It is computationally expensive to find the k nearest neighbors when the dataset is very large
3. It is a lazy learner; i.e. it does not learn anything from the training data & simply uses the training data itself for classification.
VI. DECISION TREE
6.1 Introduction to Decision Tree
A decision tree (DT) is a decision support tool that uses a tree-like graph or model of decisions & their possible consequences, including chance event outcomes, resource costs & utility. It is one way to display an algorithm. The decision tree is a method of information representation developed in the 1960s. It can determine the class label of test patterns by using the values of their attributes. A DT is an acyclic graph whose nodes are attributes that support decisions. A tree branch represents a precedence connection between nodes [6]. The value of a branch is an element of the attribute value set of the branch’s parent node. The attributes are nodes with at least two children, because an attribute has as many branches as the cardinality of its value set. The root of the tree is the common ancestor attribute, from which classification starts. The leaves represent the class nodes of the tree. In every relation the class is only a child, so it is a leaf of the tree in every case [6].
DT builds a classification model in the form of a tree. It breaks a dataset into smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes & leaf nodes. A decision node has two or more branches & a leaf node represents a decision. The topmost node, also called the root node, corresponds to the best predictor. DT can handle both categorical & numerical data [4]. DT helps formalize the brainstorming process so that more potential solutions can be identified.
6.2 Advantages to Decision Tree
1. Easy to interpret.
2. Helps determine worst, best & expected values for different scenarios.
6.3 Disadvantages to Decision Tree
1. Calculations can get very complex if many values are uncertain.
VII. ARTIFICIAL NEURAL NETWORK
7.1 Introduction to Artificial Neural Network
Artificial neural networks (ANNs) are a type of computing architecture inspired by the nervous system of the brain & are used to approximate functions that can depend on a large number of inputs & are generally unknown. ANNs are presented as systems of interconnected “neurons” which can compute values from inputs & are capable of machine learning as well as pattern recognition due to their adaptive nature [4].
The brain basically learns from experience. It is natural proof that some problems that are beyond the scope of current computers are indeed solvable by small energy-efficient packages. This brain modeling also promises a less technical way to develop machine solutions. This approach to computing also provides a more graceful degradation during system overload than its more traditional counterparts [8].
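To make the idea of a “neuron” that computes a value from its inputs more concrete, here is a minimal Python sketch. It assumes a weighted sum followed by a sigmoid activation, which is one common choice and is not specified in the text above; the function name `neuron_output` and all weights, bias & input values are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias):
    """Compute one neuron's output: sigmoid of the weighted sum of its inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The sigmoid squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-weighted_sum))


# Hypothetical values chosen for illustration (not taken from the paper):
# weighted sum = 0.5 * 2.0 + (-1.0) * 1.0 + 0.0 = 0.0, so the output is 0.5.
example = neuron_output([0.5, -1.0], [2.0, 1.0], 0.0)
```

A network is then built by feeding the outputs of several such neurons into further neurons; training adjusts the weights & bias, which this sketch keeps fixed.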
A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge & making it available for use. Neural networks are also referred to in the literature as neurocomputers, connectionist networks, parallel distributed processors, etc. A typical neural network is shown in Fig. 4, where input, hidden & output layers are arranged in a feed-forward manner [8]. The neurons are strongly interconnected & organized into different layers. The input layer receives the input & the output layer produces the final output. In general, one or more hidden layers are sandwiched between the two [4]. This structure makes it difficult to forecast or trace the exact flow of data.
Fig. 4. A simple neural network (input layer, hidden layer, output layer)
ANNs typically start out with randomized weights for all their neurons. This means that initially they must be trained to solve the particular problem for which they are proposed. During the training period, we can evaluate whether the ANN’s output is correct by observing the output pattern. If it is correct, the neural weightings that produced that output are reinforced; if the output is incorrect, the weightings responsible are diminished [4]. ANNs are useful in a variety of real-world applications that deal with complex, often incomplete data, such as visual pattern recognition, speech recognition & text-to-speech programs.
7.2 Advantages to Artificial Neural Network
1. It is easy to use, with few parameters to adjust.
2. A neural network learns & reprogramming is not needed.
3. Applicable to a wide range of problems in real life.
7.3 Disadvantages to Artificial Neural Network
1. Requires high processing time if neural network is large.
2. Learning can be slow.
VIII. SUPPORT VECTOR MACHINE
8.1 Introduction to Support Vector Machine
A Support Vector Machine (SVM) performs classification by finding the hyperplane that maximizes the margin between the two classes. The vectors (cases) that define the hyperplane are the support vectors. The beauty of SVM is that if the data is linearly separable, there is a unique global minimum value. An ideal SVM analysis should produce a hyperplane that completely separates the vectors (cases) into two non-overlapping classes. However, perfect separation may not be possible, or it may result in a model with so many cases that the model does not classify correctly. In this situation SVM finds the hyperplane that maximizes the margin & minimizes the misclassifications [4]. The algorithm tries to keep the slack variables at zero while maximizing the margin. However, it does not minimize the number of misclassifications (an NP-complete problem) but the sum of distances from the margin hyperplanes [9]. The simplest way to separate two groups of data is with a straight line (1 dimension), a flat plane (2 dimensions) or an N-dimensional hyperplane, as shown in Fig. 5.
Fig. 5. Hyperplane in SVM
The main reasons to use an SVM instead of Logistic Regression are that the problem might not be linearly separable & that the data may lie in a high-dimensional space [9].
8.2 Advantages to Support Vector Machine
1. Most robust & accurate classification technique for binary problems.
2. Memory-intensive.
8.3 Disadvantages to Support Vector Machine
1. It can be painfully inefficient to train
2. High complexity & extensive memory requirements for classification in many cases.
3. Supports only binary classification.
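The margin & slack quantities described in Section 8.1 can be illustrated with a small Python sketch. It assumes a linear SVM whose hyperplane is given as w·x + b = 0 with margin hyperplanes at w·x + b = ±1, which is the usual textbook formulation rather than something stated in the paper. The function names, the weight vector & the data points are hypothetical, & no training is performed: the hyperplane is simply given.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class (+1 or -1)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_width(w):
    """Distance between the margin hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def total_slack(w, b, points, labels):
    """Sum over all cases of the slack max(0, 1 - y * (w.x + b))."""
    return sum(
        max(0.0, 1.0 - y * decision_value(w, b, x))
        for x, y in zip(points, labels)
    )


# Hypothetical 2-D cases and hyperplane, chosen only for illustration.
points = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (0.5, 0.0)]
labels = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0

print(margin_width(w))                     # prints 2.0
print(total_slack(w, b, points, labels))   # prints 1.5
```

Only the last case, which lies on the wrong side of the hyperplane, contributes slack. A soft-margin SVM trades a wide margin against a small total slack, rather than counting misclassified cases directly.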
IX. COMPARATIVE STUDY OF ALGORITHMS
As per the comparative analysis in Table 2, we can say that ANN is better for data classification. A neural network allows learning from experience & supports decision making, classification, pattern recognition, etc. Neural networks often exhibit patterns similar to those exhibited by humans. However, this is more of interest in cognitive sciences than for practical examples.
TABLE 2: Comparative Study of Algorithms
Algo.: NB
Approach: Frequency Table
Features:
- Simple to implement.
- Great computational efficiency & classification rate.
- It predicts accurate results for most classification & prediction problems.
Flaws:
- The precision of the algorithm decreases if the amount of data is small.
- It requires a very large number of records to obtain good results.
Algo.: KNN
Approach: Similarity Function
Features:
- Classes need not be linearly separable.
- Zero cost of the learning process.
- Well suited for multimodal classes.
Flaws:
- Time to find the nearest neighbors in a large training data set can be excessive.
- Performance of the algorithm depends on the number of dimensions used.
Algo.: DT
Approach: Frequency Table
Features:
- It produces more accurate results than the C4.5 algorithm.
- Detection rate is increased & space consumption is reduced.
Flaws:
- Requires large searching time.
- Sometimes it may generate very long rules which are very hard to prune.
- Requires a large amount of memory to store the tree.
Algo.: ANN
Approach: Others
Features:
- It is easy to use & implement, with few parameters to adjust.
- A neural network learns & reprogramming is not needed.
- Applicable to a wide range of problems in real life.
Flaws:
- Requires high processing time if the neural network is large.
- Learning can be slow.
Algo.: SVM
Approach: Others
Features:
- High accuracy.
- Works well even if data is not linearly separable in the base feature space.
Flaws:
- Speed & size requirements in both training & testing are high.
- High complexity & extensive memory requirements for classification in many cases.
X. EXPERIMENTAL ANALYSIS
To examine all the studied methods, the Lung Cancer & Iris benchmark datasets are used. Results are for the 100% training & 100% testing scenario. Result analysis on the basis of accuracy is given in Table 3. Accuracy is calculated as:
\[ \text{Accuracy} = \frac{\text{Total number of correctly classified data}}{\text{Total number of data}} \]
TABLE 3: Result Analysis of Methods Using WEKA Tool
Method   Iris     Lung Cancer
NB       94.7%    97.4%
KNN      94.0%    97.0%
DT       95.0%    98.0%
SVM      65.0%    98.5%
ANN      98.3%    100%
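The accuracy measure defined in this section can be sketched in Python as follows. This is only an illustration of the formula, not the WEKA procedure used to obtain Table 3; the function name `accuracy` & the example labels are hypothetical & are not the paper's data.

```python
def accuracy(actual, predicted):
    """Fraction of instances whose predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one instance is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels for five instances (not taken from the paper).
actual = ["setosa", "setosa", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor", "versicolor", "virginica"]

print(f"{accuracy(actual, predicted):.1%}")  # prints "80.0%"
```

In the 100% training & 100% testing scenario, `actual` & `predicted` would both refer to the same instances the model was trained on, so the resulting figure measures fit to the training data rather than performance on unseen data.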
Fig. 6. Chart of Result Analysis (accuracy on a 0 to 100 scale for NB, KNN, DT, SVM & ANN; series: Iris & Lung Cancer)
XI. CONCLUSION
In this paper, we have studied five different classification methods based on approaches like frequency tables, covariance matrix, similarity functions & others. Those algorithms are NB, KNN, DT, ANN & SVM. From the comparative study of all the algorithms, we reached the conclusion that ANN is the most suitable & efficient technique for classification. ANNs are considered simplified mathematical models of the human brain & they function as parallel distributed computing networks. ANNs are universal function approximators, & they usually deliver good performance in applications. ANN has generalization ability as well as learnability. It is easy to use & implement & is applicable to real-world problems. There is huge scope in this area of classification for using different ANN-based methods such as Fuzzy with ANN, Neuro-fuzzy, Genetic Approach, etc.
REFERENCES
Journal Papers:
[1] M. Ozaki, Y. Adachi, Y. Iwahori & N. Ishii, “Application of fuzzy theory to writer recognition of Chinese characters”, International Journal of Modeling and Simulation, 18(2), 1998, 112-116.
[2] R. Andrews, J. Diederich & A. B. Tickle, “Survey & critique of techniques for extracting rules from trained artificial neural networks”, Knowledge-Based Systems, vol. 8, no. 6, pp. 373-389, 1995.
[3] Rashedur M. Rahman, Farhana Afroz, “Comparison of Various Classification Techniques Using Different Data Mining Tools for Diabetes Diagnosis”, Journal of Software Engineering & Applications, 6, 85-97, 2013.
[4] Wang Xin, Yu Hongliang, Zhang Lin, Huang Chaoming, Duan Jing, “Improved Naive Bayesian Classifier Method & the Application in Diesel Engine Valve Fault Diagnostic”, Third International Conference on Measuring Technology & Mechatronics Automation, 2011.
[5] Sagar S. Nikam, “A Comparative Study of Classification Techniques in Data Mining Algorithms”, Oriental Journal of Computer Science & Technology, April 2015.
[6] Zolboo Damiran, Khuder Altangerel, “Text Classification Experiments on Mongolian Language”, IEEE Conference, Jul 2013.
[7] Zolboo Damiran, Khuder Altangerel, “Author Identification: An Experiment based on Mongolian Literature using Decision Tree”, IEEE Conference, 2013.
[8] Essaid el Haji, Abdellah Azmani, Mohm el Harzli, “A pairing individual-trades system, using KNN method”, IEEE Conference, 2014.
[9] Kumar Abhishek, Abhay Kumar, Rajeev Ranjan, Sarthak K., “A Rainfall Prediction Model using Artificial Neural Network”, IEEE Conference, 2012.
[10] Erlin, Unang Rio, Rahmiati, “Text Message Categorization of Collaborative Learning Skills in Online Discussion Using SVM”, IEEE Conference, 2013.