SlideShare ist ein Scribd-Unternehmen logo
1 von 8
Downloaden Sie, um offline zu lesen
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org

A Comparative Study of CN2 Rule and SVM Algorithm and
Prediction of Heart Disease Datasets Using Clustering Algorithms
Ramaraj.M*1, Dr.Antony Selvadoss Thanamani*2
1

Research Scholar Department of Computer Science NGM College Pollachi India
ramaraj302@gmail.com.
2
Associate Professor & Head Department of Computer Science NGM College Pollachi
India
selvadoss@yahoo.com
Abstract
In this paper, we discuss diagnosis analysis and identification of heart disease using with data mining techniques.
The heart disease is a major cause of morbidity and mortality in modern society; it is extremely important but
complicated task that should be performed accurately and efficiently. It is an huge amount data of leads medical data
to the need for powerful data analysis tools are availability on the data mining technique. They have long to been an
concerned with applying for statistical and data mining tools and data mining techniques to improve data analysis on
large datasets. In this paper, to proposed system are implemented to find out the heart disease through as to
compared with the some data mining techniques are Decision tree, SOM, CN2 Rule and K-Means Clustering the
data mining could help in the identification or the prediction of high or low risk of Heart Disease.
Keywords: Data Mining, Heart Disease, Decision tree, SOM, CN2 Rule and cluster.

1. Introduction
1.1. Overview of data mining
The data mining concern of database technologists was to find efficient method of storing, retrieving and falsify
data, The main concern of the machine learning community was to develop this techniques for learning knowledge
form data, Data mining was a marriage between technologies developed in the database and machine learning
communities and the data mining can be considered to be an inter-disciplinary field involving from the databases to
access in data to be a data mining concept as learning algorithms, database technology, statistics ,mathematics,
clustering and visualization among others. It is existing real world data rather than data generated particularly for the
learning tasks [1]. In data mining the data sets are large therefore efficiency and scalability of algorithms is
important. A convention definition of KDD is given as follows:-data mining is the non trivial extraction of implicit
previously unknown and potentially useful information about data [2].These technology is to provides a user
oriented approach to novel and hidden patterns in the data.

1.2. Heart Disease
Heart disease or heart disease is the class of diseases that involve the heart or blood vessels(arteries and veins).now
today most countries face high and increasing rate of heart disease and it is become leading cause of debilitation and
death in the Worldwide for men and women. The heart disease is affected by the men and women for the age of fifty
to fifty-five years of the people to attack the heart disease.Harmonize to the World Health Organization report global
atlas on heart disease prevention and control states that Heart Disease(CVDs) are the leading causes of death and
disability in the world


A piece of information for heart diseases:



The number one cause of death globally as for CNDs;



They more die on every year from this disease and any other cause[3]. And an estimated for 20% to 25%
million people died from CVDs in 2011, representing 35% to 40% of all global death.

1
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org



In another diseases to estimate 7.3 million were due to coronary disease then 6.2 million were due to
stroke [4].



A large number of people who die from heart diseases and mainly from heart disease and stroke, will
increase to reach up to 23.3 to 30 million by 2030[5], CVDs are projected to remain the single leading
cause of death and most heart diseases can be prevented by addressing risk factors.



An 9.4 million deaths each year or 16.5% of all deaths can be attributed to high blood pressure[6]. It
includes 51% of deaths due to stroke and 45% of deaths to coronary heart disease and other 4% of deaths
due to other diseases and other problems of human body.



It includes 16.5% of death due to this heart disease in overall India. in the every year death percentage are
increasing the all over countries.



Heart diseases affected for more men and wemon and responsible for more then 40% of all deaths due to
united state.



It includes 2.7% of death due to heartdisease in overall countries.

Hazards of heart disease:
Hazards for heart disease as follows


Smoking.



High blood pressure.



Cholesterol.



Obesity.



Diabetes



Brith control pills.

Existing systems are implemented with the SVM, Rough Set Techniques, Association Rule Mining used on the
previous section to compare with this algorithm. Data mining could help in the identification or the prediction of
high and low level of accuracy and provides the accurate results.Two techniques are implemented with classification
tree, support vector mechanism to use on the previous paper. We proposed algorithms are implements with the
classification tree, CN2 Rule, SOM and K-Means clustering to compared with this algorithm.

2. Related works
A heart disease is to built with the aid of data mining techniques like Support Vector Mechanism, Decision Tree was
proposed to IJITEE (2012), they used on datasets in the heart disease. The heart disease using with datasets as for
more than 250 data and above will be using the databases. There be an 8 to 14 attributes are used and such as age,
sex, chest pain, cholesterol, fast blood pressure and etc…In the previous work on this paper, the result shows that
SVM gave the lowest classification accuracy of 77.78% of higher and using the dataset taken from the UCI machine
learning repository and the experimental result showed a correct classification accuracy of 77.56% with KNN. The
analysis of a paper is represent with the K-Means Clustering, CN2 Rule, SOM, visualization, and unsupervised
algorithms are used.
Python tool is used to classify the data and the data is evaluated using 2-fold cross validation and the results are
compared, Python tool is a data mining suite built around GUI algorithms and the main purpose of this tool is given

2
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org

to researchers and students an easy to use data mining software as to allowing to analysis either real or synthetic
data.
To perform with the training dataset consists of 303 instances with 14 different attributes and data set is divided into
two parts that is 80% of data is a training data and 20% of data is testing and the result, it is clear that classification
accuracy of CN2 Rule, SOM algorithm is better compared to the algorithms. Heart disease and predicting system
using data mining techniques as distribution diagram, CN2 Rule and SOM is implemented in [7], as a data source of
data in a total number of records with the number of attributes from the heart disease in a database.CN2 Rule
appears to be most effective as it has highest percentage of correct predictions with heart disease and classification
trees, to be compared with the other models. In the K-Means clustering a large number of variables K-Means may be
computationally faster than hierarchical clustering, and may be produce tighter cluster than hierarchical clustering
and especially the cluster are globular [9].

3. Methodology
3.1. Classification Tree and Classification Algorithm
Heart disease or heart disease or coronary heart disease or ischemic heart disease [8] is a broad term that can refer to
any condition that affected the heart disease. Today, they have several studies to applying different techniques to
given problem and achieved high classification accuracies of 93.73% or higher and using the dataset taken from the
UCI machine learning repository. Then, they have highest percentage of correct classification accuracy is also
archived by 92.94%,86.14%,93.73% for the various algorithms produces these accuracy of high. The heart disease
affects people of all income levels as[10,11,12].

3.2. The K-Means Algorithms:
It is a simple iterative method to partition a given dataset into a specified large sum of clusters; K this algorithm has
been discovered by several researchers across the different disciplines. K-Means algorithms is one of the most
commonly used partitioning clustering algorithms, as the “K” in it is name refer to the fact that the algorithm looks
for a fixed number of clusters in terms of proximity of data points to each other. In this technique is iteration using
with two-dimensional flow charts. The algorithm is usually handling mony more to type of elements. The points of
corresponding to two elements of (x1, x2), the points correspond to n elements of the vectors (x1, x2… xn).

K-means algorithms
Where x, y is a set of two elements in the cluster and d(x,y) denote the min-distance between the two elements of x
and x.
 Complete linkage clustering
Its calculate the maximum distance between the large set of clusters. Formula as

….. (1)
where x, y is a set of two elements in the cluster and d(x, y) denote the max-distance between the two elements of x
and x.
Its calculate the minimum distance between the large set of clusters. Formula as

….. (2)
where x, y is a set of two elements in the cluster and d(x, y) denote the min-distance between the two elements of x
and x.

Euclidean distance
Euclidean distance are calculate the equation (3) will be used as

∑

‖

‖ ……….. (3)

where x, y is a set of two elements in the cluster of dis(x,y) and i=1 is a counter value of cluster
Presently, the feature vectors are grouped into k means clusters using a selected distance measure such as Euclidean
distance may be calculated .
The K-means and related algorithms gloss over the selection of K, there is no a priori reason to select a particular
value and there is really an outermost loop to these algorithms that occurs during analysis rather than in the
computer program [6]. The K-means clustering algorithm is the simplest and most commonly used algorithm it is

3
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org

very sensitive to noise and original data points. Because a small number of data, such data can substantially
influence the mean value [13].

3.3. SOM Algorithm
SOM or SOFM is a type of artificial neural network that is trained using unsupervised learning to produce a typical
two dimensional(low-dimensional),discredited representation of the input space. SOMs useful for visualizing low
dimensional views to high dimensional view of data to akin the multidimensional scaling data. In this algorithm is
used for the analysis of heart disease using with data mining techniques in the unsupervised learning, it consists of
components called nodes or neurons. An associated with each node is a weight vector of the same dimension as the
input data vector and position in the vector space [14]. The nodes are arrangement is a two dimensional regular
space in a hexagonal or rectangular grid, SOMs describes a mapping from a high dimensional input space to a lower
dimensional map space. This type of network structure as related to feed forward networks where the nodes are
visualized as being attached and this type of architecture is fundamentally different in arrangement and motivation.

Figure 1: Distribution of SOMs visualizing map

3.4. CN2 Rule
CN2 rule to uses the likelihood ratio statistic (to developed by Clark and Niblett in year 1989) that measures the
difference between the class probability distribution in the training data sets, the CN2 rule is used for two levels of
controls are:


Order list



Unordered list

A default rule is to providing for majority class assignment as the final rule in the induced rule set. An ordered list of
rules the procedure looks for the most accurate rule in the current set of training data and rule predicts the most
frequent class in the set of covered data, CN2 rule finding the same rule again the all data sets are removed before a
new iteration is started at a top level. Until all the data’s are covered or no significant rule can be found, in
unordered list of data control procedure is iterate and inducing rules for each class in turn of only covered data
belonging to that class are removed, instead of removing all covered data[15].
CN2 rule are induced by the form ‘if <complex> then predicting <class>’ where <complex> as the definition of the
CN2 rule, it is iterative fashion of each iteration searching for a complex that covers a large number of data’s of
single class C and few of other classes. It must be both predictive and reliable to determined by CN2’s evaluation
function and CN2 uses the simple method of replacing unknown values with the most commonly occurring value for
that attribute in the training data.

Classification Tree Viewer:
4
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org

Figure2: to identification of heart disease using with classification tree .

Figure3: attribute selection sex in the heart disease in a human.

3.5. Experiments and Classification Accuracy Table:

Table1: compared with various algorithms and distribute the classification accuracy table on heart disease datasets.
Table 1, we can calculate the classification accuracy in the original dataset as used with the heart disease in a data
base and find out the accuracy equation as

∑

……………. (4).

wherex1, x2 is set elements in the positive prediction and negative prediction as the dataset in a heart disease and
i=1 is a counter value of data in table and xiis a original data in the table.

3.6. Classification accuracy chart
5
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org

Figure 4: chart represented with classification accuracy of heart disease and various algorithms are implementd .
4. Conclusion:
In this paper , the goal was to design a predictive model for heart disease detection using data mining techniques
from analysis the report for the classification accuracy among these data mining techniques has discussed, the result
shows the difference in error rates. It is used with differences in different techniques as using to the classification
tree and CN2 Rule to perform classification and more accurately than the other methods. The result of classification
accuracy for 93.73% and missing error data as 6.27% or the incorrect data to the heart disease.
In the future work, we will try to increase the more accuracy for heart disease patient by increasing the different
parameters and will be compared with different algorithms using with data mining techniques.

References
[1] data mining concept:-cgi.di.uao.gr/~rouvas/research/white.html.
[2] Frawley and Piatetsky-Shapiro, 1996. Knowledge Discovery in Databases: An Overview. The AAAI/MIT Press,
Menlo Park, C.A.
[3] Global status report on noncommunicabledisaeses 2010. Geneva, World Health Organization, 2011.
[4] Global atlas on heart disease prevention and control. Geneva, World Health Organization, 2011.
[5] Mathers CD, Loncar D. Projections of global mortality and burden of disease from 2002 to 2030. PLoS Med,
2006, 3(11):e442.
[6] Lim SS, Vos T, Flaxman AD, Danaei G, Shibuya K, Adair-Rohani H et al. A comparative risk assessment of
burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990-2010: a
systematic analysis for the Global Burden of Disease Study 2010. Lancet, 2012, 380(9859):2224–2260.
[7] sellappapalaniapparafiahswang,intelligent heart disease prediction system using data mining techniques
,IJCSNS,vol 8 no 8 aug 2011.
[8] BalaSundar V, Bharathiar, ―Development of a Data Clustering Algorithm for Predicting Heart‖ International
Journal of Computer Applications (0975 – 888) Volume 48– No.7, June 2012
[9] http://heart-disease.emedtv.com/ coronary-artery-disease/coronary-artery-disease.html
[10] R. Gupta, V. P. Gupta, and N. S. Ahluwalia, ―Educational status, coronary heart disease, and coronary risk
factor prevalence in a rural population of India‖, BMJ. pp 1332–1336, 19 November 2012.
[11] A Mathavan, MD, A Chockalingam, PhD, S Chockalingam, BSc, B Bilchik, MD, and V Saini, MD,
―Madurai Area Physicians Heart Health Evaluation Survey (MAPCHES) – an alarming status‖, The Canadian
Journal of Cardiology; 25(5): 303–308, May 2009.

6
Network and Complex Systems
ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)
Vol.3, No.10, 2013

www.iiste.org

[12] Rajeswari K, Vaithiyanathan V, P. Amirtharaj ―Application of Decision Tree Classifiers in Diagnosing Heart
Disease using Demographic Data‖ American Journal of Scientific research ISSN 2301-2005 pp. 77-82 EuroJournals
Publishing, 2012.

[13] Mrs. Bharati M. Ramageri, ―Data Mining Techniques And Applications‖,Bharati M. Ramageri / Indian
Journal of Computer Science and Engineering Vol. 1 No. 4 301-305
[14] ^ a b Ultsch, Alfred (2003); U*-Matrix: A tool to visualize clusters in high dimensional data, Department of
Computer Science, University of Marburg, Technical Report Nr. 36:1-12
[15] D. Gamberger, N. Lavraˇc, F. ˇ Zelezn´y, and J. Tolar. Induction of comprehensible models for geneexpression
datasets by subgroup discovery methodology. JrBiomedical Informatics, 37(5):269–284, 2004

7
This academic article was published by The International Institute for Science,
Technology and Education (IISTE). The IISTE is a pioneer in the Open Access
Publishing service based in the U.S. and Europe. The aim of the institute is
Accelerating Global Knowledge Sharing.
More information about the publisher can be found in the IISTE’s homepage:
http://www.iiste.org
CALL FOR JOURNAL PAPERS
The IISTE is currently hosting more than 30 peer-reviewed academic journals and
collaborating with academic institutions around the world. There’s no deadline for
submission. Prospective authors of IISTE journals can find the submission
instruction on the following page: http://www.iiste.org/journals/
The IISTE
editorial team promises to the review and publish all the qualified submissions in a
fast manner. All the journals articles are available online to the readers all over the
world without financial, legal, or technical barriers other than those inseparable from
gaining access to the internet itself. Printed version of the journals is also available
upon request of readers and authors.
MORE RESOURCES
Book publication information: http://www.iiste.org/book/
Recent conferences: http://www.iiste.org/conference/
IISTE Knowledge Sharing Partners
EBSCO, Index Copernicus, Ulrich's Periodicals Directory, JournalTOCS, PKP Open
Archives Harvester, Bielefeld Academic Search Engine, Elektronische
Zeitschriftenbibliothek EZB, Open J-Gate, OCLC WorldCat, Universe Digtial
Library , NewJour, Google Scholar

Weitere ähnliche Inhalte

Was ist angesagt?

Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
Sivagowry Shathesh
 

Was ist angesagt? (20)

Heart Disease Prediction using Machine Learning Algorithm
Heart Disease Prediction using Machine Learning AlgorithmHeart Disease Prediction using Machine Learning Algorithm
Heart Disease Prediction using Machine Learning Algorithm
 
prediction of heart disease using machine learning algorithms
prediction of heart disease using machine learning algorithmsprediction of heart disease using machine learning algorithms
prediction of heart disease using machine learning algorithms
 
Chronic Kidney Disease Prediction
Chronic Kidney Disease PredictionChronic Kidney Disease Prediction
Chronic Kidney Disease Prediction
 
Psdot 14 using data mining techniques in heart
Psdot 14 using data mining techniques in heartPsdot 14 using data mining techniques in heart
Psdot 14 using data mining techniques in heart
 
Using AI to Predict Strokes
Using AI to Predict StrokesUsing AI to Predict Strokes
Using AI to Predict Strokes
 
Heart disease prediction
Heart disease predictionHeart disease prediction
Heart disease prediction
 
IRJET- Heart Disease Prediction System
IRJET- Heart Disease Prediction SystemIRJET- Heart Disease Prediction System
IRJET- Heart Disease Prediction System
 
A Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regression
 
08. 9804 11737-1-rv edit dhyan
08. 9804 11737-1-rv edit dhyan08. 9804 11737-1-rv edit dhyan
08. 9804 11737-1-rv edit dhyan
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
 
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
 
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBaseA Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
 
Heart disease prediction system
Heart disease prediction systemHeart disease prediction system
Heart disease prediction system
 
A novel methodology for diagnosing the heart disease using fuzzy database
A novel methodology for diagnosing the heart disease using fuzzy databaseA novel methodology for diagnosing the heart disease using fuzzy database
A novel methodology for diagnosing the heart disease using fuzzy database
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
 
Data mining techniques on heart failure diagnosis
Data mining techniques on heart failure diagnosisData mining techniques on heart failure diagnosis
Data mining techniques on heart failure diagnosis
 
Health care analytics
Health care analyticsHealth care analytics
Health care analytics
 
Prediction of Heart Disease Using Data Mining Techniques- A Review
Prediction of Heart Disease Using Data Mining Techniques- A ReviewPrediction of Heart Disease Using Data Mining Techniques- A Review
Prediction of Heart Disease Using Data Mining Techniques- A Review
 
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET-  	  Role of Different Data Mining Techniques for Predicting Heart DiseaseIRJET-  	  Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
 

Ähnlich wie A comparative study of cn2 rule and svm algorithm

An automatic heart disease prediction using cluster-based bidirectional LSTM ...
An automatic heart disease prediction using cluster-based bidirectional LSTM ...An automatic heart disease prediction using cluster-based bidirectional LSTM ...
An automatic heart disease prediction using cluster-based bidirectional LSTM ...
BASMAJUMAASALEHALMOH
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
IAEME Publication
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
IAEME Publication
 
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCAREK-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
International Journal of Technical Research & Application
 
Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397
Editor IJARCET
 
V5_I2_2016_Paper11.docx
V5_I2_2016_Paper11.docxV5_I2_2016_Paper11.docx
V5_I2_2016_Paper11.docx
IIRindia
 

Ähnlich wie A comparative study of cn2 rule and svm algorithm (20)

A hybrid model for heart disease prediction using recurrent neural network an...
A hybrid model for heart disease prediction using recurrent neural network an...A hybrid model for heart disease prediction using recurrent neural network an...
A hybrid model for heart disease prediction using recurrent neural network an...
 
Hybrid CNN and LSTM Network For Heart Disease Prediction
Hybrid CNN and LSTM Network For Heart Disease PredictionHybrid CNN and LSTM Network For Heart Disease Prediction
Hybrid CNN and LSTM Network For Heart Disease Prediction
 
Comparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease Prediction
 
An automatic heart disease prediction using cluster-based bidirectional LSTM ...
An automatic heart disease prediction using cluster-based bidirectional LSTM ...An automatic heart disease prediction using cluster-based bidirectional LSTM ...
An automatic heart disease prediction using cluster-based bidirectional LSTM ...
 
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
 
IRJET - Cloud based Enhanced Cardiac Disease Prediction using Naïve Bayesian ...
IRJET - Cloud based Enhanced Cardiac Disease Prediction using Naïve Bayesian ...IRJET - Cloud based Enhanced Cardiac Disease Prediction using Naïve Bayesian ...
IRJET - Cloud based Enhanced Cardiac Disease Prediction using Naïve Bayesian ...
 
Heart Attack Prediction System Using Fuzzy C Means Classifier
Heart Attack Prediction System Using Fuzzy C Means ClassifierHeart Attack Prediction System Using Fuzzy C Means Classifier
Heart Attack Prediction System Using Fuzzy C Means Classifier
 
IRJET - Comparative Study of Cardiovascular Disease Detection Algorithms
IRJET - Comparative Study of Cardiovascular Disease Detection AlgorithmsIRJET - Comparative Study of Cardiovascular Disease Detection Algorithms
IRJET - Comparative Study of Cardiovascular Disease Detection Algorithms
 
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
 
A STUDY OF THE LITERATURE ON CARDIOVASCULAR DISEASE PREDICTION METHODS
A STUDY OF THE LITERATURE ON CARDIOVASCULAR DISEASE PREDICTION METHODSA STUDY OF THE LITERATURE ON CARDIOVASCULAR DISEASE PREDICTION METHODS
A STUDY OF THE LITERATURE ON CARDIOVASCULAR DISEASE PREDICTION METHODS
 
HEART DISEASE PREDICTION RANDOM FOREST ALGORITHMS
HEART DISEASE PREDICTION RANDOM FOREST ALGORITHMSHEART DISEASE PREDICTION RANDOM FOREST ALGORITHMS
HEART DISEASE PREDICTION RANDOM FOREST ALGORITHMS
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
 
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCAREK-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
 
A hybrid approach to medical decision-making: diagnosis of heart disease wit...
A hybrid approach to medical decision-making: diagnosis of  heart disease wit...A hybrid approach to medical decision-making: diagnosis of  heart disease wit...
A hybrid approach to medical decision-making: diagnosis of heart disease wit...
 
Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397
 
Heart disease classification using Random Forest
Heart disease classification using Random ForestHeart disease classification using Random Forest
Heart disease classification using Random Forest
 
IRJET- A Survey on Prediction of Heart Disease Presence using Data Mining and...
IRJET- A Survey on Prediction of Heart Disease Presence using Data Mining and...IRJET- A Survey on Prediction of Heart Disease Presence using Data Mining and...
IRJET- A Survey on Prediction of Heart Disease Presence using Data Mining and...
 
Machine learning approach for predicting heart and diabetes diseases using da...
Machine learning approach for predicting heart and diabetes diseases using da...Machine learning approach for predicting heart and diabetes diseases using da...
Machine learning approach for predicting heart and diabetes diseases using da...
 
V5_I2_2016_Paper11.docx
V5_I2_2016_Paper11.docxV5_I2_2016_Paper11.docx
V5_I2_2016_Paper11.docx
 

Mehr von Alexander Decker

Abnormalities of hormones and inflammatory cytokines in women affected with p...
Abnormalities of hormones and inflammatory cytokines in women affected with p...Abnormalities of hormones and inflammatory cytokines in women affected with p...
Abnormalities of hormones and inflammatory cytokines in women affected with p...
Alexander Decker
 
A usability evaluation framework for b2 c e commerce websites
A usability evaluation framework for b2 c e commerce websitesA usability evaluation framework for b2 c e commerce websites
A usability evaluation framework for b2 c e commerce websites
Alexander Decker
 
A universal model for managing the marketing executives in nigerian banks
A universal model for managing the marketing executives in nigerian banksA universal model for managing the marketing executives in nigerian banks
A universal model for managing the marketing executives in nigerian banks
Alexander Decker
 
A unique common fixed point theorems in generalized d
A unique common fixed point theorems in generalized dA unique common fixed point theorems in generalized d
A unique common fixed point theorems in generalized d
Alexander Decker
 
A trends of salmonella and antibiotic resistance
A trends of salmonella and antibiotic resistanceA trends of salmonella and antibiotic resistance
A trends of salmonella and antibiotic resistance
Alexander Decker
 
A transformational generative approach towards understanding al-istifham
A transformational  generative approach towards understanding al-istifhamA transformational  generative approach towards understanding al-istifham
A transformational generative approach towards understanding al-istifham
Alexander Decker
 
A time series analysis of the determinants of savings in namibia
A time series analysis of the determinants of savings in namibiaA time series analysis of the determinants of savings in namibia
A time series analysis of the determinants of savings in namibia
Alexander Decker
 
A therapy for physical and mental fitness of school children
A therapy for physical and mental fitness of school childrenA therapy for physical and mental fitness of school children
A therapy for physical and mental fitness of school children
Alexander Decker
 
A theory of efficiency for managing the marketing executives in nigerian banks
A theory of efficiency for managing the marketing executives in nigerian banksA theory of efficiency for managing the marketing executives in nigerian banks
A theory of efficiency for managing the marketing executives in nigerian banks
Alexander Decker
 
A systematic evaluation of link budget for
A systematic evaluation of link budget forA systematic evaluation of link budget for
A systematic evaluation of link budget for
Alexander Decker
 
A synthetic review of contraceptive supplies in punjab
A synthetic review of contraceptive supplies in punjabA synthetic review of contraceptive supplies in punjab
A synthetic review of contraceptive supplies in punjab
Alexander Decker
 
A synthesis of taylor’s and fayol’s management approaches for managing market...
A synthesis of taylor’s and fayol’s management approaches for managing market...A synthesis of taylor’s and fayol’s management approaches for managing market...
A synthesis of taylor’s and fayol’s management approaches for managing market...
Alexander Decker
 
A survey paper on sequence pattern mining with incremental
A survey paper on sequence pattern mining with incrementalA survey paper on sequence pattern mining with incremental
A survey paper on sequence pattern mining with incremental
Alexander Decker
 
A survey on live virtual machine migrations and its techniques
A survey on live virtual machine migrations and its techniquesA survey on live virtual machine migrations and its techniques
A survey on live virtual machine migrations and its techniques
Alexander Decker
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo db
Alexander Decker
 
A survey on challenges to the media cloud
A survey on challenges to the media cloudA survey on challenges to the media cloud
A survey on challenges to the media cloud
Alexander Decker
 
A survey of provenance leveraged
A survey of provenance leveragedA survey of provenance leveraged
A survey of provenance leveraged
Alexander Decker
 
A survey of private equity investments in kenya
A survey of private equity investments in kenyaA survey of private equity investments in kenya
A survey of private equity investments in kenya
Alexander Decker
 
A study to measures the financial health of
A study to measures the financial health ofA study to measures the financial health of
A study to measures the financial health of
Alexander Decker
 

Mehr von Alexander Decker (20)

Abnormalities of hormones and inflammatory cytokines in women affected with p...
Abnormalities of hormones and inflammatory cytokines in women affected with p...Abnormalities of hormones and inflammatory cytokines in women affected with p...
Abnormalities of hormones and inflammatory cytokines in women affected with p...
 
A validation of the adverse childhood experiences scale in
A validation of the adverse childhood experiences scale inA validation of the adverse childhood experiences scale in
A validation of the adverse childhood experiences scale in
 
A usability evaluation framework for b2 c e commerce websites
A usability evaluation framework for b2 c e commerce websitesA usability evaluation framework for b2 c e commerce websites
A usability evaluation framework for b2 c e commerce websites
 
A universal model for managing the marketing executives in nigerian banks
A universal model for managing the marketing executives in nigerian banksA universal model for managing the marketing executives in nigerian banks
A universal model for managing the marketing executives in nigerian banks
 
A unique common fixed point theorems in generalized d
A unique common fixed point theorems in generalized dA unique common fixed point theorems in generalized d
A unique common fixed point theorems in generalized d
 
A trends of salmonella and antibiotic resistance
A trends of salmonella and antibiotic resistanceA trends of salmonella and antibiotic resistance
A trends of salmonella and antibiotic resistance
 
A transformational generative approach towards understanding al-istifham
A transformational  generative approach towards understanding al-istifhamA transformational  generative approach towards understanding al-istifham
A transformational generative approach towards understanding al-istifham
 
A time series analysis of the determinants of savings in namibia
A time series analysis of the determinants of savings in namibiaA time series analysis of the determinants of savings in namibia
A time series analysis of the determinants of savings in namibia
 
A therapy for physical and mental fitness of school children
A therapy for physical and mental fitness of school childrenA therapy for physical and mental fitness of school children
A therapy for physical and mental fitness of school children
 
A theory of efficiency for managing the marketing executives in nigerian banks
A theory of efficiency for managing the marketing executives in nigerian banksA theory of efficiency for managing the marketing executives in nigerian banks
A theory of efficiency for managing the marketing executives in nigerian banks
 
A systematic evaluation of link budget for
A systematic evaluation of link budget forA systematic evaluation of link budget for
A systematic evaluation of link budget for
 
A synthetic review of contraceptive supplies in punjab
A synthetic review of contraceptive supplies in punjabA synthetic review of contraceptive supplies in punjab
A synthetic review of contraceptive supplies in punjab
 
A synthesis of taylor’s and fayol’s management approaches for managing market...
A synthesis of taylor’s and fayol’s management approaches for managing market...A synthesis of taylor’s and fayol’s management approaches for managing market...
A synthesis of taylor’s and fayol’s management approaches for managing market...
 
A survey paper on sequence pattern mining with incremental
A survey paper on sequence pattern mining with incrementalA survey paper on sequence pattern mining with incremental
A survey paper on sequence pattern mining with incremental
 
A survey on live virtual machine migrations and its techniques
A survey on live virtual machine migrations and its techniquesA survey on live virtual machine migrations and its techniques
A survey on live virtual machine migrations and its techniques
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo db
 
A survey on challenges to the media cloud
A survey on challenges to the media cloudA survey on challenges to the media cloud
A survey on challenges to the media cloud
 
A survey of provenance leveraged
A survey of provenance leveragedA survey of provenance leveraged
A survey of provenance leveraged
 
A survey of private equity investments in kenya
A survey of private equity investments in kenyaA survey of private equity investments in kenya
A survey of private equity investments in kenya
 
A study to measures the financial health of
A study to measures the financial health ofA study to measures the financial health of
A study to measures the financial health of
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

A comparative study of cn2 rule and svm algorithm

  • 1. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org A Comparative Study of CN2 Rule and SVM Algorithm and Prediction of Heart Disease Datasets Using Clustering Algorithms Ramaraj.M*1, Dr.Antony Selvadoss Thanamani*2 1 Research Scholar Department of Computer Science NGM College Pollachi India ramaraj302@gmail.com. 2 Associate Professor & Head Department of Computer Science NGM College Pollachi India selvadoss@yahoo.com Abstract In this paper, we discuss diagnosis analysis and identification of heart disease using with data mining techniques. The heart disease is a major cause of morbidity and mortality in modern society; it is extremely important but complicated task that should be performed accurately and efficiently. It is an huge amount data of leads medical data to the need for powerful data analysis tools are availability on the data mining technique. They have long to been an concerned with applying for statistical and data mining tools and data mining techniques to improve data analysis on large datasets. In this paper, to proposed system are implemented to find out the heart disease through as to compared with the some data mining techniques are Decision tree, SOM, CN2 Rule and K-Means Clustering the data mining could help in the identification or the prediction of high or low risk of Heart Disease. Keywords: Data Mining, Heart Disease, Decision tree, SOM, CN2 Rule and cluster. 1. Introduction 1.1. Overview of data mining The data mining concern of database technologists was to find efficient method of storing, retrieving and falsify data, The main concern of the machine learning community was to develop this techniques for learning knowledge form data, Data mining was a marriage between technologies developed in the database and machine learning communities and the data mining can be considered to be an inter-disciplinary field involving from the databases to access in data to be a data mining concept as learning algorithms, database technology, statistics ,mathematics, clustering and visualization among others. It is existing real world data rather than data generated particularly for the learning tasks [1]. In data mining the data sets are large therefore efficiency and scalability of algorithms is important. A convention definition of KDD is given as follows:-data mining is the non trivial extraction of implicit previously unknown and potentially useful information about data [2].These technology is to provides a user oriented approach to novel and hidden patterns in the data. 1.2. Heart Disease Heart disease or heart disease is the class of diseases that involve the heart or blood vessels(arteries and veins).now today most countries face high and increasing rate of heart disease and it is become leading cause of debilitation and death in the Worldwide for men and women. The heart disease is affected by the men and women for the age of fifty to fifty-five years of the people to attack the heart disease.Harmonize to the World Health Organization report global atlas on heart disease prevention and control states that Heart Disease(CVDs) are the leading causes of death and disability in the world  A piece of information for heart diseases:  The number one cause of death globally as for CNDs;  They more die on every year from this disease and any other cause[3]. And an estimated for 20% to 25% million people died from CVDs in 2011, representing 35% to 40% of all global death. 1
  • 2. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org  In another diseases to estimate 7.3 million were due to coronary disease then 6.2 million were due to stroke [4].  A large number of people who die from heart diseases and mainly from heart disease and stroke, will increase to reach up to 23.3 to 30 million by 2030[5], CVDs are projected to remain the single leading cause of death and most heart diseases can be prevented by addressing risk factors.  An 9.4 million deaths each year or 16.5% of all deaths can be attributed to high blood pressure[6]. It includes 51% of deaths due to stroke and 45% of deaths to coronary heart disease and other 4% of deaths due to other diseases and other problems of human body.  It includes 16.5% of death due to this heart disease in overall India. in the every year death percentage are increasing the all over countries.  Heart diseases affected for more men and wemon and responsible for more then 40% of all deaths due to united state.  It includes 2.7% of death due to heartdisease in overall countries. Hazards of heart disease: Hazards for heart disease as follows  Smoking.  High blood pressure.  Cholesterol.  Obesity.  Diabetes  Brith control pills. Existing systems are implemented with the SVM, Rough Set Techniques, Association Rule Mining used on the previous section to compare with this algorithm. Data mining could help in the identification or the prediction of high and low level of accuracy and provides the accurate results.Two techniques are implemented with classification tree, support vector mechanism to use on the previous paper. We proposed algorithms are implements with the classification tree, CN2 Rule, SOM and K-Means clustering to compared with this algorithm. 2. Related works A heart disease is to built with the aid of data mining techniques like Support Vector Mechanism, Decision Tree was proposed to IJITEE (2012), they used on datasets in the heart disease. The heart disease using with datasets as for more than 250 data and above will be using the databases. There be an 8 to 14 attributes are used and such as age, sex, chest pain, cholesterol, fast blood pressure and etc…In the previous work on this paper, the result shows that SVM gave the lowest classification accuracy of 77.78% of higher and using the dataset taken from the UCI machine learning repository and the experimental result showed a correct classification accuracy of 77.56% with KNN. The analysis of a paper is represent with the K-Means Clustering, CN2 Rule, SOM, visualization, and unsupervised algorithms are used. Python tool is used to classify the data and the data is evaluated using 2-fold cross validation and the results are compared, Python tool is a data mining suite built around GUI algorithms and the main purpose of this tool is given 2
  • 3. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org to researchers and students an easy to use data mining software as to allowing to analysis either real or synthetic data. To perform with the training dataset consists of 303 instances with 14 different attributes and data set is divided into two parts that is 80% of data is a training data and 20% of data is testing and the result, it is clear that classification accuracy of CN2 Rule, SOM algorithm is better compared to the algorithms. Heart disease and predicting system using data mining techniques as distribution diagram, CN2 Rule and SOM is implemented in [7], as a data source of data in a total number of records with the number of attributes from the heart disease in a database.CN2 Rule appears to be most effective as it has highest percentage of correct predictions with heart disease and classification trees, to be compared with the other models. In the K-Means clustering a large number of variables K-Means may be computationally faster than hierarchical clustering, and may be produce tighter cluster than hierarchical clustering and especially the cluster are globular [9]. 3. Methodology 3.1. Classification Tree and Classification Algorithm Heart disease or heart disease or coronary heart disease or ischemic heart disease [8] is a broad term that can refer to any condition that affected the heart disease. Today, they have several studies to applying different techniques to given problem and achieved high classification accuracies of 93.73% or higher and using the dataset taken from the UCI machine learning repository. Then, they have highest percentage of correct classification accuracy is also archived by 92.94%,86.14%,93.73% for the various algorithms produces these accuracy of high. The heart disease affects people of all income levels as[10,11,12]. 3.2. The K-Means Algorithms: It is a simple iterative method to partition a given dataset into a specified large sum of clusters; K this algorithm has been discovered by several researchers across the different disciplines. K-Means algorithms is one of the most commonly used partitioning clustering algorithms, as the “K” in it is name refer to the fact that the algorithm looks for a fixed number of clusters in terms of proximity of data points to each other. In this technique is iteration using with two-dimensional flow charts. The algorithm is usually handling mony more to type of elements. The points of corresponding to two elements of (x1, x2), the points correspond to n elements of the vectors (x1, x2… xn). K-means algorithms Where x, y is a set of two elements in the cluster and d(x,y) denote the min-distance between the two elements of x and x.  Complete linkage clustering Its calculate the maximum distance between the large set of clusters. Formula as ….. (1) where x, y is a set of two elements in the cluster and d(x, y) denote the max-distance between the two elements of x and x. Its calculate the minimum distance between the large set of clusters. Formula as ….. (2) where x, y is a set of two elements in the cluster and d(x, y) denote the min-distance between the two elements of x and x. Euclidean distance Euclidean distance are calculate the equation (3) will be used as ∑ ‖ ‖ ……….. (3) where x, y is a set of two elements in the cluster of dis(x,y) and i=1 is a counter value of cluster Presently, the feature vectors are grouped into k means clusters using a selected distance measure such as Euclidean distance may be calculated . The K-means and related algorithms gloss over the selection of K, there is no a priori reason to select a particular value and there is really an outermost loop to these algorithms that occurs during analysis rather than in the computer program [6]. The K-means clustering algorithm is the simplest and most commonly used algorithm it is 3
  • 4. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org very sensitive to noise and original data points. Because a small number of data, such data can substantially influence the mean value [13]. 3.3. SOM Algorithm SOM or SOFM is a type of artificial neural network that is trained using unsupervised learning to produce a typical two dimensional(low-dimensional),discredited representation of the input space. SOMs useful for visualizing low dimensional views to high dimensional view of data to akin the multidimensional scaling data. In this algorithm is used for the analysis of heart disease using with data mining techniques in the unsupervised learning, it consists of components called nodes or neurons. An associated with each node is a weight vector of the same dimension as the input data vector and position in the vector space [14]. The nodes are arrangement is a two dimensional regular space in a hexagonal or rectangular grid, SOMs describes a mapping from a high dimensional input space to a lower dimensional map space. This type of network structure as related to feed forward networks where the nodes are visualized as being attached and this type of architecture is fundamentally different in arrangement and motivation. Figure 1: Distribution of SOMs visualizing map 3.4. CN2 Rule CN2 rule to uses the likelihood ratio statistic (to developed by Clark and Niblett in year 1989) that measures the difference between the class probability distribution in the training data sets, the CN2 rule is used for two levels of controls are:  Order list  Unordered list A default rule is to providing for majority class assignment as the final rule in the induced rule set. An ordered list of rules the procedure looks for the most accurate rule in the current set of training data and rule predicts the most frequent class in the set of covered data, CN2 rule finding the same rule again the all data sets are removed before a new iteration is started at a top level. Until all the data’s are covered or no significant rule can be found, in unordered list of data control procedure is iterate and inducing rules for each class in turn of only covered data belonging to that class are removed, instead of removing all covered data[15]. CN2 rule are induced by the form ‘if <complex> then predicting <class>’ where <complex> as the definition of the CN2 rule, it is iterative fashion of each iteration searching for a complex that covers a large number of data’s of single class C and few of other classes. It must be both predictive and reliable to determined by CN2’s evaluation function and CN2 uses the simple method of replacing unknown values with the most commonly occurring value for that attribute in the training data. Classification Tree Viewer: 4
  • 5. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org Figure2: to identification of heart disease using with classification tree . Figure3: attribute selection sex in the heart disease in a human. 3.5. Experiments and Classification Accuracy Table: Table1: compared with various algorithms and distribute the classification accuracy table on heart disease datasets. Table 1, we can calculate the classification accuracy in the original dataset as used with the heart disease in a data base and find out the accuracy equation as ∑ ……………. (4). wherex1, x2 is set elements in the positive prediction and negative prediction as the dataset in a heart disease and i=1 is a counter value of data in table and xiis a original data in the table. 3.6. Classification accuracy chart 5
  • 6. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org Figure 4: chart represented with classification accuracy of heart disease and various algorithms are implementd . 4. Conclusion: In this paper , the goal was to design a predictive model for heart disease detection using data mining techniques from analysis the report for the classification accuracy among these data mining techniques has discussed, the result shows the difference in error rates. It is used with differences in different techniques as using to the classification tree and CN2 Rule to perform classification and more accurately than the other methods. The result of classification accuracy for 93.73% and missing error data as 6.27% or the incorrect data to the heart disease. In the future work, we will try to increase the more accuracy for heart disease patient by increasing the different parameters and will be compared with different algorithms using with data mining techniques. References [1] data mining concept:-cgi.di.uao.gr/~rouvas/research/white.html. [2] Frawley and Piatetsky-Shapiro, 1996. Knowledge Discovery in Databases: An Overview. The AAAI/MIT Press, Menlo Park, C.A. [3] Global status report on noncommunicabledisaeses 2010. Geneva, World Health Organization, 2011. [4] Global atlas on heart disease prevention and control. Geneva, World Health Organization, 2011. [5] Mathers CD, Loncar D. Projections of global mortality and burden of disease from 2002 to 2030. PLoS Med, 2006, 3(11):e442. [6] Lim SS, Vos T, Flaxman AD, Danaei G, Shibuya K, Adair-Rohani H et al. A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet, 2012, 380(9859):2224–2260. [7] sellappapalaniapparafiahswang,intelligent heart disease prediction system using data mining techniques ,IJCSNS,vol 8 no 8 aug 2011. [8] BalaSundar V, Bharathiar, ―Development of a Data Clustering Algorithm for Predicting Heart‖ International Journal of Computer Applications (0975 – 888) Volume 48– No.7, June 2012 [9] http://heart-disease.emedtv.com/ coronary-artery-disease/coronary-artery-disease.html [10] R. Gupta, V. P. Gupta, and N. S. Ahluwalia, ―Educational status, coronary heart disease, and coronary risk factor prevalence in a rural population of India‖, BMJ. pp 1332–1336, 19 November 2012. [11] A Mathavan, MD, A Chockalingam, PhD, S Chockalingam, BSc, B Bilchik, MD, and V Saini, MD, ―Madurai Area Physicians Heart Health Evaluation Survey (MAPCHES) – an alarming status‖, The Canadian Journal of Cardiology; 25(5): 303–308, May 2009. 6
  • 7. Network and Complex Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online) Vol.3, No.10, 2013 www.iiste.org [12] Rajeswari K, Vaithiyanathan V, P. Amirtharaj ―Application of Decision Tree Classifiers in Diagnosing Heart Disease using Demographic Data‖ American Journal of Scientific research ISSN 2301-2005 pp. 77-82 EuroJournals Publishing, 2012. [13] Mrs. Bharati M. Ramageri, ―Data Mining Techniques And Applications‖,Bharati M. Ramageri / Indian Journal of Computer Science and Engineering Vol. 1 No. 4 301-305 [14] ^ a b Ultsch, Alfred (2003); U*-Matrix: A tool to visualize clusters in high dimensional data, Department of Computer Science, University of Marburg, Technical Report Nr. 36:1-12 [15] D. Gamberger, N. Lavraˇc, F. ˇ Zelezn´y, and J. Tolar. Induction of comprehensible models for geneexpression datasets by subgroup discovery methodology. JrBiomedical Informatics, 37(5):269–284, 2004 7
  • 8. This academic article was published by The International Institute for Science, Technology and Education (IISTE). The IISTE is a pioneer in the Open Access Publishing service based in the U.S. and Europe. The aim of the institute is Accelerating Global Knowledge Sharing. More information about the publisher can be found in the IISTE’s homepage: http://www.iiste.org CALL FOR JOURNAL PAPERS The IISTE is currently hosting more than 30 peer-reviewed academic journals and collaborating with academic institutions around the world. There’s no deadline for submission. Prospective authors of IISTE journals can find the submission instruction on the following page: http://www.iiste.org/journals/ The IISTE editorial team promises to the review and publish all the qualified submissions in a fast manner. All the journals articles are available online to the readers all over the world without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. Printed version of the journals is also available upon request of readers and authors. MORE RESOURCES Book publication information: http://www.iiste.org/book/ Recent conferences: http://www.iiste.org/conference/ IISTE Knowledge Sharing Partners EBSCO, Index Copernicus, Ulrich's Periodicals Directory, JournalTOCS, PKP Open Archives Harvester, Bielefeld Academic Search Engine, Elektronische Zeitschriftenbibliothek EZB, Open J-Gate, OCLC WorldCat, Universe Digtial Library , NewJour, Google Scholar