SlideShare ist ein Scribd-Unternehmen logo
1 von 4
Downloaden Sie, um offline zu lesen
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1281
SURVEY ON ANALYSIS OF BREAST CANCER PREDICTION
Lingaraj N1, Krishna Kumar S1, Banu Priya K1, R.M. Shiny2
1UG Student, Department of Computer Science and Engineering, Agni College of Technology, Tamil Nadu, India
2Assistant Professor, Department of Computer Science and Engineering, Agni College of Technology,
Tamil Nadu, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Breast cancer is one of the major threats inmiddle
aged women throughout the world. In today’s world, this is
the second most threatening cause of death in women. Early
diagnosis can significantly reduce the chances of death. But it
is not an easy due to several uncertainties in detection.
Machine Learning techniques areusedtodevelop atoolwhich
is used as an effective mechanism for early detection and
diagnosis of breast cancer. Early diagnosis is helpful for
physicians which will greatly enhance the survival rate of
cancer patients. This paper compares three of the most
popular ML techniques which are used for breast cancer
detection and diagnosis. The techniques are Support Vector
Machine (SVM), Random Forest (RF) and Naive Bayes (NB).
The probability will be calculated for each of thesetechniques
and the algorithm with highest probability provides themore
accurate results.
Key Words: BreastCancer, MachineLearning, Detection,
Diagnosis, Prediction, Analysis.
1. INTRODUCTION
Cancer is a heterogeneous disease that may be divided
into many types. Accordingto WorldHealthOrganization[1],
twenty five percent of the females are diagnosed with
breasts cancer at some stage in their life. In UAE[2], forty
third of feminine cancer patients are diagnosed with breast
cancer. Accurately predicting a cancerous growth remainsa
difficult task for several physicians. The emergence of
latest medical technologies and therefore the monumental
quantity of patient information have impelled the
trail for the event of latest methods within the prediction
and detection of cancer. Recurrent breast cancer is the one
which comes back in the same or opposite breast or chest
wall after a time period when the cancer couldn't be
detected. Though information assessment that is collected
from the patient and a physicians intake greatly contributes
to the diagnostic method, supportive tools could be
superimposed to assist facilitate proper diagnoses. These
tools aim to eliminate possible diagnostic errors and
supply a quick method for analyzing the large chunks of
Data.
1.1 Machine Learning:
Machine Learning (ML), isa subfieldof Artificial Intelligence
(AI) that permits machines to learn without explicit
programming by exposing them to sets of
information permitting them to learn a selected task
through expertise. Over the previous few decades, machine
learning strategies have been widespread within
the development of predictive models so as to support
effective decision-making. In cancer analysis, these
techniques could be used to determine completely
different patterns during a information set and
consequently predict whether or not a cancer is malignant
or benign. The performance of such techniques will be
evaluated supported theaccuracyoftheclassification,recall,
precision, and therefore the space underneath ROC.
1.2 Data Preprocessing
Recently, data processing has become a well-liked
economical tool for data discovery and extracting hidden
patterns from massive datasets. It involves theemployment
of refined knowledge manipulationtools togetantecedently
unknown, valid patterns and relationships
in massive dataset. We tend to apply 3 robust data
processing classificationalgorithms i.e. SVM,Randomforest
and Naive Bayes, a medium sized knowledge set that
contained thirty five attributes and 198 cancer patient
data.
2. MACHINE LEARNING TECHNIQUES
The learning method in ML techniques are often divided
into 2 main categories, supervised and unsupervised
learning. In supervised learning, a set of data instances
are used to train the machine and are labeled to grant the
right result. However, in unsupervised learning, there are
no pre-determined knowledge sets and no notion of the
expected outcome, whichsuggests thatthegoal istougherto
achieve.
2.1. Support Vector Machine:
Support Vector Machine is one of the supervised machine
learning classification techniques that is widelyappliedin
the field of cancer identification and prognosis. SVM is
functioned by choosing the critical samples from all
categories. These samples are called support vectors.These
classes are separated by generating a linear function that
divides them broadly as possible using these support
vectors.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1282
Fig -1: SVM generated hyper-planes
2.2. Random Forest:
RF brings along many decision trees to ensemble a forestof
trees. The RF methodology relies on an
algorithmic approach within which each iteration involves
choosing one random sample of size N from the data set
with replacement, and another random sample from the
predictors without replacement. Then thedataobtained
is partitioned. The other data is then dropped. These steps
are repeated certain times depending on how
many trees are needed. Finally, a count is made over the
trees that classify the observation in one category.
Cases are then classified based on a majority vote over the
decision tree.
Fig -2: Working of Random Forest
2.3. Naive Bayes:
Naive Bayes is a subfield of probabilistic graphical models
which is used for prediction [3] and representation of
knowledge in uncertain domains. Naïve Bayes corresponds
to a widely used structurein machinelearningknownas the
directed acyclic graph (DAG). This graph consists of many
nodes, each equivalent to a variable and the node edges
represents direct dependence among the corresponding
nodes in the graph.
Fig -3: Breast cancer attributes – DAG Model
3. SIMULATION SETUP:
3.1. Data Set:
Our investigation is based on the initial Wisconsin breast
cancer data set that's obtained from the UCI Machine
Learning Repository [5] , an online open source repository.
This data set was collected over 3 years from the University
of Wisconsin Hospitals by Dr. William H. Wolberg and it
consists of 669 instances, where the classification results
are either malignant or benign. There were 458 benign
cases and 241 was malignant. The ten attributes used are: •
Clump Thickness • Cell Size Uniformity • Cell shape
Uniformity • Marginal Adhesion • Single epithelial cell Size•
Bare Nuclei • Bland chromatin • normal Nuclei • Mitoses
• class.
3.2. Training set:
The classifiers are tested using the k – fold cross validation
methodology. This validation technique can
randomly separate the training set into k
subsets where one of the k-1 subsets are used for
testing and the rest for training. 10-fold cross-validation is
the preferred k value utilized in most validation in ML and
will be used in this paper. This implies nine subsets will
be used for training of the classifier and the
remaining one for the testing. This system is used to avoid
over fitting of the training set, which is likely to occur in
small data sets and large number of attributes.
Fig -3: 10- k cross validation method
3.3. Simulation software:
Here, WEKA - Waikato Environment for Knowledge
Analysis [4] software is used as a Machine Learning tool.
WEKA is a Java based open supply tool that was initially
released to the public in 2006 undertheGNU General Public
License. This tool provides many ML techniques and
algorithms as well as the classification techniques that
are being investigated in this paper. Alternative features
include data preprocessing, clustering, feature selection
evaluation and rule discovery algorithms. Datasets are
accepted in many formats, like CSV and ARFF. Besides
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1283
being an open source tool, WEKA is additionally attractive
due to its portability and ease of use GUI.
4. RESULT:
This section describes the parameters and presents the
results that assist the 3 classifiers that are being
investigated in this paper.
4.1. Accuracy:
The classifier accuracy is a measure of how well the
classifier will properly predict cases into their correct
category. It is the number of correct predictions divided
by the total number of instances within the data set. It is
worth noting that the accuracy is highly dependent on the
threshold chosen by the classifier and may therefore
change for various testing sets. Hence, accuracy may
be calculated using the subsequent equation:
4.2. Recall:
Recall, also commonly referredtoas sensitivity, isthe rateof
the positive observations that are correctly predicted as
positive. This measure is desirable, particularly in
the medical field because how many of the
observations are properly diagnosed. In this study, it
is more important to properly identify a malignant
tumor than it is to incorrectly identify a benign one.
Table-1: Recall values comparison
4.3. Precision:
Precision, also commonly called confidence, is the rate
of both true positives and true negatives that are identified
as true positives. This shows how well the classifierhandles
the positive observations but doesn't say
much regarding the negative ones. The precision valuesfor
all 3 techniques are shown in Table-2:
Table-2: Precision values comparison
4.4. ROC area:
A receiver operating characteristics (ROC) graph is a way
to visualize a classifiers performance by showing thetrade-
off between the price and advantage of that
classifier. ROC is one amongst the foremost common and
useful performance measure fordata processingtechniques.
Percentage values for the roc area of all 3
techniques are shown in Table-3:
Table-3: ROC values
5. CONCLUSION
ML techniques are widely used in the medical field and
have served as a useful diagnostic tool that helps physicians
in analyzing the available data as well as designing medical
expert systems. This paper saysaboutthreeof the most
common ML techniques which are normally used
for breast cancer detection and diagnosis. They
are Support Vector Machine (SVM), Random Forest
(RF) and Naïve Bayes (NB). The main features and
methodology of each of the 3 ML techniques was
represented. Performance comparison of the
investigated techniques [6] has been applied using the
original Wisconsin breast cancerdata set.Simulation
results obtained has proved that classification
performance varies based on the strategy that is selected.
Results show that SVM have the highest performance in
terms of accuracy, specificity and precision. But, RF
has the highest probability of properly classifying tumor.
REFERENCES
1. World Health Organization - Breast Cancer:
Prevention and Control, 20 Jan 2015. http://
www.who.int/cancer/detection/breastcancer/en/i
nde x1.html.
2. “Breast cancerpresentation delaysamongAraband
national women in the UAE, a qualitative study,” Y.
Elobaid, T.-C. Aw, J. N. W. Lim, S. Hamid in Mar.
2016.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1284
3. Bayesian Node Analysis, S. Conrady and L. Jouffe,
2013.
4. Weka 3.5.6, an open source data mining software
tool developed at university of Waikato, New
Zealand, http://www.cs.waikato.ac.nz/ml/weka/
2009.
5. "Multisurface method of pattern separation for
medical diagnosis applied to breast cytology”,
Wolberg, William H., and Olvi L. Mangasarian.
6. "Analysis of feature selection algorithms on
classification: a survey", Vanaja, S., and K. Ramesh
Kumar. International Journal of Computer
Applications (2014).

Weitere ähnliche Inhalte

Was ist angesagt?

SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
Iganfis Data Mining Approach for Forecasting Cancer Threats
Iganfis Data Mining Approach for Forecasting Cancer ThreatsIganfis Data Mining Approach for Forecasting Cancer Threats
Iganfis Data Mining Approach for Forecasting Cancer Threats
ijsrd.com
 
A chi-square-SVM based pedagogical rule extraction method for microarray data...
A chi-square-SVM based pedagogical rule extraction method for microarray data...A chi-square-SVM based pedagogical rule extraction method for microarray data...
A chi-square-SVM based pedagogical rule extraction method for microarray data...
IJAAS Team
 
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Devansh16
 

Was ist angesagt? (20)

SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
Classification of Breast Cancer Diseases using Data Mining Techniques
Classification of Breast Cancer Diseases using Data Mining TechniquesClassification of Breast Cancer Diseases using Data Mining Techniques
Classification of Breast Cancer Diseases using Data Mining Techniques
 
Iganfis Data Mining Approach for Forecasting Cancer Threats
Iganfis Data Mining Approach for Forecasting Cancer ThreatsIganfis Data Mining Approach for Forecasting Cancer Threats
Iganfis Data Mining Approach for Forecasting Cancer Threats
 
Possibilistic Fuzzy C Means Algorithm For Mass classificaion In Digital Mammo...
Possibilistic Fuzzy C Means Algorithm For Mass classificaion In Digital Mammo...Possibilistic Fuzzy C Means Algorithm For Mass classificaion In Digital Mammo...
Possibilistic Fuzzy C Means Algorithm For Mass classificaion In Digital Mammo...
 
IRJET- Breast Cancer Disease Prediction : Using Machine Learning Approach
IRJET- Breast Cancer Disease Prediction : Using Machine Learning ApproachIRJET- Breast Cancer Disease Prediction : Using Machine Learning Approach
IRJET- Breast Cancer Disease Prediction : Using Machine Learning Approach
 
Hybrid prediction model with missing value imputation for medical data 2015-g...
Hybrid prediction model with missing value imputation for medical data 2015-g...Hybrid prediction model with missing value imputation for medical data 2015-g...
Hybrid prediction model with missing value imputation for medical data 2015-g...
 
Twin support vector machine using kernel function for colorectal cancer detec...
Twin support vector machine using kernel function for colorectal cancer detec...Twin support vector machine using kernel function for colorectal cancer detec...
Twin support vector machine using kernel function for colorectal cancer detec...
 
IRJET - A Survey on Machine Learning Intelligence Techniques for Medical ...
IRJET -  	  A Survey on Machine Learning Intelligence Techniques for Medical ...IRJET -  	  A Survey on Machine Learning Intelligence Techniques for Medical ...
IRJET - A Survey on Machine Learning Intelligence Techniques for Medical ...
 
Breast Cancer
Breast CancerBreast Cancer
Breast Cancer
 
A chi-square-SVM based pedagogical rule extraction method for microarray data...
A chi-square-SVM based pedagogical rule extraction method for microarray data...A chi-square-SVM based pedagogical rule extraction method for microarray data...
A chi-square-SVM based pedagogical rule extraction method for microarray data...
 
Machine Learning Based Approaches for Cancer Classification Using Gene Expres...
Machine Learning Based Approaches for Cancer Classification Using Gene Expres...Machine Learning Based Approaches for Cancer Classification Using Gene Expres...
Machine Learning Based Approaches for Cancer Classification Using Gene Expres...
 
Disease Identification and Detection in Apple Tree
Disease Identification and Detection in Apple TreeDisease Identification and Detection in Apple Tree
Disease Identification and Detection in Apple Tree
 
IRJET- A New Hybrid Squirrel Search Algorithm and Invasive Weed Optimization ...
IRJET- A New Hybrid Squirrel Search Algorithm and Invasive Weed Optimization ...IRJET- A New Hybrid Squirrel Search Algorithm and Invasive Weed Optimization ...
IRJET- A New Hybrid Squirrel Search Algorithm and Invasive Weed Optimization ...
 
Classification of medical datasets using back propagation neural network powe...
Classification of medical datasets using back propagation neural network powe...Classification of medical datasets using back propagation neural network powe...
Classification of medical datasets using back propagation neural network powe...
 
Disease prediction using machine learning
Disease prediction using machine learningDisease prediction using machine learning
Disease prediction using machine learning
 
xtremes
xtremesxtremes
xtremes
 
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
MLTDD : USE OF MACHINE LEARNING TECHNIQUES FOR DIAGNOSIS OF THYROID GLAND DIS...
 
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
 
The XNAT imaging informatics platform
The XNAT imaging informatics platformThe XNAT imaging informatics platform
The XNAT imaging informatics platform
 
IRJET - A Review on Identification and Disease Detection in Plants using Mach...
IRJET - A Review on Identification and Disease Detection in Plants using Mach...IRJET - A Review on Identification and Disease Detection in Plants using Mach...
IRJET - A Review on Identification and Disease Detection in Plants using Mach...
 

Ähnlich wie IRJET - Survey on Analysis of Breast Cancer Prediction

SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
ijscai
 

Ähnlich wie IRJET - Survey on Analysis of Breast Cancer Prediction (20)

Breast Cancer Detection Using Machine Learning
Breast Cancer Detection Using Machine LearningBreast Cancer Detection Using Machine Learning
Breast Cancer Detection Using Machine Learning
 
IRJET- Breast Cancer Prediction using Supervised Machine Learning Algorithms
IRJET- Breast Cancer Prediction using Supervised Machine Learning AlgorithmsIRJET- Breast Cancer Prediction using Supervised Machine Learning Algorithms
IRJET- Breast Cancer Prediction using Supervised Machine Learning Algorithms
 
A Comprehensive Survey On Predictive Analysis Of Breast Cancer
A Comprehensive Survey On Predictive Analysis Of Breast CancerA Comprehensive Survey On Predictive Analysis Of Breast Cancer
A Comprehensive Survey On Predictive Analysis Of Breast Cancer
 
Health Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep LearningHealth Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep Learning
 
Breast Cancer Prediction
Breast Cancer PredictionBreast Cancer Prediction
Breast Cancer Prediction
 
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
Simplified Knowledge Prediction: Application of Machine Learning in Real LifeSimplified Knowledge Prediction: Application of Machine Learning in Real Life
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
 
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
 
IRJET- Breast Cancer Prediction using Deep Learning
IRJET-  	  Breast Cancer Prediction using Deep LearningIRJET-  	  Breast Cancer Prediction using Deep Learning
IRJET- Breast Cancer Prediction using Deep Learning
 
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
IRJET- Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
IRJET-  	  Breast Cancer Relapse Prognosis by Classic and Modern Structures o...IRJET-  	  Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
IRJET- Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
 
Classification AlgorithmBased Analysis of Breast Cancer Data
Classification AlgorithmBased Analysis of Breast Cancer DataClassification AlgorithmBased Analysis of Breast Cancer Data
Classification AlgorithmBased Analysis of Breast Cancer Data
 
IRJET - Lung Disease Prediction using Image Processing and CNN Algorithm
IRJET -  	  Lung Disease Prediction using Image Processing and CNN AlgorithmIRJET -  	  Lung Disease Prediction using Image Processing and CNN Algorithm
IRJET - Lung Disease Prediction using Image Processing and CNN Algorithm
 
A SURVEY ON BLOOD DISEASE DETECTION USING MACHINE LEARNING
A SURVEY ON BLOOD DISEASE DETECTION USING MACHINE LEARNINGA SURVEY ON BLOOD DISEASE DETECTION USING MACHINE LEARNING
A SURVEY ON BLOOD DISEASE DETECTION USING MACHINE LEARNING
 
IRJET - Classification of Cancer Images using Deep Learning
IRJET -  	  Classification of Cancer Images using Deep LearningIRJET -  	  Classification of Cancer Images using Deep Learning
IRJET - Classification of Cancer Images using Deep Learning
 
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSISSEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
 
Skin Lesion Classification using Supervised Algorithm in Data Mining
Skin Lesion Classification using Supervised Algorithm in Data MiningSkin Lesion Classification using Supervised Algorithm in Data Mining
Skin Lesion Classification using Supervised Algorithm in Data Mining
 
Analysis of Machine Learning Techniques for Breast Cancer Prediction
Analysis of Machine Learning Techniques for Breast Cancer PredictionAnalysis of Machine Learning Techniques for Breast Cancer Prediction
Analysis of Machine Learning Techniques for Breast Cancer Prediction
 
Machine Learning Aided Breast Cancer Classification
Machine Learning Aided Breast Cancer ClassificationMachine Learning Aided Breast Cancer Classification
Machine Learning Aided Breast Cancer Classification
 

Mehr von IRJET Journal

Mehr von IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Kürzlich hochgeladen

FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
MsecMca
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 

Kürzlich hochgeladen (20)

FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 

IRJET - Survey on Analysis of Breast Cancer Prediction

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1281 SURVEY ON ANALYSIS OF BREAST CANCER PREDICTION Lingaraj N1, Krishna Kumar S1, Banu Priya K1, R.M. Shiny2 1UG Student, Department of Computer Science and Engineering, Agni College of Technology, Tamil Nadu, India 2Assistant Professor, Department of Computer Science and Engineering, Agni College of Technology, Tamil Nadu, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Breast cancer is one of the major threats inmiddle aged women throughout the world. In today’s world, this is the second most threatening cause of death in women. Early diagnosis can significantly reduce the chances of death. But it is not an easy due to several uncertainties in detection. Machine Learning techniques areusedtodevelop atoolwhich is used as an effective mechanism for early detection and diagnosis of breast cancer. Early diagnosis is helpful for physicians which will greatly enhance the survival rate of cancer patients. This paper compares three of the most popular ML techniques which are used for breast cancer detection and diagnosis. The techniques are Support Vector Machine (SVM), Random Forest (RF) and Naive Bayes (NB). The probability will be calculated for each of thesetechniques and the algorithm with highest probability provides themore accurate results. Key Words: BreastCancer, MachineLearning, Detection, Diagnosis, Prediction, Analysis. 1. INTRODUCTION Cancer is a heterogeneous disease that may be divided into many types. Accordingto WorldHealthOrganization[1], twenty five percent of the females are diagnosed with breasts cancer at some stage in their life. In UAE[2], forty third of feminine cancer patients are diagnosed with breast cancer. Accurately predicting a cancerous growth remainsa difficult task for several physicians. The emergence of latest medical technologies and therefore the monumental quantity of patient information have impelled the trail for the event of latest methods within the prediction and detection of cancer. Recurrent breast cancer is the one which comes back in the same or opposite breast or chest wall after a time period when the cancer couldn't be detected. Though information assessment that is collected from the patient and a physicians intake greatly contributes to the diagnostic method, supportive tools could be superimposed to assist facilitate proper diagnoses. These tools aim to eliminate possible diagnostic errors and supply a quick method for analyzing the large chunks of Data. 1.1 Machine Learning: Machine Learning (ML), isa subfieldof Artificial Intelligence (AI) that permits machines to learn without explicit programming by exposing them to sets of information permitting them to learn a selected task through expertise. Over the previous few decades, machine learning strategies have been widespread within the development of predictive models so as to support effective decision-making. In cancer analysis, these techniques could be used to determine completely different patterns during a information set and consequently predict whether or not a cancer is malignant or benign. The performance of such techniques will be evaluated supported theaccuracyoftheclassification,recall, precision, and therefore the space underneath ROC. 1.2 Data Preprocessing Recently, data processing has become a well-liked economical tool for data discovery and extracting hidden patterns from massive datasets. It involves theemployment of refined knowledge manipulationtools togetantecedently unknown, valid patterns and relationships in massive dataset. We tend to apply 3 robust data processing classificationalgorithms i.e. SVM,Randomforest and Naive Bayes, a medium sized knowledge set that contained thirty five attributes and 198 cancer patient data. 2. MACHINE LEARNING TECHNIQUES The learning method in ML techniques are often divided into 2 main categories, supervised and unsupervised learning. In supervised learning, a set of data instances are used to train the machine and are labeled to grant the right result. However, in unsupervised learning, there are no pre-determined knowledge sets and no notion of the expected outcome, whichsuggests thatthegoal istougherto achieve. 2.1. Support Vector Machine: Support Vector Machine is one of the supervised machine learning classification techniques that is widelyappliedin the field of cancer identification and prognosis. SVM is functioned by choosing the critical samples from all categories. These samples are called support vectors.These classes are separated by generating a linear function that divides them broadly as possible using these support vectors.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1282 Fig -1: SVM generated hyper-planes 2.2. Random Forest: RF brings along many decision trees to ensemble a forestof trees. The RF methodology relies on an algorithmic approach within which each iteration involves choosing one random sample of size N from the data set with replacement, and another random sample from the predictors without replacement. Then thedataobtained is partitioned. The other data is then dropped. These steps are repeated certain times depending on how many trees are needed. Finally, a count is made over the trees that classify the observation in one category. Cases are then classified based on a majority vote over the decision tree. Fig -2: Working of Random Forest 2.3. Naive Bayes: Naive Bayes is a subfield of probabilistic graphical models which is used for prediction [3] and representation of knowledge in uncertain domains. Naïve Bayes corresponds to a widely used structurein machinelearningknownas the directed acyclic graph (DAG). This graph consists of many nodes, each equivalent to a variable and the node edges represents direct dependence among the corresponding nodes in the graph. Fig -3: Breast cancer attributes – DAG Model 3. SIMULATION SETUP: 3.1. Data Set: Our investigation is based on the initial Wisconsin breast cancer data set that's obtained from the UCI Machine Learning Repository [5] , an online open source repository. This data set was collected over 3 years from the University of Wisconsin Hospitals by Dr. William H. Wolberg and it consists of 669 instances, where the classification results are either malignant or benign. There were 458 benign cases and 241 was malignant. The ten attributes used are: • Clump Thickness • Cell Size Uniformity • Cell shape Uniformity • Marginal Adhesion • Single epithelial cell Size• Bare Nuclei • Bland chromatin • normal Nuclei • Mitoses • class. 3.2. Training set: The classifiers are tested using the k – fold cross validation methodology. This validation technique can randomly separate the training set into k subsets where one of the k-1 subsets are used for testing and the rest for training. 10-fold cross-validation is the preferred k value utilized in most validation in ML and will be used in this paper. This implies nine subsets will be used for training of the classifier and the remaining one for the testing. This system is used to avoid over fitting of the training set, which is likely to occur in small data sets and large number of attributes. Fig -3: 10- k cross validation method 3.3. Simulation software: Here, WEKA - Waikato Environment for Knowledge Analysis [4] software is used as a Machine Learning tool. WEKA is a Java based open supply tool that was initially released to the public in 2006 undertheGNU General Public License. This tool provides many ML techniques and algorithms as well as the classification techniques that are being investigated in this paper. Alternative features include data preprocessing, clustering, feature selection evaluation and rule discovery algorithms. Datasets are accepted in many formats, like CSV and ARFF. Besides
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1283 being an open source tool, WEKA is additionally attractive due to its portability and ease of use GUI. 4. RESULT: This section describes the parameters and presents the results that assist the 3 classifiers that are being investigated in this paper. 4.1. Accuracy: The classifier accuracy is a measure of how well the classifier will properly predict cases into their correct category. It is the number of correct predictions divided by the total number of instances within the data set. It is worth noting that the accuracy is highly dependent on the threshold chosen by the classifier and may therefore change for various testing sets. Hence, accuracy may be calculated using the subsequent equation: 4.2. Recall: Recall, also commonly referredtoas sensitivity, isthe rateof the positive observations that are correctly predicted as positive. This measure is desirable, particularly in the medical field because how many of the observations are properly diagnosed. In this study, it is more important to properly identify a malignant tumor than it is to incorrectly identify a benign one. Table-1: Recall values comparison 4.3. Precision: Precision, also commonly called confidence, is the rate of both true positives and true negatives that are identified as true positives. This shows how well the classifierhandles the positive observations but doesn't say much regarding the negative ones. The precision valuesfor all 3 techniques are shown in Table-2: Table-2: Precision values comparison 4.4. ROC area: A receiver operating characteristics (ROC) graph is a way to visualize a classifiers performance by showing thetrade- off between the price and advantage of that classifier. ROC is one amongst the foremost common and useful performance measure fordata processingtechniques. Percentage values for the roc area of all 3 techniques are shown in Table-3: Table-3: ROC values 5. CONCLUSION ML techniques are widely used in the medical field and have served as a useful diagnostic tool that helps physicians in analyzing the available data as well as designing medical expert systems. This paper saysaboutthreeof the most common ML techniques which are normally used for breast cancer detection and diagnosis. They are Support Vector Machine (SVM), Random Forest (RF) and Naïve Bayes (NB). The main features and methodology of each of the 3 ML techniques was represented. Performance comparison of the investigated techniques [6] has been applied using the original Wisconsin breast cancerdata set.Simulation results obtained has proved that classification performance varies based on the strategy that is selected. Results show that SVM have the highest performance in terms of accuracy, specificity and precision. But, RF has the highest probability of properly classifying tumor. REFERENCES 1. World Health Organization - Breast Cancer: Prevention and Control, 20 Jan 2015. http:// www.who.int/cancer/detection/breastcancer/en/i nde x1.html. 2. “Breast cancerpresentation delaysamongAraband national women in the UAE, a qualitative study,” Y. Elobaid, T.-C. Aw, J. N. W. Lim, S. Hamid in Mar. 2016.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1284 3. Bayesian Node Analysis, S. Conrady and L. Jouffe, 2013. 4. Weka 3.5.6, an open source data mining software tool developed at university of Waikato, New Zealand, http://www.cs.waikato.ac.nz/ml/weka/ 2009. 5. "Multisurface method of pattern separation for medical diagnosis applied to breast cytology”, Wolberg, William H., and Olvi L. Mangasarian. 6. "Analysis of feature selection algorithms on classification: a survey", Vanaja, S., and K. Ramesh Kumar. International Journal of Computer Applications (2014).