This document discusses and compares three common machine learning classification tools: Naive Bayesian, logistic regression, and artificial neural networks. It analyzes their performance on a diabetes dataset containing 768 instances with 8 attributes. The artificial neural network achieved the highest accuracy at 79.7%, followed by logistic regression at 78.3%, and then Naive Bayesian at 76.3%. In conclusion, for complex datasets with many variables, artificial neural networks may perform better than traditional models like Naive Bayesian and logistic regression due to their ability to learn complex patterns from large amounts of data.
Classification ANN
1. Classification Tools &
Artificial Neural Network
Vinaytosh Mishra
B.Tech (ECE), IIT (BHU)
MBA, IMNU, Ahmedabad
PG Diploma in Statistics & Computing,
Institute of Science, BHU
Specialization in Digital Marketing,
University of Illinois, Urbana-Champaign, USA
2. Agenda
Introduction of Machine Learning
Type of Machine Learning
Type of Classification Tools
Naïve Bayesian
Logistic Regression
Artificial Neural Networks
Comparison of the three methods
Results
Conclusions
3. A Few Quotes
“A breakthrough in machine learning would be worth
ten Microsofts” (Bill Gates, Chairman, Microsoft)
“Machine learning is the next Internet”
(Tony Tether, Director, DARPA)
“Machine learning is the hot new thing”
(John Hennessy, President, Stanford)
“Web rankings today are mostly a matter of
machine learning” (Prabhakar Raghavan, Dir.
Research, Yahoo)
“Machine learning is going to result in a real
revolution” (Greg Papadopoulos, CTO, Sun)
4. So What Is Machine Learning?
Automating automation
Getting computers to program
themselves
Writing software is the bottleneck
Let the data do the work instead!
6. Types of Learning
Supervised (inductive) learning
Training data includes desired outputs
Unsupervised learning
Training data does not include desired outputs
Semi-supervised learning
Training data includes a few desired outputs
Reinforcement learning
Rewards from sequence of actions
7. Inductive Learning
Given examples of a function (X, F(X))
Predict function F(X) for new examples X
Discrete F(X): Classification
Continuous F(X): Regression
F(X) = Probability(X): Probability estimation
9. Naïve Bayesian
The Naive Bayesian classifier is based on
Bayes theorem with independence
assumptions between predictors.
A Naive Bayesian model is easy to build,
with no complicated iterative parameter
estimation which makes it particularly
useful for very large datasets.
Despite its simplicity, the Naive Bayesian
classifier often does surprisingly well and is
widely used because it often outperforms
more sophisticated classification methods.
10. How it works?
Using Bayes' theorem, the classifier computes the
posterior probability of each class C given the
predictors X = (x1, ..., xn):
P(C | X) = P(C) P(x1 | C) ... P(xn | C) / P(X),
where the likelihood factorizes into per-predictor
terms because of the independence assumption.
An instance is assigned to the class whose posterior
probability is highest (or exceeds a chosen threshold).
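The posterior computation described above can be sketched in a few lines of plain Python. This is an illustrative toy (made-up glucose/BMI categories, not the Pima diabetes data), with Laplace smoothing added so unseen predictor values do not zero out a posterior:

```python
# Minimal sketch of a Naive Bayes classifier for categorical predictors.
# Toy data and variable names are assumptions for illustration only.
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Estimate class priors P(C) and per-predictor likelihoods P(x_i | C)."""
    priors = Counter(labels)
    n = len(labels)
    # counts[cls][i][value] = how often predictor i takes `value` in class cls
    counts = defaultdict(lambda: defaultdict(Counter))
    for row, cls in zip(rows, labels):
        for i, value in enumerate(row):
            counts[cls][i][value] += 1

    def posterior(row):
        scores = {}
        for cls, prior_count in priors.items():
            p = prior_count / n  # prior P(C)
            for i, value in enumerate(row):
                # Laplace (add-one) smoothing, assuming 2 values per predictor
                p *= (counts[cls][i][value] + 1) / (prior_count + 2)
            scores[cls] = p  # unnormalized posterior: P(C) * prod P(x_i | C)
        return scores

    return posterior

# Toy data: (glucose level, BMI category) -> diabetic?
rows = [("high", "obese"), ("high", "normal"), ("low", "normal"), ("low", "obese")]
labels = ["yes", "yes", "no", "no"]
posterior = train_naive_bayes(rows, labels)
scores = posterior(("high", "obese"))
pred = max(scores, key=scores.get)  # pick the class with the highest posterior
print(pred)  # → yes
```

Dividing by P(X) is skipped because it is the same for every class, so the argmax is unchanged.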
11. Logistic Regression
In supervised learning, logistic regression is a
regression model in which the dependent variable (DV)
is categorical.
Logistic regression is widely used in many fields,
including the medical and social sciences.
Many risk prediction models based on logistic
regression have been developed to predict whether a
patient has a given disease, such as diabetes or coronary
heart disease, from observed characteristics of the
patient such as age, sex, body mass index, and the results
of various blood tests and anthropometric tests.
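A logistic regression model passes a linear combination of the predictors through the logistic (sigmoid) function to get a probability. A minimal sketch, where the coefficients are invented for illustration rather than fitted to the Pima data:

```python
# Minimal sketch of logistic-regression prediction. The weights and bias
# below are hypothetical, not estimated from any real dataset.
import math

def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """P(class = 1 | x) under a logistic regression model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

# Hypothetical model: two standardized predictors (plasma glucose, BMI).
weights = [1.2, 0.8]
bias = -0.5
p = predict_proba([1.0, 0.5], weights, bias)  # z = -0.5 + 1.2 + 0.4 = 1.1
label = 1 if p >= 0.5 else 0                  # threshold at 0.5
print(round(p, 3), label)
```

Fitting the weights is typically done by maximum likelihood, which requires an iterative solver; prediction itself is just this one formula.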
17. Neural Networks: Training
Presenting the network with sample data and
modifying the weights to better approximate
the desired function.
Supervised Learning
Supply network with inputs and desired outputs
Initially, the weights are randomly set
Weights modified to reduce difference between
actual and desired outputs
Back propagation
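The training loop above (random initial weights, then repeated adjustments that shrink the gap between actual and desired outputs) can be sketched for a single sigmoid unit; a full backpropagation network applies the same delta-rule idea layer by layer. The AND-function data and hyperparameters are assumptions chosen for a compact demo:

```python
# Minimal sketch of supervised neural-network training: start with random
# weights, then use gradient descent (the delta rule) to reduce the squared
# error between actual and desired outputs. One sigmoid unit, logical AND.
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training data: inputs and desired outputs for logical AND.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

random.seed(0)
w = [random.uniform(-1, 1) for _ in range(2)]  # initially, weights are random
b = random.uniform(-1, 1)
lr = 0.5  # learning rate

def mse():
    """Mean squared error between actual and desired outputs."""
    return sum((sigmoid(w[0] * x0 + w[1] * x1 + b) - y) ** 2
               for (x0, x1), y in data) / len(data)

before = mse()
for _ in range(5000):
    for (x0, x1), y in data:
        out = sigmoid(w[0] * x0 + w[1] * x1 + b)
        # gradient of the squared error through the sigmoid (delta rule)
        delta = (out - y) * out * (1 - out)
        w[0] -= lr * delta * x0
        w[1] -= lr * delta * x1
        b -= lr * delta
after = mse()
print(before, "->", after)  # error shrinks as training proceeds
```

With hidden layers, backpropagation computes the same kind of delta for each hidden unit by propagating the output error backwards through the weights.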
19. Comparison of three methods
Dataset: Pima Indians Diabetes Database, National Institute
of Diabetes and Digestive and Kidney Diseases
(8 attributes, 768 instances)

S/N  Attribute                          Code
1    Number of times pregnant           NPG
2    Plasma glucose concentration       PGL
3    Diastolic blood pressure (mm Hg)   DIA
4    Triceps skin fold thickness (mm)   TSF
5    2-Hour serum insulin               INS
6    Body mass index (kg/m2)            BMI
7    Diabetes pedigree function         DPF
8    Age (years)                        AGE
9    Class                              CLASS

Method                  Naïve Bayesian   Logistic Regression   ANN (8-6-2)
Accuracy of Prediction  76.3%            78.3%                 79.7%
20. Result & Conclusion
As the results suggest, the artificial neural network based
prediction model is better than traditional models such as
Naïve Bayesian and logistic regression. The difference in
reported accuracy among the models discussed is small,
but with an increasing number of variables the ANN may
emerge as the clear winner.
Advances in database management technologies have
enabled us to practice evidence-based medicine.
Technologies such as cloud computing and Hadoop have
made it easy to manage and share data. Advanced
classification tools are more accurate and can be applied
to larger databases to classify disease more accurately.
21. References
Ramachandran A. Socio-economic burden of diabetes in India. July 2007, Vol. 55.
Bhansali A. Cost of diabetes care: prevent diabetes or face catastrophe. JAPI, February 2013, Vol. 61.
International Diabetes Federation. IDF Diabetes Atlas, 5th edn. Brussels: International Diabetes Federation, 2011.
Li G, Zhang P, Wang J, et al. The long-term effect of lifestyle interventions to prevent diabetes in the China Da Qing Diabetes Prevention Study: a 20-year follow-up study. Lancet 2008; 371: 1783–89.
Noble D, Mathur R, Dent T, Meads C, Greenhalgh T. Risk models and scores for type 2 diabetes: systematic review. BMJ 2011; 343: d7163.
Buijsse B, Simmons RK, Griffin SJ, Schulze MB. Risk assessment tools for identifying individuals at risk of developing type 2 diabetes. Epidemiol Rev 2011; 33: 46–62.
Lindstrom J, Tuomilehto J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 2003; 26: 725–31.
Collins GS, Mallett S, Omar O, Yu LM. Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting. BMC Med 2011; 9: 103.
Diabetes Care, January 2002, Vol. 25, Suppl. 1, s21–s24.
Freedman DA (2009). Statistical Models: Theory and Practice. Cambridge University Press, p. 128.
Boyd CR, Tolson MA, Copes WS (1987). Evaluating trauma care: the TRISS method. Trauma Score and the Injury Severity Score. The Journal of Trauma 27(4): 370–378.
Truett J, Cornfield J, Kannel W (1967). A multivariate analysis of the risk of coronary heart disease in Framingham. Journal of Chronic Diseases 20(7): 511–24.
Rish I. An empirical study of the naive Bayes classifier. http://www.cc.gatech.edu
Model Extremely Complex Functions, Neural Networks (2015).
Patterson DW. Artificial Neural Networks: Theory and Applications. Prentice Hall, pp. 141–243, 1996.