Week 11: Programming for Data Analysis
1. Programming for Data Analysis (Week 11)
Dr. Ferdin Joe John Joseph
Faculty of Information Technology
Thai – Nichi Institute of Technology, Bangkok
2. Today’s lesson
• Binary Classification
• Naïve Bayes Classifier
• Support Vector Machine
3. Naïve Bayes Classifier
• Conditional Probability Model of Classification
4. Conditional Probability Model of Classification
• The conditional probability can, in principle, be calculated from the joint
probability, but doing so directly is intractable for all but the smallest problems.
• Bayes Theorem provides a principled way for calculating the
conditional probability.
5. Bayes Theorem
• P(A|B) = P(B|A) * P(A) / P(B)
• We can frame classification as a conditional probability problem and solve it
with Bayes Theorem as follows:
P(yi | x1, x2, …, xn) = P(x1, x2, …, xn | yi) * P(yi) / P(x1, x2, …, xn)
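As a quick numerical illustration of the theorem itself, the sketch below works through a classic diagnostic-test example. All of the probabilities are hypothetical numbers chosen for the example, not values from the slides:

```python
# Bayes Theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical example: probability of having a disease given a positive test.
p_disease = 0.01            # prior P(A): 1% of the population has the disease
p_pos_given_disease = 0.99  # likelihood P(B|A): test sensitivity
p_pos_given_healthy = 0.05  # false positive rate

# P(B) via the law of total probability
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))  # 0.1667
```

Even with a 99%-sensitive test, the posterior is only about 17%, because the prior P(A) is small; this is exactly the kind of reweighting Bayes Theorem captures.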
6. Naïve Bayes
• Naïve Bayes simplifies the calculation of this conditional probability.
• Applying Bayes Theorem directly treats each input variable as dependent
upon all of the other variables, which is a cause of complexity in the calculation.
• Naïve Bayes removes this complexity by assuming that the input variables are
independent of one another, given the class.
7. Calculation of Prior and Conditional
Probabilities
• P(yi) = (number of examples with class yi) / (total number of examples)
• In the case of categorical variables, such as counts or labels, a
multinomial distribution can be used.
• If the variables are binary, such as yes/no or true/false, a binomial
distribution can be used.
• If a variable is numerical, such as a measurement, often a Gaussian
distribution is used.
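The prior formula above can be computed directly from a list of class labels. The labels below are hypothetical, used only to show the counting:

```python
from collections import Counter

# Hypothetical class labels for a small training set
labels = ["spam", "ham", "ham", "spam", "ham", "ham"]

counts = Counter(labels)
total = len(labels)

# Prior P(yi) = examples with class yi / total examples
priors = {cls: n / total for cls, n in counts.items()}
print(priors)
```

Here P(spam) = 2/6 and P(ham) = 4/6; a Naïve Bayes classifier combines these priors with the per-feature conditional probabilities.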
8. Naïve Bayes Distribution
• Binomial Naïve Bayes: Naïve Bayes that uses a binomial distribution (also
known as Bernoulli Naïve Bayes).
• Multinomial Naïve Bayes: Naïve Bayes that uses a multinomial
distribution.
• Gaussian Naïve Bayes: Naïve Bayes that uses a Gaussian distribution.
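Each of the three variants is available as a separate class in scikit-learn (assumed here; the slides' own code is not reproduced in this excerpt). Note that scikit-learn names the binary-feature variant `BernoulliNB`. A minimal sketch on randomly generated data:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)            # binary class labels

X_bin = rng.integers(0, 2, size=(100, 4))   # binary features  -> BernoulliNB
X_cnt = rng.integers(0, 10, size=(100, 4))  # count features   -> MultinomialNB
X_num = rng.normal(size=(100, 4))           # real features    -> GaussianNB

for model, X in [(BernoulliNB(), X_bin),
                 (MultinomialNB(), X_cnt),
                 (GaussianNB(), X_num)]:
    model.fit(X, y)
    print(type(model).__name__, model.predict(X[:2]))
```

The point is the pairing of distribution to feature type, not the accuracy: on random labels none of these models can do better than chance.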
20. Lab Exercise
• Use this source code and produce a classification report that gives
accuracy, precision, recall and F1-score
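The source code referred to above is not reproduced in this excerpt. As one possible starting point, the sketch below assumes scikit-learn and its built-in breast-cancer dataset (a binary classification task), and shows where `classification_report` supplies precision, recall and F1-score:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = GaussianNB().fit(X_train, y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)
# classification_report prints precision, recall and F1-score per class
print(classification_report(y_test, y_pred))
```

The same report pattern works for any classifier that produces predicted labels.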
22. SVM Objective
• A training set, S, for an SVM comprises m samples.
• The features, x, consist of real numbers and the classifications, y,
must be -1 or 1.
23. SVM Hyperplane
• The SVM hyperplane is defined by the weight vector, w, and the bias, b, as:
w · x + b = 0
24. Example for Two Features
• The hyperplane for two features can be written as:
w1x1 + w2x2 + b = 0
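The sign of w1x1 + w2x2 + b tells you which side of the hyperplane a point falls on, which is exactly the SVM decision rule. The weight vector and bias below are hypothetical values chosen for illustration:

```python
import numpy as np

# Hypothetical hyperplane parameters for two features: w1*x1 + w2*x2 + b = 0
w = np.array([2.0, -1.0])
b = -0.5

def classify(x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([1.0, 0.0])))  # 2*1 - 1*0 - 0.5 = +1.5 -> +1
print(classify(np.array([0.0, 1.0])))  # 2*0 - 1*1 - 0.5 = -1.5 -> -1
```

This matches the earlier requirement that the class labels y be -1 or 1.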
37. Lab Exercise
• Create a confusion matrix and a classification report for the SVM
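The slides' own SVM code is not reproduced in this excerpt. As one possible approach, the sketch below assumes scikit-learn's `SVC` on the built-in breast-cancer dataset, with feature scaling (SVMs are sensitive to feature scale), and produces both a confusion matrix and a classification report:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Scale features, then fit a linear-kernel SVM
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
print(cm)  # rows: true class, columns: predicted class
print(classification_report(y_test, y_pred))
```

The diagonal of the confusion matrix counts correct predictions; the off-diagonal entries are the false positives and false negatives that precision and recall summarise.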
38. DSA 207 – Binary Classifier