
# 2 Receiver Operating Characteristic (ROC)

The second lecture from the Machine Learning course series. This lecture discusses the ROC metric for evaluating a machine learning model's performance; in particular, two ways of building a ROC curve are discussed. Practicals that I have designed for this course, in both R and Python, are available on my GitHub (https://github.com/skyfallen/MachineLearningPracticals). I can share the Keynote files; contact me via e-mail: dmytro.fishman@ut.ee.



1. Introduction to Machine Learning (ROC), Dmytro Fishman (dmytro@ut.ee)
2. ROCK
3. ROCK
5. (1, 0, 1, 0, 1) True labels
6. (1, 0, 1, 0, 1) True labels (0.6, 0.2, 0.7, 0.5, 0.4) Classifier predicts
7. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2)
8. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) There are as many marks on the y-axis as there are 1's in our true labels
9. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) There are as many marks on the x-axis as there are 0's in our true labels
10. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
11. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
12. (1, 1, 0, 1, 0) True labels 1 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
13. (1, 1, 0, 1, 0) True labels 1 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
14. (1, 1, 0, 1, 0) True labels 1 1 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
15. (1, 1, 0, 1, 0) True labels 1 1 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
16. (1, 1, 0, 1, 0) True labels 1 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
17. (1, 1, 0, 1, 0) True labels 1 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
18. (1, 1, 0, 1, 0) True labels 1 1 0 1 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
19. (1, 1, 0, 1, 0) True labels 1 1 0 1 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
20. (1, 1, 0, 1, 0) True labels 1 1 0 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) Go through the true labels one by one: if 1, go up; if 0, go right
21. (1, 1, 0, 1, 0) True labels 1 1 0 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) This is called the Receiver Operating Characteristic (ROC)
22. (1, 1, 0, 1, 0) True labels 1 1 0 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) This square has sides of length 1 and 1
23. (1, 1, 0, 1, 0) True labels 1 1 0 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) We need to find the area under the ROC curve
24. (1, 1, 0, 1, 0) True labels 1 1 0 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) We need to find the area under the ROC curve. AUC = 0.83
25. (1, 1, 0, 1, 0) True labels 1 1 0 1 0 (0.7, 0.6, 0.5, 0.4, 0.2) AUC = 0.83. Here is another way to do it (you cannot always count the labels yourself)
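The curve-drawing procedure above (sort the labels by descending score, step up for each 1, step right for each 0) can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture, and the function name `roc_walk` is my own:

```python
def roc_walk(y_true, y_score):
    """Trace the ROC curve by walking through labels sorted by score."""
    # Sort the true labels by descending classifier score.
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    labels = [y_true[i] for i in order]
    P = sum(labels)           # number of 1s -> total steps up
    N = len(labels) - P       # number of 0s -> total steps right
    x = y = area = 0.0
    points = [(0.0, 0.0)]
    for label in labels:
        if label == 1:
            y += 1 / P        # a 1 moves us up
        else:
            x += 1 / N        # a 0 moves us right...
            area += y / N     # ...sweeping out a column of height y under the curve
        points.append((x, y))
    return points, area

points, auc = roc_walk([1, 0, 1, 0, 1], [0.6, 0.2, 0.7, 0.5, 0.4])
print(round(auc, 2))  # 0.83
```

Sorting by score turns the labels (1, 0, 1, 0, 1) into (1, 1, 0, 1, 0), the order used on the slides; the walk then visits exactly the plotted points, and the accumulated area is 5/6 ≈ 0.83.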
26. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) TPR vs FPR
27. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) TPR = TP/P
28. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) TPR = TP/P, FPR = FP/N
29. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) TPR = TP/P, FPR = FP/(FP + TN)
30. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) TPR = TP/P, FPR = FP/(FP + TN)
31. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) We would like to evaluate different strictness levels of our classifier
32. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) What if we consider as positive (1) only the instances predicted positive with probability >= 0.7?
33. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) What if we consider as positive (1) only the instances predicted positive with probability >= 0.7?
34. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) What would TPR and FPR be in this case?
35. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.7: TPR = ?, FPR = ?
36. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.7: TPR = 1/3, FPR = 0/(0 + 2)
37. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.7: TPR = 1/3, FPR = 0
38. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) Let's plot this point (FPR = 0, TPR = 1/3) on the graph
39. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) We shall do this procedure for all possible thresholds
40. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) How about TPR and FPR for >= 0.6? TPR = ?, FPR = ?
41. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.6: TPR = 2/3, FPR = 0
42. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.5: TPR = ?, FPR = ?
43. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) Oops, this is a false positive!
44. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.5: TPR = 2/3, FPR = 1/2
45. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) And so on…
46. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) >= 0.4: TPR = 3/3, FPR = 1/2; >= 0.2: TPR = 3/3, FPR = 2/2
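Each threshold row above can be recomputed directly from the definitions TPR = TP/P and FPR = FP/(FP + TN). A minimal sketch (the helper name `tpr_fpr_at_thresholds` is my own, not from the lecture):

```python
def tpr_fpr_at_thresholds(y_true, y_score):
    """Compute (threshold, TPR, FPR) for every distinct score threshold."""
    P = sum(y_true)          # total positives
    N = len(y_true) - P      # total negatives
    rows = []
    for t in sorted(set(y_score), reverse=True):
        # Predict 1 for every instance scored at or above the threshold.
        pred = [1 if s >= t else 0 for s in y_score]
        tp = sum(p == 1 and y == 1 for p, y in zip(pred, y_true))
        fp = sum(p == 1 and y == 0 for p, y in zip(pred, y_true))
        rows.append((t, tp / P, fp / N))
    return rows

for t, tpr, fpr in tpr_fpr_at_thresholds([1, 1, 0, 1, 0],
                                         [0.7, 0.6, 0.5, 0.4, 0.2]):
    print(f">= {t}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

The first row reproduces TPR = 1/3, FPR = 0 for the >= 0.7 threshold, and the last row reaches (TPR, FPR) = (1, 1); plotting the rows traces the same curve as the label-walk method.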
47. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) AUC = 0.83. AUC is considered a more adequate performance measure than accuracy
48. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) AUC = 0.5. An AUC of 0.5 means a random guess
49. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) AUC = 1. An AUC of 0.5 means a random guess; an AUC of 1 means perfect classification
50. (1, 1, 0, 1, 0) True labels (0.7, 0.6, 0.5, 0.4, 0.2) AUC = 1. An AUC of 1 means perfect classification… overfitting 🙄
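One way to see why an AUC of 0.5 corresponds to random guessing: AUC can equivalently be computed as the probability that a randomly chosen positive instance is scored above a randomly chosen negative one. A small sketch of this pairwise view (the function name `auc_pairwise` is my own):

```python
from itertools import product

def auc_pairwise(y_true, y_score):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    # Count a full win when the positive outscores the negative,
    # and half a win for a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# 5 of the 6 positive-negative pairs are ordered correctly -> 5/6 ≈ 0.83
print(auc_pairwise([1, 1, 0, 1, 0], [0.7, 0.6, 0.5, 0.4, 0.2]))
```

A random scorer wins each pair with probability 0.5, giving AUC = 0.5; a scorer that ranks every positive above every negative wins all pairs, giving AUC = 1.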
51. References:
    • Machine Learning by Andrew Ng (https://www.coursera.org/learn/machine-learning)
    • Introduction to Machine Learning by Pascal Vincent, given at the Deep Learning Summer School, Montreal 2015 (http://videolectures.net/deeplearning2015_vincent_machine_learning/)
    • Welcome to Machine Learning by Konstantin Tretyakov, delivered at the AACIMP Summer School 2015 (http://kt.era.ee/lectures/aacimp2015/1-intro.pdf)
    • Stanford CS class: Convolutional Neural Networks for Visual Recognition by Andrej Karpathy (http://cs231n.github.io/)
    • Data Mining Course by Jaak Vilo at the University of Tartu (https://courses.cs.ut.ee/MTAT.03.183/2017_spring/uploads/Main/DM_05_Clustering.pdf)
    • Machine Learning Essential Concepts by Ilya Kuzovkin (https://www.slideshare.net/iljakuzovkin)
    • From the brain to deep learning and back by Raul Vicente Zafra and Ilya Kuzovkin (http://www.uttv.ee/naita?id=23585&keel=eng)
52. www.biit.cs.ut.ee www.ut.ee www.quretec.ee
53. You guys rock!