How to Lie with Statistics…
Information Security Edition
#circlecitycon
About me
Contact me:
@tdmv
…1954 Edition
“There is terror in numbers. Perhaps we suffer from a trauma induced by grade-school arithmetic.”
Darrell Huff
Survey
Says!
9 out of 10 Households Agree
That Surveys Are Bad
Components of a Survey
The Wrath of Graphs
Security Incidents in 2017
Lost/stolen laptops 12
Lost/stolen mobile devices 40
Hacking 3
Payment card fraud 21
Unintended disclosure 10
[Figures: four pie charts of the Security Incidents in 2017 data, each with slices for Lost/stolen laptops, Lost/stolen mobile devices, Hacking, Payment card fraud, and Unintended disclosure]
Source: Yougov.co.uk
Source: Godaddy.com
Source: Reddit.com | You had one job!
Source: http://junkcharts.typepad.com/.a/6a00d8341e992c53ef016300bba0da970d-pi
Source: MacWorld 2008 Keynote
Which line is longer?
HINT: It’s a trick question
Source: http://marchelassociates.com/data-breach-causes/
Data Breach Causes
[Figures: two charts of "Malware Infections: 2017", one with an axis running from 400 to 520 and one with an axis running from 0 to 600]
Source: Photo, TheVerge.com
Source: Data, Apple.com Photo, TheVerge.com; extra visualizations, qz.com
Reported Social Engineering Attempts, 2013 - 2017
[Figures: two charts covering 2013 through 2017, one with an axis running from 200 to 290 and one with an axis running from 0 to 300]
Source: https://www.techradar.com/reviews/amd-ryzen-threadripper-1950x
Source: https://www.techradar.com/reviews/amd-ryzen-threadripper-1950x
The Semi-Attached Figure
with Retsyn!
Did you know…
Source: Kaspersky Security Bulletin 2016
• 80% of SMBs who pay the ransom get their data back
4 out of 5 SMBs who pay the ransom always get their data back
You can’t imply causation with correlation!
Number of people who drowned by falling into a swimming pool
correlates with
Number of films Nicolas Cage has appeared in
Source: Spurious Correlations; www.tylervigen.com
Reported Lost/Stolen Mobile devices and Number of users that completed security awareness training
[Figure: 2017 chart with two series, "Lost/Stolen Mobile Devices" and "# users completed security training"; axes run from 0 to 60 and from 0 to 900; data labels 10, 12, 10, 11, 13, 20, 25, 40, 45, 41, 40, 50]
Conclusion:
Statisticulation
Further Reading
• “How to Lie with Statistics” by Darrell Huff
• “The Visual Display of Quantitative Information” by Edward Tufte
• “How to Measure Anything” by Douglas Hubbard

How to Lie with Statistics, Information Security Edition

Editor's notes

  1. [Ad lib intro to the audience] Let me ask you a question. Have you ever read a risk assessment or a vendor security report on the latest security threat and felt that something was a little off, that the numbers didn't add up? But why question it, right? The authors are experts, right? They wouldn’t be trying to… manipulate me, would they? Or would they?
  2. Before we get too deep, let me introduce myself. My name is Tony Martin-Vegue and I’ve been in the IT and InfoSec field for a little over 20 years. I currently work at Lending Club, a San Francisco-based FinTech company. I’m the company’s enterprise security strategist, which basically means I solve really hard security problems with math, econ, decision science and, of course, some luck. Please feel free to reach out to me over Twitter.
  3. How to Lie with Statistics was written in 1954 by Darrell Huff. Huff was the editor of Better Homes and Gardens in the '40s and '50s and had a lifelong passion for statistics. Ironically, even though he was not a statistician, his book is the best-selling statistics book of all time. Through it, he introduced the general public to common ways that statistics are used to manipulate the facts. An example: I would imagine everyone here has seen Mad Men. Back in 1954 it was the ad men on Madison Ave. that really drew Huff’s ire. He saw some of the claims they made, the misuse of surveys like “9 out of 10 doctors prefer Newport cigarettes” and other tricks with numbers and fact bending, and he exposed a lot of this in his book. What I've done here is take the foundation that the book set over 60 years ago and look at it from an information security perspective. I think you will agree with me that Huff's work is just as relevant today as it was then. In this presentation we’re going to go over some classic ways that numbers and statistics are used to bend the truth. First up to bat:
  4. Surveys. 9 out of 10 households agree that surveys are bad. Which households? Surveys are very common in the information security space. Who doesn't love infographics? They are a very effective way to convey complex information in very few words. Infographics are also the easiest and most common way to steer readers to a particular conclusion. What is a survey? A survey is a poll: asking a small group of people a particular question, then extrapolating the results and applying them to the general population. For example, I read a survey that said that 59% of CISOs experienced cyber incidents in which the attackers were able to defeat their defenses. Now, the makers of the survey didn’t poll all CISOs. They polled a portion, a sample as it’s called, and extrapolated a generality. Surveys are very common in vendor-sponsored security reports. In fact, MOST information security reports we read are built solely on the results of surveys. And as security professionals, we take these reports and read them, learn from them, and quote them in steering committee meetings or to senior execs when they ask questions. You may be asked by your CISO to quantify data about a risk, and the easiest way is through a report based on a survey. It’s hard to find good data and you’re excited when you find it.
  5. Let’s take a look at what surveys are. Surveys, such as Gallup polls, seem simple on the surface, but they are actually very hard to do correctly. The science behind surveys is rooted in math and statistics. You need your survey to be statistically sound. There are three main components of a statistically sound survey. Population: What is the group you are studying? How big is the group? Sample size: The size of the group you are surveying. You can’t study the entire population, so you study a sample. A good survey taker will do all they can to ensure the sample is as representative of the population as possible. Confidence interval: The margin of error (e.g. +/-); the larger the sample size, the lower the margin of error. That is a statistically sound survey. So what happens when a survey isn’t statistically sound?
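To make the link between sample size and margin of error concrete, here is a minimal sketch using the standard normal-approximation formula for a proportion, z * sqrt(p(1 - p) / n). It assumes a simple random sample and a 95% confidence level (z = 1.96). The 59% figure is the CISO survey result mentioned earlier; the sample sizes are hypothetical, since that survey's sample size is not given here.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion.

    Assumes a simple random sample; z = 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be a proportion between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical sample sizes: quadrupling n halves the margin of error.
for n in (100, 400, 1600):
    moe = margin_of_error(0.59, n)
    print(f"n={n:>5}: 59% +/- {moe * 100:.1f} points")
```

A small margin of error only describes sampling noise. It says nothing about whether the sample was representative in the first place.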
  6. You get bias. When referring to surveys, one would say that a statistic is biased when it is systematically different from the population being studied. There are many forms of bias found in statistics and, by extension, in surveys. Let’s look at probably the most common form, selection bias. Selection bias occurs when some individuals are more likely to be selected, or more likely to participate in a survey, than others, biasing the survey. An example of this: You are a vendor and you want to do a study on the impact of DDoS attacks at companies, so you buy an email list and send a survey out to a million people. You tell the would-be survey takers that they HAVE to be in the information security profession to complete the survey, and that if they do, they will get a $50 Amazon gift card. How many of you have received this email? I receive about one a week. The surveys come back, you compile the results, and 89% of information security professionals think that distributed denial of service attacks are the biggest threat to their business today. No, not 89% of information security professionals. It's 89% of the people who actually clicked on the survey link in your spammy email and self-identified as information security professionals, regardless of their true profession. Additionally, if the survey is offering to enter respondents into a drawing for a $50 Amazon gift card, can the results really be trusted? Are people going in and literally clicking on anything just to finish the survey? This is selection bias. I want to show you a real-life example of this. Everyone is familiar with the Ponemon Institute. They put out an annual survey on the cost of a data breach. This report has to be the most-quoted security report of all time. I hear it almost weekly at my company, I hear it on podcasts, I see it in risk analyses. It’s everywhere. But let’s take some of the concepts we’ve reviewed and see how they apply to this report.
Now, I have the report here, and in the back they disclose non-response and sampling-frame bias. There’s also another disclosure, interestingly titled “non-statistical results.” Let me read you one sentence, and I am quoting: [reads sentence from the report] I’m just going to pause there for a moment and let that soak in. Jack Daniel, founder of BSides, famously tweeted that friends don’t let friends quote Ponemon. How to spot survey problems: The biggest red flag is that a survey methodology isn’t stated at all. I can say this about the Ponemon Institute: they do clearly describe their methods and limitations. The problem is that people don’t read the back. Second, no margin of error. If a margin of error, or confidence interval, isn’t stated, the results are probably non-statistical. Last, vendor sponsorship. This is not always a problem. Some very, very good, solid reports, such as the Verizon Data Breach Investigations Report, come from a security vendor. But when you see something from someone trying to sell you something, pay extra close attention.
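The selection-bias mechanism in the DDoS survey example can be sketched with a little arithmetic. The function below computes the share of "yes" answers you would expect among respondents when people who hold an opinion respond at a different rate than people who don't. Every number here is hypothetical and chosen only for illustration: a true share of 20% and the response rates are assumptions, not figures from any real survey.

```python
def observed_share(true_share: float, rate_yes: float, rate_no: float) -> float:
    """Expected share of 'yes' answers among respondents.

    true_share: fraction of the whole population that would answer 'yes'
    rate_yes:   probability that a 'yes' person responds to the survey
    rate_no:    probability that a 'no' person responds to the survey
    """
    yes_respondents = true_share * rate_yes
    no_respondents = (1.0 - true_share) * rate_no
    if yes_respondents + no_respondents == 0:
        raise ValueError("nobody responds, so there is nothing to measure")
    return yes_respondents / (yes_respondents + no_respondents)


TRUE_SHARE = 0.20  # hypothetical: 20% of the population really thinks DDoS is the top threat

# Everyone is equally likely to respond: the survey recovers the true share.
print(round(observed_share(TRUE_SHARE, rate_yes=0.02, rate_no=0.02), 3))

# People who care about DDoS are ten times more likely to click the link.
print(round(observed_share(TRUE_SHARE, rate_yes=0.10, rate_no=0.01), 3))
```

Under these assumed rates, a 20% opinion shows up as roughly 71% of responses, and no amount of extra sample size corrects it, because the skew is in who answers rather than in how many.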
  7. Let’s move on to the most common way statistics mislead people, mostly unintentionally in this category, but sometimes on purpose. This method is data visualization: taking numbers that might not mean much to the human eye as a list and representing them visually in order to help tell the story. There are many ways to visualize data: bars, pies, lines, area, radar, scatter. The list goes on. In this section, I’m going to focus on the most common forms of data visualization in businesses.
  8. First, we’re going to look at pie charts, the most common form of data viz. If you were to ask data viz experts, it also happens to be the most hated. The reason they’re hated is that pie charts are very difficult to use properly because they distort reality. Here is a data set, security incidents in 2017, displayed as data only, in a table view. Pretty straightforward: we have 5 data points to visualize. Let’s use Excel and throw this into a 2D pie chart.
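Before any chart is drawn, it helps to know what share each slice should honestly occupy. The sketch below takes the five counts from the Security Incidents in 2017 table and converts them to percentages of the whole; the function name `shares` is just an illustrative helper, and this is plain Python rather than anything Excel does internally.

```python
# Counts taken from the "Security Incidents in 2017" table on the slide.
incidents = {
    "Lost/stolen laptops": 12,
    "Lost/stolen mobile devices": 40,
    "Hacking": 3,
    "Payment card fraud": 21,
    "Unintended disclosure": 10,
}


def shares(counts: dict) -> dict:
    """Return each category's percentage of the total."""
    total = sum(counts.values())
    return {label: 100.0 * value / total for label, value in counts.items()}


for label, pct in shares(incidents).items():
    print(f"{label:<28}{incidents[label]:>4}{pct:>7.1f}%")
```

With these figures as a baseline, any rendering in which the Hacking slice looks much larger or smaller than its computed share is distorting the data, whatever the chart options were.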
  9. Pretty straightforward. This is the same data we just saw in the table. This chart makes the assumption that we know about all security incidents that happened last year, and it is a graphic representation of the types of incidents. So how could we manipulate this to tell a different story without changing any of the actual numbers? Let’s look.
  10. I did a few subtle, sneaky things to change the story here. I didn’t change the underlying data at all, but I’ve decided I want to under-emphasize hacking incidents. I would rather shift the conversation to lost/stolen laptops and mobile devices. I did three things. First, I converted the 2D pie to a 3D pie. This is the best way to emphasize the data in front, because the 3D shading makes the green and red slices appear bigger than the slices in the back, which do not have 3D shading. Next, I modified the X axis. This essentially spins the pie: I moved the purple slice, Hacking, to the back. This removes its 3D shading. This not only makes it LOOK smaller than it should be, proportionally. IT IS SMALLER than it should be. Lastly, and this is very sneaky, I removed the data labels, the numbers next to each slice. I’ve removed your frame of reference and the scale. You have no idea what each slice represents. You have to trust your eyes alone to tell you how much of the whole each slice makes up. Let’s take it a little further. Let’s say I REALLY don’t want you to ask me about hacking. I’d rather have a root canal.
  11. If a little manipulation is good, then a lot must be better. I’m going to move the Y axis. This changes the 3D rotation of the pie and makes it look like you are viewing it from the side. The purple slice is way in the back and, again, I did not change any of the underlying data. This is done purely by changing the chart's 3D options in Excel. This looks ridiculous, right? But how many of you have seen this ridiculousness in a business setting? I have.
  12. One last pie. I changed my mind. I want to talk about hacking now. I want to overemphasize a slice. Remember, “Hacking” is only about 3% of incidents, but I made it look much bigger. I converted the chart to a 3D exploded pie, changed the Y axis to move the purple slice up front, and played with the 3D perspective to make the purple slice as big as possible. I actually pulled out a ruler and measured the surface area on my monitor, and this was as big as I could make it. I completely changed the story without changing the underlying data. This raises the question: should we ever be using pie charts at all? There are some very strong opinions on the subject.
  13. Walter Hickey is a journalist and, among many things, he works for Nate Silver’s 538.
  14. Edward Tufte (pronounced “tuff-tee”), visualization pioneer and statistician. Both he and Walter Hickey think that one should never use pie charts. I am not as much of a pie chart hater as these guys, but I do use them sparingly. If you are going to use pie charts, keep the following five things in mind.
  15. #1: Pie charts represent the whole of something, 100%, so they are most appropriate when communicating ratios. If you do not know what the whole is, or if your data doesn’t really support being visualized as percentages of a whole, consider a bar or line chart. This chart is not a ratio; it’s a data viz of multiple-choice answers. The slices add up to way over 100%. I would use a bar chart here.
  16. #2: Data represented in text should match the dataviz, in terms of scale and proportion between data points. A donut chart is a form of pie chart. These numbers are different (38, 23, 69), but the green donuts are exactly the same.
  17. Here’s another example of the data not matching the visualization.
  18. Tip #3: Math. Your percentages have to add up to 100%. This is also a 3D pie. The green 60% looks as big as, if not slightly bigger than, the red slice at 70%. This adds up to 193%.
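Tip #3 is easy to automate before a chart ever leaves your desk. A minimal sketch; `check_pie` and its `tolerance` parameter are illustrative names, and in the sample data only the 60 and 70 come from the slide, while the 63 is a made-up filler so the total lands on the 193% mentioned above:

```python
def check_pie(percentages, tolerance=0.5):
    """Return (ok, total); ok is True when the slices sum to about 100%.

    tolerance allows for rounding in the published figures.
    """
    total = sum(percentages)
    return abs(total - 100) <= tolerance, total


if __name__ == "__main__":
    # 70 and 60 are from the slide; 63 is a hypothetical third slice.
    ok, total = check_pie([70, 60, 63])
    print(f"valid pie: {ok}, slices sum to {total}%")
```

If the check fails, the data is probably a multiple-choice result or a set of independent rates, and a bar chart is the safer choice.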
  19. #4: Limit the number of different datasets you are comparing to 3 or 4. With more than four, the reader has a difficult time making visual comparisons. In this chart, it is impossible to make comparisons visually.
  20. #5: There is no professional reason to ever, ever use a 3D pie chart. It’s the worst. If pie charts are the Nickelback of data visualization, then 3D pies are the Milli Vanilli of pie charts. This is from Steve Jobs’s 2008 keynote at Macworld. Look at Jobs’s chart: 19.5% takes up more visual real estate than 21.2%. The green slice, Apple’s market share, is measurably bigger. In a 3D pie chart, the items up front will always look bigger than the items in the back.
  21. Let me show you why. Which line is longer, the top or the bottom? Here’s a hint: it’s a trick question. This is an optical illusion. Both are the same length, but THESE end lines create a framing effect that tricks the mind into perceiving them differently.
  22. Same thing here. This 3D pie is an optical illusion. The red takes up measurably more surface area than the orange. They’re both 19%, but the red is much bigger. You can take out a tape measure and check: it takes up more surface area.
  23. Let’s look at line graphs. I personally really like line and bar graphs when trying to visualize fairly simple data sets. But it is possible to lie with line graphs. Let’s take a look at how this happens. Ask yourself: how does the graph on the right tell a different story about malware infections? The first graph, on the left, is a pretty standard line chart that represents the data well. It fairly represents a slow but steady rise in malware infections. The second graph, on the right, tells a totally different story, but I didn’t change any of the underlying data. Looking at this graph, you would think we have a huge problem, a malware epidemic. What I did here is change the vertical axis scale. On the left chart, the scale runs from 0 to 600. On the right, I changed the lower bound to start at 400 instead of 0. This makes the changes seem much more dramatic. Always look at the scale that is being used. A good line graph starts at 0. If it doesn’t, ask yourself why: what does the change in scale do to the reader’s perception of the data?
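You can put a number on how much a truncated axis exaggerates a change by comparing plotted heights instead of raw values. A minimal sketch; the function name `apparent_growth` and the two infection counts (420 and 510) are hypothetical, since the slide’s exact data points aren’t given, and only the axis minimums of 0 and 400 come from the two charts:

```python
def apparent_growth(first, last, axis_min=0.0):
    """Ratio of plotted heights (last / first) when the axis starts at axis_min."""
    if first <= axis_min:
        raise ValueError("first value must sit above the axis minimum")
    return (last - axis_min) / (first - axis_min)


if __name__ == "__main__":
    first, last = 420, 510  # hypothetical infection counts
    honest = apparent_growth(first, last, axis_min=0)
    truncated = apparent_growth(first, last, axis_min=400)
    print(f"axis from 0:   last point looks {honest:.2f}x the first")
    print(f"axis from 400: last point looks {truncated:.2f}x the first")
```

With these made-up values the same 21% rise is drawn as a 5.5x jump once the axis starts at 400. The same check applies to bar charts.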
  24. Here’s a very misleading line graph variant, the filled line, that illustrates a few other ways to manipulate data. This is from a presentation Tim Cook gave, illustrating the number of iPhones the firm has sold. A couple of problems with this. First, what are we even looking at? In my example of what not to do, I manipulated the vertical scale. In this example, there is no vertical scale at all. It’s just a steep blue mountain that looks very good. They certainly have sold a lot of iPhones. The other problem is the use of cumulative sales, which is a little misleading. It leads the viewer to think this is how many iPhones are out there, but that isn’t the case. iPhones break, are thrown away, or are traded in. I think it’s fine to tell people the cumulative number of units sold, but you have to give context. The next picture does this.
  25. A blogger over at Quartz, qz.com, overlaid a bar chart on top of the cumulative line chart and added a vertical scale. He essentially “fixed” the graph for Tim Cook by adding scale and context. The story is different.
  26. So, bar graphs. You’re probably thinking: bar graphs too? What’s wrong with bar graphs? Bar graphs do not have some of the inherent problems that pie charts have, and this is why most of the time I choose bar charts to make comparisons when others might use a pie chart. Here are two bar graphs. The underlying data is identical. Does anyone know what I did here? It’s essentially the same thing I did with the line graph. On the right, I changed the scale to start at 200 instead of zero, making the increase in social engineering attempts seem much more drastic visually than in the more reasonable graphic on the left.
  27. Here’s another example. This is a vertical bar chart from TechRadar, comparing the performance of AMD versus Intel chips on a game called Total War. This graph would lead the reader to believe that the Intel i9 is nearly three times as fast as the AMD Threadripper! Look closely, however: it’s the scale problem again! The scale starts at 67.4. The actual performance difference between the Intel chip and the AMD chip is less than 2%.
  28. TechRadar received much criticism and derision over the graphic, and they have since fixed it. The scale now starts at 0. The difference between the two chips, from a human-eye, data viz point of view, is nearly indistinguishable.
  29. The semi-attached figure. This is one of my personal favorites because it’s so pervasive, but it’s also so hard to spot unless you are really looking for it. A semi-attached figure is when proof is given for some claim, but when the reader looks at it closely, the two things are not related. It’s called “semi-attached” because the statistic seems attached to a situation, but it isn’t. This is a hard concept to grasp without good examples. Marketing and advertising professionals are masters at the semi-attached figure. Absolute masters. Here are a few examples of things we’ve all seen.
  30. Does anyone remember the Certs commercials from a few years back? At the end of every commercial, the narrator said “Certs, with Retsyn.” What the hell is Retsyn? It sounds good. It sounds great! Sounds like it will make my stinky breath less stinky. That’s semi-attached. OK, you, the audience, need proof as to why you should buy Certs. Here’s the proof: Certs has Retsyn. Another example is when a marketing claim states “25% better.” 25% better than what? So, see how this is used? You have a statistic, such as 25%, being offered as proof for something, but the two items are not attached or relevant.
  31. Here’s a marketing claim: Unbreakable Linux. I remember clearly the first time I saw it. I was driving on the 101 in SF and I saw a huge billboard, and I just shook my head and didn’t really know what to do or say about this ridiculous claim. The claim refers to a product called Oracle Linux, which is based on Red Hat. This is classic semi-attached: the vendor makes a statement, such as “unbreakable,” leads the reader to associate the statement with a piece of software, and pretends it’s the same thing. Of course the software isn’t unbreakable, and it’s been subject to many of the same vulnerabilities Linux has had over the years. This reminds me so much of “now with Retsyn.” Unbreakable. Oh, OK, let’s buy a Linux distro that cannot be... what? Hacked? That cannot experience downtime? That can be patched without rebooting? Does this refer to high availability? Oracle still markets their distro as Unbreakable, but has backtracked quite a bit and admits this is a marketing tagline that does not refer to any specific feature or attribute.
  32. Let’s take a look at another example of the semi-attached figure. True story. I was sitting in a vendor sales pitch a while back, and the vendor put this graph up on the screen. This is the number of cybersecurity incidents reported by federal agencies from 2006 to 2015. Now, this is a fine graph, no problems with it. The vendor was selling next-generation firewall technology. As the people in the room fell silent at the stark reality on the chart before us, the vendor started their pitch: “Look at this graph. From 2006 to today, cyberattacks have increased over 10-fold! We’re at war. This is proof that we’re at cyberwar and you must protect yourself. The current equipment you have cannot protect your company from these types of unrelenting, sophisticated attacks.” The salesman went on and on and on. I love stuff like this. I love it when vendors build their pitch around a house of cards: one tap and it all falls apart. Does anyone see a semi-attached figure in all of this? Remember, a semi-attached figure is a situation in which a claim cannot be proven, so an unrelated statistic is given and the speaker pretends it’s the same thing. In this situation, the vendor was trying to lead us down a path to believe that the sky is falling. Maybe it is, maybe it isn’t. Some of the other talks I attended here at CCC would certainly lead me to believe that there is some doom on the horizon, but this graph has nothing to do with it. Let’s examine this situation and ask a few probing questions. OK, it would appear that cyberattacks have increased 10x since 2006. Why? Are there more computers in 2015 than in 2006? Are there more websites to attack? You are not hearing the full story here. What is the ratio of attack surface to attacks? Is detection of attacks better in 2015 than it was in 2006, meaning we have the ability to detect and measure a larger range of attacks? Most importantly, what are you measuring? What do you consider an attack? This graph is from 2015. Are you curious about what was reported in 2016?
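One of those probing questions, attacks relative to attack surface, takes only a few lines to check once you have a denominator. A minimal sketch; every number below is invented purely for illustration (none of it is the federal data on the slide), and `rate_per_thousand` is a hypothetical helper:

```python
def rate_per_thousand(incidents, hosts):
    """Incidents per 1,000 hosts: a raw count normalized by attack surface."""
    return 1000 * incidents / hosts


if __name__ == "__main__":
    # Hypothetical figures: incidents grow 10x while the host count grows 8x.
    early = rate_per_thousand(incidents=5_000, hosts=1_000_000)
    late = rate_per_thousand(incidents=50_000, hosts=8_000_000)
    print(f"raw incident count grew {50_000 / 5_000:.0f}x")
    print(f"incidents per 1,000 hosts grew {late / early:.2f}x")
```

With these made-up inputs a 10-fold rise in raw counts shrinks to a 1.25-fold rise in the rate. Real data may tell a different story, which is exactly why the denominator has to be on the slide.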
  33. I have good news: WE WON the CYBER! Not really. The federal government changed the definition of a cyber attack in 2016. They no longer consider a simple port scan an attack. They just changed what they were measuring; they changed the unit of measurement.
  34. There’s a related statistical bait and switch. I grabbed this from a Kaspersky antivirus infographic. Did you know that one in five small and medium-sized businesses that pay the ransom NEVER GOT THEIR DATA BACK! OH NO! This is called the framing effect. You frame statistics in emotional words or graphics to influence the reader. The word NEVER here is a weasel word. They’re taking an objective stat and injecting subtle opinion into it. The reader thinks: OMG, never!!! I’LL NEVER GET MY DATA BACK! What should I do?
  35. Let’s flip this around. This is the same base statistic, but I reworded it. 80%: that actually sounds like a lot of people get their data back, right? Now let me use the same type of graphic, the same base statistic, and a weasel word, just like Kaspersky, but flip it. (READ IT.) Wow, that sounds great, right? It should be no surprise to anyone that Kaspersky, a company that sells ransomware mitigation software, uses the framing effect.
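The two slides are the same number seen from opposite sides, and the arithmetic is just a complement. A minimal sketch using the one-in-five figure from the infographic; `both_framings` is an illustrative name, not anything from Kaspersky:

```python
from fractions import Fraction


def both_framings(lost):
    """Return the alarming and the reassuring wording of one statistic."""
    recovered = 1 - lost  # the complement is the same fact, reframed
    scary = f"{lost.numerator} in {lost.denominator} never got their data back"
    calm = f"{float(recovered):.0%} got their data back"
    return scary, calm


if __name__ == "__main__":
    for sentence in both_framings(Fraction(1, 5)):
        print(sentence)
```

When a statistic arrives wrapped in emotional wording, computing the complement is a quick way to hear the other half of the story.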
  36. This is called “correlation does not imply causation,” closely related to the post hoc fallacy. The explanation is simple enough: just because two sets of data correlate, it does not necessarily mean one causes the other. This fallacy, in practice, is called a “spurious correlation,” a term coined by statistician Karl Pearson. Let’s take a look at a few examples of spurious correlations.
  37. This is from the website of Tyler Vigen, who has analyzed hundreds of sets of data and found really amazing, weird correlations. Like this one here: the number of people who drowned by falling into a swimming pool correlates with the number of films Nicolas Cage has appeared in. Not security related, but worth checking out if you are interested in logical fallacies. It’s just another example of why you need to question everything. Let’s look at another example.
  38. I was in a room with a few auditors and other co-workers in the security department I worked in. The lead auditor was concerned, very concerned, about a rash of BYOD mobile devices being reported lost/stolen to our security operations center. The auditor was positive he knew the reason why we had such an uptick in this type of activity and took the liberty of creating a chart, very similar to this. I’ve omitted the company name and changed the data points to protect the guilty. The auditor looked grimly at me and said, “The security awareness training you rolled out, it’s causing this. You are teaching people what confidential information is and why it needs to be protected. You are telling people the value of sensitive company data and they are stealing it!” His reasoning was that before the awareness training, people were blissfully ignorant, and now they were smart about the kinds of very valuable data they had on their phones, so they were basically stealing their own phones to exfiltrate data. This combo chart certainly supports his hypothesis, doesn’t it? There is a direct correlation between the rollout of security awareness training and lost/stolen mobile devices. Makes sense? Or does it? Is there a real correlation, or is this a spurious correlation? Post hoc fallacies occur when people are not careful enough when they reason. Data is not an end in itself; data is used to construct and test a hypothesis. Even just a casual investigation is enough to avoid committing this fallacy a majority of the time. In conclusion, of course security awareness training didn’t cause people to steal their own phones. I performed an investigation and called some of the people who reported lost devices to the SOC and asked a few questions about the circumstances around the incident. To make a long story short, there was a relationship between the two data points, but exactly the opposite of what the lead auditor alleged. People were simply not aware of the requirement to report lost/stolen BYOD devices to security, and the training educated them about it. The uptick we see here isn’t an increase in thefts caused by the training; it’s an increase in reporting, which is an effect of the training.
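The auditor’s chart would have scored very well on a correlation test, which is the whole problem. A minimal sketch of Pearson’s correlation coefficient in plain Python; the two data series are hypothetical stand-ins for the redacted chart, and `pearson` is written out by hand here rather than taken from any library:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical monthly figures, invented for illustration only.
    staff_trained = [0, 50, 120, 200, 310, 400]
    lost_device_reports = [2, 3, 5, 8, 11, 14]
    r = pearson(staff_trained, lost_device_reports)
    print(f"correlation between training and lost-device reports: r = {r:.3f}")
```

A coefficient close to 1 says the two series move together. It says nothing about whether training causes theft, causes reporting, or whether both follow something else, so the phone calls to the people who filed the reports are still required.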
  39. VI. Conclusion. Let’s wrap this up with a few tips and resources. Darrell Huff called the manipulation of statistics “statisticulation.” Now that you hopefully have some additional tools in your toolbox, I want to encourage every one of you to assume good intentions. Most people are not out there to deceive other people; they just don’t know or don’t understand some of the underlying concepts of statistics, logical fallacies, or data visualization. Next, always look at the source of the data and check it out yourself, if possible. If there is no source or methodology given for the data, do not trust it. Finally, do not believe everything you see or read just because it seems sciency or has a lot of data. Stephen Colbert has a word for this: truthiness, a “truth” that is so because it feels right, without regard to evidence, facts, or logic.
  40. Thank you everyone for coming today. Here are a few additional resources you can read.