Machine Learning: Addressing the Disillusionment to Bring Actual Business Benefit
Jon Mead
Technical Services Director, North America
Egress Software Technologies, Inc.
June 14, 2019
About the Speaker
Jon Mead
Technical Services Director North America, Egress Software
An experienced technical engineer, Jon has worked across corporate
and government organizations to effectively deploy and manage
SaaS technologies in complex environments. As Technical Services
Director for North America, Jon provides expert technical support
and guidance to Egress clients as they achieve key compliance and
business objectives. Working closely with strategic personnel at
Egress, Jon plays an integral part in the development and delivery of
the company’s innovative data security platform that empowers
users to send, receive and manage information without risk.
A leader in intelligent, user-centric data security
• A decade of success in sophisticated defense, government and private sector data privacy.
• Identify, Classify, Secure, Control, Monitor, Audit & Report
• 2000+ Enterprise customers across industry: Banking and insurance, Government, Healthcare, Non-profit, Professional services, Industry regulators, Utilities
• US Headquarters in Boston, MA.
• Vetted, certified products and services (NIST, ISO, NATO, Common Criteria)
About Egress
Machine Learning: Where do we begin?
» Define the real-world problem
• Is there a problem to solve?
• Can we solve the problem?
• Should we solve the problem?
• What data do we need to solve this problem?
• Can/Should we use Machine Learning?
The rise of mistake-driven breaches – 2018 Verizon Data Breach Investigations Report*
*53,308 security incidents, 2,216 data breaches, 65 countries, 67 contributors. https://www.verizonenterprise.com/verizon-insights-lab/dbir/
Machine Learning: An Example
» Define the real-world problem
✓ Is there a problem to solve?
✓ Can we solve the problem?
✓ Should we solve the problem?
✓ What data do we need to solve this problem?
✓ Can/Should we use Machine Learning?
Business Problem:
• How does an organization handle real-world risks to data as it travels over untrusted networks to potentially untrusted recipients?
• Can an organization account for human error and/or malicious behavior with that data?
• Ultimately, how can an organization avoid data breaches and demonstrate compliance with rigorous data protection regulations, such as CCPA, in the real world?
Egress: A Problem worth Machine Learning?
Machine Learning Process
• Define the objective of the Problem Statement
• Data Gathering
• Data Preparation
• Exploratory Data Analysis
• Building a Machine Learning Model
• Model Evaluation & Optimization
• Predictions
Business Machine Learning Process
• Define the Business Objective (Problem)
• Source the appropriate data
• Split the data in a meaningful way
• Select the evaluation metric(s)
• Define all features that may be created from the data
• Train the model
• Feature selection
• Production system
• Feed the model
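A minimal sketch of three of the steps above (split the data, select an evaluation metric, train the model), assuming a toy data set and a one-feature threshold model. The data, the `train_threshold` helper, and the choice of precision and recall as metrics are illustrative assumptions, not taken from the talk.

```python
import random

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle once with a fixed seed, then hold out a test set."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def precision_recall(labels, preds):
    """Precision and recall for boolean labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    fn = sum(1 for y, p in zip(labels, preds) if y and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def train_threshold(train):
    """Pick the threshold on the single feature that maximises training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted({x for x, _ in train}):
        acc = sum((x >= t) == y for x, y in train) / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Hypothetical toy data: (feature value, label); the label is True when value >= 50.
data = [(x, x >= 50) for x in range(100)]
train, test = split(data)
threshold = train_threshold(train)
preds = [x >= threshold for x, _ in test]
precision, recall = precision_recall([y for _, y in test], preds)
print(f"threshold={threshold} precision={precision:.2f} recall={recall:.2f}")
```

The metric is chosen before training so that the held-out numbers, not the training accuracy, decide whether the model is good enough.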
[Figure: process flow. Define the Business Objective → Source the appropriate data → Split the data → Select the evaluation metrics → Define all features → Train the model → Feature Selection → Create Production Version]
• Machine Learning in practice
• Machine Learning in production
• Common Pitfalls when deploying from practice to production?
• How are these pitfalls defined?
Deploying Machine Learning: Common Pitfalls?
• Sampling Bias
• Data Leakage
• Unknown Unknowns
• Scaling and Normalization
• Impact of Outliers
• Fitting Data
• Overfitting the Model
• Social Engineering
Deploying Machine Learning: Common Pitfalls?
Deploying Machine Learning: Sampling Bias
Symptom-Based Sampling
Truncate Selection
Caveman Effect
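As a hedged illustration of one of these, the sketch below simulates truncated selection: only records above a cutoff ever reach the training sample, so the sample mean overstates the population mean. The population, the cutoff of 60, and the variable names are invented for illustration.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 10,000 values drawn from a normal distribution.
population = [rng.gauss(50, 15) for _ in range(10_000)]

# Truncated selection: values below the cutoff are never observed.
CUTOFF = 60
truncated_sample = [v for v in population if v >= CUTOFF]

# A simple random sample of the same size, for comparison.
random_sample = rng.sample(population, len(truncated_sample))

population_mean = statistics.mean(population)
truncated_mean = statistics.mean(truncated_sample)
random_mean = statistics.mean(random_sample)

print(f"population mean: {population_mean:.1f}")
print(f"truncated mean:  {truncated_mean:.1f}")
print(f"random mean:     {random_mean:.1f}")
```

A model trained on the truncated sample would never see the lower part of the distribution, however large the sample is.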
• Use Tags and Labels to organize structured data
• Unstructured Data – How do we organize?
• How can we prevent data leakage in our machine learning model?
Deploying Machine Learning: Data Leakage
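One common guard against leakage is to split chronologically and derive any statistics from the training portion only. The sketch below assumes timestamped records and a hypothetical `cutoff` date; it illustrates the general idea and is not the approach used in the Egress product.

```python
from datetime import date
import statistics

# Hypothetical records: (event date, feature value).
records = [
    (date(2019, 1, 5), 10.0),
    (date(2019, 2, 9), 12.0),
    (date(2019, 3, 14), 11.0),
    (date(2019, 4, 2), 30.0),
    (date(2019, 5, 21), 28.0),
]

def chronological_split(rows, cutoff):
    """Train on records before the cutoff, test on records from the cutoff onward."""
    train = [r for r in rows if r[0] < cutoff]
    test = [r for r in rows if r[0] >= cutoff]
    return train, test

train, test = chronological_split(records, cutoff=date(2019, 4, 1))

# Fit the scaling statistics on the training data only.
train_values = [v for _, v in train]
mean = statistics.mean(train_values)
stdev = statistics.stdev(train_values)

def scale(value):
    """Standardise a value using training-set statistics only."""
    return (value - mean) / stdev

scaled_test = [scale(v) for _, v in test]

# Leaky alternative (avoid): statistics computed over train and test together.
leaky_mean = statistics.mean(v for _, v in records)
print(mean, leaky_mean)
```

The leaky mean already "knows" about the later, larger values, which is exactly the information a production model would not have at prediction time.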
• What are the unknown unknowns in Machine Learning?
• Why are unknown unknowns a problem for Machine Learning?
• How can we address unknown unknowns in our machine learning model?
Deploying Machine Learning: Unknown Unknowns
• What is the impact of Scaling in Machine Learning and how can it hurt our model?
• What is normalization and why should we consider it when working with Machine Learning?
Deploying Machine Learning: Scaling and Normalization
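To make the scaling question concrete, the sketch below shows two common rescaling schemes, min-max normalization and z-score standardization, and how an unscaled large-range feature can dominate a Euclidean distance. The feature names and values are made up for illustration.

```python
import math
import statistics

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Centre values on 0 with unit (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical features on very different scales.
attachment_bytes = [1_000, 250_000, 5_000_000]
recipient_count = [1, 2, 40]

def distance(i, j, columns):
    """Euclidean distance between rows i and j across the given feature columns."""
    return math.sqrt(sum((col[i] - col[j]) ** 2 for col in columns))

raw = distance(0, 1, [attachment_bytes, recipient_count])
scaled = distance(0, 1, [min_max(attachment_bytes), min_max(recipient_count)])
print(raw, scaled)
```

In the raw distance the byte count contributes almost everything; after rescaling, both features contribute on a comparable footing.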
Deploying Machine Learning: Outliers
Univariate Method
Multivariate Method
Minkowski Error
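Two of these can be sketched briefly. The first function flags univariate outliers with the interquartile-range rule; the second compares mean squared error with a Minkowski error that raises each residual to a power below 2 (1.5 here), which dampens the influence of a single extreme value. The multipliers and the sample data are conventional or illustrative choices, not values from the talk.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Univariate method: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - k * spread or v > q3 + k * spread]

def mean_squared_error(targets, predictions):
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def minkowski_error(targets, predictions, power=1.5):
    """Mean of |residual| ** power; power < 2 penalises outliers less than MSE."""
    return sum(abs(t - p) ** power for t, p in zip(targets, predictions)) / len(targets)

values = [10, 11, 12, 11, 10, 12, 11, 95]   # 95 is an obvious outlier
targets = [1.0, 2.0, 3.0, 100.0]            # the last target is an outlier
predictions = [1.5, 2.5, 3.5, 4.0]

print(iqr_outliers(values))
print(mean_squared_error(targets, predictions))
print(minkowski_error(targets, predictions))
```

The multivariate method is not shown; it typically relies on fitting a model and inspecting the largest residuals.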
Deploying Machine Learning: Select the Fitting Data
• Without enough data, organizations are at risk of overfitting the machine learning model
• Using all the data in the world does not mean that the developed model is accurate, or even viable
• Complication is impressive, but simplicity is brilliance
Deploying Machine Learning: Overfitting
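The sketch below illustrates the point about simplicity with synthetic data: a 1-nearest-neighbour model memorises the training set (zero training error) yet typically does worse on held-out data than a plain least-squares line. The data-generating process (y = 2x + noise) and all names are assumptions made for the example.

```python
import random

rng = random.Random(7)

def make_data(n):
    """Noisy samples of the hypothetical relationship y = 2x + noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 1)) for x in xs]

train, test = make_data(200), make_data(200)

def fit_line(points):
    """Ordinary least squares for a single feature."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return lambda x: mean_y + slope * (x - mean_x)

def fit_nearest_neighbour(points):
    """Memorise the training set; predict the y of the closest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    """Mean squared error of a model over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

line, memoriser = fit_line(train), fit_nearest_neighbour(train)

print("train MSE  line:", mse(line, train), " 1-NN:", mse(memoriser, train))
print("test MSE   line:", mse(line, test), " 1-NN:", mse(memoriser, test))
```

A perfect training score is therefore a warning sign rather than a success criterion; the held-out error is the number to watch.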
• What is the impact of Social Engineering in Machine Learning?
• How can models defend against social engineering attacks?
Deploying Machine Learning: Social Engineering
Original Business Problem:
• How does an organization handle real-world risks to data as it travels over untrusted networks to potentially untrusted recipients?
• Can an organization account for human error and/or malicious behavior with that data?
• Ultimately, how can an organization avoid data breaches and demonstrate compliance with rigorous data protection regulations, such as CCPA, in the real world?
Egress: How did we employ Machine Learning?
» Apply protection and rights management
on-the-fly based on risk
» Protect against the accidental
sharing of data
» Auto-encrypt messages for
other Egress clients
» Increases user engagement
and adoption
Risk-Based Protection: What?
» Analyses previous email communications
to protect from accidental sends
» Calculates a risk score based on domain,
user behaviour and system info
» Applies protection based on
sensitivity of data and risk score
» Uses any email protection,
including TLS, O365, Voltage, etc.
Risk-Based Protection: How?
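A purely illustrative sketch of how a score of this kind could be combined from the three inputs and mapped to a protection level. The weights, thresholds, field names, and protection labels are all hypothetical; the talk does not disclose the actual scoring model.

```python
from dataclasses import dataclass

@dataclass
class EmailContext:
    domain_risk: float      # 0.0 (trusted domain) to 1.0 (unknown or risky domain)
    behaviour_risk: float   # 0.0 (typical for this sender) to 1.0 (highly unusual)
    system_risk: float      # 0.0 (managed device, known network) to 1.0
    sensitive: bool         # whether the content was classified as sensitive

# Hypothetical weights; a real system would learn or tune these.
WEIGHTS = {"domain": 0.5, "behaviour": 0.3, "system": 0.2}

def risk_score(ctx: EmailContext) -> float:
    """Weighted average of the three risk signals, in the range [0, 1]."""
    return (WEIGHTS["domain"] * ctx.domain_risk
            + WEIGHTS["behaviour"] * ctx.behaviour_risk
            + WEIGHTS["system"] * ctx.system_risk)

def choose_protection(ctx: EmailContext) -> str:
    """Map sensitivity and risk score to a protection level (labels are made up)."""
    score = risk_score(ctx)
    if ctx.sensitive and score >= 0.5:
        return "message-level encryption"
    if ctx.sensitive or score >= 0.5:
        return "transport encryption (TLS)"
    return "no additional protection"

routine = EmailContext(domain_risk=0.1, behaviour_risk=0.1, system_risk=0.0, sensitive=False)
risky = EmailContext(domain_risk=0.9, behaviour_risk=0.8, system_risk=0.2, sensitive=True)

print(choose_protection(routine))
print(choose_protection(risky))
```

Keeping the scoring and the policy mapping in separate functions lets the policy change (for example, per regulation) without retraining or re-weighting the score.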
• Use historical behavior to detect anomalies
• Parallel processing and cloud AI enable “cognitive” processing of vast quantities of collected data
• “Graph” databases: Link relationships and past behaviour to quickly detect anomalies and pattern changes
• Outcomes change with learning, time, and data
• Analysis of user “cliques” (groups) to detect and prevent accidents
A New Way: Machine Learning to Detect Errors
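The idea of comparing a new message against historical behaviour can be sketched with simple co-occurrence counts: recipients who have never appeared alongside the others on past messages are flagged for confirmation. This is a toy stand-in, assuming an in-memory history and a hypothetical `min_seen` threshold; it is not the graph-database implementation the slide describes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical history: the recipient set of each past email from one sender.
history = [
    {"ann@corp.example", "bob@corp.example"},
    {"ann@corp.example", "bob@corp.example", "cy@corp.example"},
    {"bob@corp.example", "cy@corp.example"},
    {"dee@client.example"},
]

# Count how often each pair of recipients appeared on the same message.
pair_counts = Counter()
for recipients in history:
    for pair in combinations(sorted(recipients), 2):
        pair_counts[pair] += 1

def unusual_recipients(recipients, min_seen=1):
    """Return recipients who co-occurred with none of the others at least min_seen times."""
    flagged = []
    for person in sorted(recipients):
        others = recipients - {person}
        familiar = any(
            pair_counts[tuple(sorted((person, other)))] >= min_seen
            for other in others
        )
        if others and not familiar:
            flagged.append(person)
    return flagged

# dee@client.example has never been emailed together with ann or bob.
draft = {"ann@corp.example", "bob@corp.example", "dee@client.example"}
print(unusual_recipients(draft))
```

Because the counts are rebuilt from history, the outcome for the same draft changes as new messages are sent, which matches the slide's point that outcomes change with learning, time, and data.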
• Data Leakage
• Scaling with Machine Learning
• Selecting Appropriate Fitting Data
• Social Engineering
That’s great… but what about all those pitfalls?
• Data Leakage
• Identified left-out data
• Unsupervised Probabilistic Machine Learning
• Historical Behavior with real-time comparison
Egress Data Leakage Resolution
• Scaling with Machine Learning: Serverless Technologies
• What is Serverless?
• Why use serverless?
• Benefits from the serverless architecture in practice with Machine Learning
Egress Addresses Scaling with Machine Learning
• Fitting Data Problem
• Data Selection and Testing application
• Build several models to develop the Golden Model
• Run parallel models in fitting and in production
• Feed the Machine
Egress Selection of Appropriate Fitting Data
• Organizational Domain Relationship Model
• Behavior-Based Risk Assessment: Why did we use a problematic approach?
• How did we mitigate the behavior-based risk assessment model – Eager Update and User-Models
Egress Defending against Social Engineering / Malicious Data Manipulation
Future: What does this mean for our Clients
[Figure: timeline, 2017 to 2020(?), of Data Privacy and Data Security regulation. Items shown: NYDFS 23 NYCRR 500* (Feb 2018: Phase 2; transition ends, full compliance; Sept 2018: Phase 3), GDPR, CA AB375, NAIC Model, SC H4655, Colorado (3 CCR 704-1), VT (4:4 Vt Code R. 8:8-4), CO House Bill 18-1128, US state amended laws]
Thank you!
Talk to us at the Egress stand.
E: info@egress.com
T: 1-800-732-0746
W: www.egress.com
Twitter: @EgressSoftware
"Despite what most SaaS companies are saying, Machine Learning requires time and
preparation. Whenever you hear the term AI, you must think about the data behind it." -
Alexandre Gonfalonieri, February 2019
Appendix
Sources:
• https://www.neuraldesigner.com/blog/3_methods_to_deal_with_outliers
• https://cds.nyu.edu/unknown-unknowns-machine-learning/
• https://elitedatascience.com/model-training
• https://towardsdatascience.com/machine-learning-general-process-8f1b510bd8af
• https://towardsdatascience.com/how-to-build-a-data-set-for-your-machine-learning-project-5b3b871881ac
• https://www.kdnuggets.com/2017/08/understanding-overfitting-meme-supervised-learning.html
• https://towardsdatascience.com/identifying-and-correcting-label-bias-in-machine-learning-ed177d30349e
• https://thenextweb.com/contributors/2018/10/27/4-human-caused-biases-machine-learning/
• https://www.datanami.com/2018/07/18/three-ways-biased-data-can-ruin-your-ml-models/
• https://en.wikipedia.org/wiki/Sampling_bias
• https://towardsdatascience.com/security-and-privacy-considerations-in-artificial-intelligence-machine-learning-part-5-when-6d6d9f457734
• https://machinelearningmastery.com/data-leakage-machine-learning/
• https://imarticus.org/what-is-machine-learning-and-does-it-matter/
• “Identifying and Correcting Label Bias in Machine Learning”, Heinrich Jiang and Ofir Nachum, 15 Jan 2019
• https://www.kdnuggets.com/2018/12/essence-machine-learning.html

Weitere ähnliche Inhalte

Was ist angesagt?

BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning
BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning
BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning Big Data Week
 
International Technology Adoption & Workforce Issues Study - Middle East Summary
International Technology Adoption & Workforce Issues Study - Middle East SummaryInternational Technology Adoption & Workforce Issues Study - Middle East Summary
International Technology Adoption & Workforce Issues Study - Middle East SummaryCompTIA
 
information-systems-management-packet
information-systems-management-packetinformation-systems-management-packet
information-systems-management-packetDion Walker
 
Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...
Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...
Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...Shift Conference
 
Certus Accelerate - Why You Need to Invest in Your Data by Vincent McBurney
Certus Accelerate - Why You Need to Invest in Your Data by Vincent McBurneyCertus Accelerate - Why You Need to Invest in Your Data by Vincent McBurney
Certus Accelerate - Why You Need to Invest in Your Data by Vincent McBurneyCertus Solutions
 
International Technology Adoption & Workforce Issues Study - Thailand Summary
International Technology Adoption & Workforce Issues Study - Thailand SummaryInternational Technology Adoption & Workforce Issues Study - Thailand Summary
International Technology Adoption & Workforce Issues Study - Thailand SummaryCompTIA
 
Brandon miller final sls
Brandon miller final slsBrandon miller final sls
Brandon miller final slsemeraldboy247
 
CompTIA's 5 Trends Shaping the Tech-Driven Workforce
CompTIA's 5 Trends Shaping the Tech-Driven WorkforceCompTIA's 5 Trends Shaping the Tech-Driven Workforce
CompTIA's 5 Trends Shaping the Tech-Driven WorkforceCompTIA
 
The 2018 Enterprise Cloud Trends Report
The 2018 Enterprise Cloud Trends ReportThe 2018 Enterprise Cloud Trends Report
The 2018 Enterprise Cloud Trends ReportibossCyber
 
2013-ISC2-Global-Information-Security-Workforce-Study
2013-ISC2-Global-Information-Security-Workforce-Study2013-ISC2-Global-Information-Security-Workforce-Study
2013-ISC2-Global-Information-Security-Workforce-StudyTam Nguyen
 
Prof m01-2013 global information security workforce study - final
Prof m01-2013 global information security workforce study - finalProf m01-2013 global information security workforce study - final
Prof m01-2013 global information security workforce study - finalSelectedPresentations
 
2014 Secure Mobility Survey Report
2014 Secure Mobility Survey Report2014 Secure Mobility Survey Report
2014 Secure Mobility Survey ReportDImension Data
 
IFS Says Excel Runs Production
IFS Says Excel Runs ProductionIFS Says Excel Runs Production
IFS Says Excel Runs Productioncharlesrathmann
 
Data Protection Maturity Survey Results 2013
Data Protection Maturity Survey Results 2013 Data Protection Maturity Survey Results 2013
Data Protection Maturity Survey Results 2013 - Mark - Fullbright
 
Algorithmic Bias: Challenges and Opportunities for AI in Healthcare
Algorithmic Bias:  Challenges and Opportunities for AI in HealthcareAlgorithmic Bias:  Challenges and Opportunities for AI in Healthcare
Algorithmic Bias: Challenges and Opportunities for AI in HealthcareGregory Nelson
 
Impact of Artificial Intelligence/Machine Learning on Workforce Capability
Impact of Artificial Intelligence/Machine Learning on Workforce CapabilityImpact of Artificial Intelligence/Machine Learning on Workforce Capability
Impact of Artificial Intelligence/Machine Learning on Workforce CapabilityLearningCafe
 
Data Trends for 2019: Extracting Value from Data
Data Trends for 2019: Extracting Value from DataData Trends for 2019: Extracting Value from Data
Data Trends for 2019: Extracting Value from DataPrecisely
 
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next LevelDecision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next LevelLorien Pratt
 
ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...
ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...
ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...Kelvin Ross
 

Was ist angesagt? (20)

BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning
BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning
BDW16 London - Amjad Zaim, Cognitro Analytics: How Deep is Your Learning
 
International Technology Adoption & Workforce Issues Study - Middle East Summary
International Technology Adoption & Workforce Issues Study - Middle East SummaryInternational Technology Adoption & Workforce Issues Study - Middle East Summary
International Technology Adoption & Workforce Issues Study - Middle East Summary
 
information-systems-management-packet
information-systems-management-packetinformation-systems-management-packet
information-systems-management-packet
 
Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...
Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...
Shift AI 2020: Deep Learning in Intelligent Process Automation - Slater Victo...
 
Data science - An Introduction
Data science - An IntroductionData science - An Introduction
Data science - An Introduction
 
Certus Accelerate - Why You Need to Invest in Your Data by Vincent McBurney
Certus Accelerate - Why You Need to Invest in Your Data by Vincent McBurneyCertus Accelerate - Why You Need to Invest in Your Data by Vincent McBurney
Certus Accelerate - Why You Need to Invest in Your Data by Vincent McBurney
 
International Technology Adoption & Workforce Issues Study - Thailand Summary
International Technology Adoption & Workforce Issues Study - Thailand SummaryInternational Technology Adoption & Workforce Issues Study - Thailand Summary
International Technology Adoption & Workforce Issues Study - Thailand Summary
 
Brandon miller final sls
Brandon miller final slsBrandon miller final sls
Brandon miller final sls
 
CompTIA's 5 Trends Shaping the Tech-Driven Workforce
CompTIA's 5 Trends Shaping the Tech-Driven WorkforceCompTIA's 5 Trends Shaping the Tech-Driven Workforce
CompTIA's 5 Trends Shaping the Tech-Driven Workforce
 
The 2018 Enterprise Cloud Trends Report
The 2018 Enterprise Cloud Trends ReportThe 2018 Enterprise Cloud Trends Report
The 2018 Enterprise Cloud Trends Report
 
2013-ISC2-Global-Information-Security-Workforce-Study
2013-ISC2-Global-Information-Security-Workforce-Study2013-ISC2-Global-Information-Security-Workforce-Study
2013-ISC2-Global-Information-Security-Workforce-Study
 
Prof m01-2013 global information security workforce study - final
Prof m01-2013 global information security workforce study - finalProf m01-2013 global information security workforce study - final
Prof m01-2013 global information security workforce study - final
 
2014 Secure Mobility Survey Report
2014 Secure Mobility Survey Report2014 Secure Mobility Survey Report
2014 Secure Mobility Survey Report
 
IFS Says Excel Runs Production
IFS Says Excel Runs ProductionIFS Says Excel Runs Production
IFS Says Excel Runs Production
 
Data Protection Maturity Survey Results 2013
Data Protection Maturity Survey Results 2013 Data Protection Maturity Survey Results 2013
Data Protection Maturity Survey Results 2013
 
Algorithmic Bias: Challenges and Opportunities for AI in Healthcare
Algorithmic Bias:  Challenges and Opportunities for AI in HealthcareAlgorithmic Bias:  Challenges and Opportunities for AI in Healthcare
Algorithmic Bias: Challenges and Opportunities for AI in Healthcare
 
Impact of Artificial Intelligence/Machine Learning on Workforce Capability
Impact of Artificial Intelligence/Machine Learning on Workforce CapabilityImpact of Artificial Intelligence/Machine Learning on Workforce Capability
Impact of Artificial Intelligence/Machine Learning on Workforce Capability
 
Data Trends for 2019: Extracting Value from Data
Data Trends for 2019: Extracting Value from DataData Trends for 2019: Extracting Value from Data
Data Trends for 2019: Extracting Value from Data
 
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next LevelDecision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
 
ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...
ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...
ACS EMERGING & DEEP TECH WEBINAR: THE RISE OF AI AND DATA SCIENCE AND ITS IMP...
 

Ähnlich wie Machine Learning: Addressing the Disillusionment to Bring Actual Business Benefit

“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...Edge AI and Vision Alliance
 
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Cloudera, Inc.
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedKrishnaram Kenthapadi
 
Test-Driven Machine Learning
Test-Driven Machine LearningTest-Driven Machine Learning
Test-Driven Machine LearningC4Media
 
Team 20 Threat Attack AI Cyber Security Company Decision makin.docx
Team 20 Threat Attack AI Cyber Security Company Decision makin.docxTeam 20 Threat Attack AI Cyber Security Company Decision makin.docx
Team 20 Threat Attack AI Cyber Security Company Decision makin.docxerlindaw
 
Putting data science into perspective
Putting data science into perspectivePutting data science into perspective
Putting data science into perspectiveSravan Ankaraju
 
Responsible Machine Learning
Responsible Machine LearningResponsible Machine Learning
Responsible Machine LearningEng Teong Cheah
 
Designing for Data Security by Karen Lopez
Designing for Data Security by Karen LopezDesigning for Data Security by Karen Lopez
Designing for Data Security by Karen LopezKaren Lopez
 
Continuing Education Conferance
Continuing Education ConferanceContinuing Education Conferance
Continuing Education ConferanceTommy Riggins
 
Data Science - Part I - Sustaining Predictive Analytics Capabilities
Data Science - Part I - Sustaining Predictive Analytics CapabilitiesData Science - Part I - Sustaining Predictive Analytics Capabilities
Data Science - Part I - Sustaining Predictive Analytics CapabilitiesDerek Kane
 
what-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfwhat-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfTemok IT Services
 
Introduction To Data Science
Introduction To Data Science Introduction To Data Science
Introduction To Data Science PriyaMaurya52
 
Ai in insurance how to automate insurance claim processing with machine lear...
Ai in insurance  how to automate insurance claim processing with machine lear...Ai in insurance  how to automate insurance claim processing with machine lear...
Ai in insurance how to automate insurance claim processing with machine lear...Skyl.ai
 
Cloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine LearningCloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine LearningSatyaKVivek
 
The challenges of big data, how data capable is your business? DQM Group
The challenges of big data, how data capable is your business? DQM Group  The challenges of big data, how data capable is your business? DQM Group
The challenges of big data, how data capable is your business? DQM Group Internet World
 
Data Analytics in Azure Cloud
Data Analytics in Azure CloudData Analytics in Azure Cloud
Data Analytics in Azure CloudMicrosoft Canada
 

Ähnlich wie Machine Learning: Addressing the Disillusionment to Bring Actual Business Benefit (20)

“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
“Responsible AI: Tools and Frameworks for Developing AI Solutions,” a Present...
 
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning

 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons Learned
 
Test-Driven Machine Learning
Test-Driven Machine LearningTest-Driven Machine Learning
Test-Driven Machine Learning
 
Team 20 Threat Attack AI Cyber Security Company Decision makin.docx
Team 20 Threat Attack AI Cyber Security Company Decision makin.docxTeam 20 Threat Attack AI Cyber Security Company Decision makin.docx
Team 20 Threat Attack AI Cyber Security Company Decision makin.docx
 
Putting data science into perspective
Putting data science into perspectivePutting data science into perspective
Putting data science into perspective
 
MACHINE LEARNING – THE WHY, WHAT AND HOW
MACHINE LEARNING –  THE WHY, WHAT AND HOWMACHINE LEARNING –  THE WHY, WHAT AND HOW
MACHINE LEARNING – THE WHY, WHAT AND HOW
 
Responsible Machine Learning
Responsible Machine LearningResponsible Machine Learning
Responsible Machine Learning
 
Designing for Data Security by Karen Lopez
Designing for Data Security by Karen LopezDesigning for Data Security by Karen Lopez
Designing for Data Security by Karen Lopez
 
demo AI ML.pptx
demo AI ML.pptxdemo AI ML.pptx
demo AI ML.pptx
 
Continuing Education Conferance
Continuing Education ConferanceContinuing Education Conferance
Continuing Education Conferance
 
Data Science - Part I - Sustaining Predictive Analytics Capabilities
Data Science - Part I - Sustaining Predictive Analytics CapabilitiesData Science - Part I - Sustaining Predictive Analytics Capabilities
Data Science - Part I - Sustaining Predictive Analytics Capabilities
 
eMStream
eMStreameMStream
eMStream
 
what-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfwhat-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdf
 
Mohammed AL Madhani
Mohammed AL MadhaniMohammed AL Madhani
Mohammed AL Madhani
 
Introduction To Data Science
Introduction To Data Science Introduction To Data Science
Introduction To Data Science
 
Ai in insurance how to automate insurance claim processing with machine lear...
Ai in insurance  how to automate insurance claim processing with machine lear...Ai in insurance  how to automate insurance claim processing with machine lear...
Ai in insurance how to automate insurance claim processing with machine lear...
 
Cloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine LearningCloud-Based IoT Analytics and Machine Learning
Cloud-Based IoT Analytics and Machine Learning
 
The challenges of big data, how data capable is your business? DQM Group
The challenges of big data, how data capable is your business? DQM Group  The challenges of big data, how data capable is your business? DQM Group
The challenges of big data, how data capable is your business? DQM Group
 
Data Analytics in Azure Cloud
Data Analytics in Azure CloudData Analytics in Azure Cloud
Data Analytics in Azure Cloud
 

Kürzlich hochgeladen

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Machine Learning: Addressing the Disillusionment to Bring Actual Business Benefit

  • 1.
  • 2. Machine Learning: Addressing the Disillusionment to Bring Actual Business Benefit. Jon Mead, Technical Services Director, North America, Egress Software Technologies, Inc. June 14, 2019
  • 3. About the Speaker Jon Mead Technical Services Director North America, Egress Software An experienced technical engineer, Jon has worked across corporate and government organizations to effectively deploy and manage SaaS technologies in complex environments. As Technical Services Director for North America, Jon provides expert technical support and guidance to Egress clients as they achieve key compliance and business objectives. Working closely with strategic personnel at Egress, Jon plays an integral part in the development and delivery of the company’s innovative data security platform that empowers users to send, receive and manage information without risk.
  • 4. About Egress: A leader in intelligent, user-centric data security • A decade of success in sophisticated defense, government and private sector data privacy. • Identify, Classify, Secure, Control, Monitor, Audit & Report • 2000+ Enterprise customers across industry: Banking and insurance, Government, Healthcare, Non-profit, Professional services, Industry regulators, Utilities • US Headquarters in Boston, MA. • Vetted, certified products and services (NIST, ISO, NATO, Common Criteria)
  • 5.
  • 6. Machine Learning: Where do we begin? » Define the real-world problem • Is there a problem to solve? • Can we solve the problem? • Should we solve the problem? • What data do we need to solve this problem? • Can/Should we use Machine Learning?
  • 7. The rise of mistake-driven breaches – 2018 Verizon Data Breach Report* *53,308 security incidents, 2,216 data breaches, 65 countries, 67 contributors. https://www.verizonenterprise.com/verizon-insights-lab/dbir/
  • 8. Machine Learning: An Example » Define the real-world problem ☑ Is there a problem to solve? ☑ Can we solve the problem? ☑ Should we solve the problem? ☑ What data do we need to solve this problem? ☑ Can/Should we use Machine Learning?
  • 9. Egress: A Problem Worth Machine Learning? Business Problem: • How does an organization handle real-world risks to data as it travels over untrusted networks to potentially untrusted recipients? • Can an organization account for human error and/or malicious behavior with that data? • Ultimately, how can an organization avoid data breaches and demonstrate compliance with rigorous data protection regulations, such as CCPA, in the real world?
  • 10. Machine Learning Process • Define the objective of the Problem Statement • Data Gathering • Data Preparation • Exploratory Data Analysis • Building a Machine Learning Model • Model Evaluation & Optimization • Predictions
  • 11. Business Machine Learning Process • Define the Business Objective (Problem) • Source the appropriate data • Split the data in a meaningful way • Select the evaluation metric(s) • Define all features that may be created from the data • Train the model • Feature selection • Production system • Feed the model
  • 12. Define the Business Objective • Source the appropriate data • Split the data • Select the evaluation metrics • Define all features • Train the model • Feature Selection • Create Production Version
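The "split the data" and "select the evaluation metrics" steps can be sketched as follows. This is a minimal illustration, not the deck's own method: it assumes each record is a dict with a hypothetical `timestamp` key, that a time-ordered split is the "meaningful" one, and that precision and recall are the chosen metrics.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence[dict], train_fraction: float = 0.8) -> Tuple[List[dict], List[dict]]:
    """Sort by the (hypothetical) 'timestamp' key and cut once,
    so every test row is later than every training row."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Splitting by time rather than at random keeps future records out of training, which is one common way to make the offline evaluation resemble production use.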
  • 13. Deploying Machine Learning: Common Pitfalls? • Machine Learning in practice • Machine Learning in production • Common pitfalls when deploying from practice to production • How are these pitfalls defined?
  • 14. Deploying Machine Learning: Common Pitfalls? • Sampling Bias • Data Leakage • Unknown Unknowns • Scaling and Normalization • Impact of Outliers • Fitting Data • Overfitting the Model • Social Engineering
  • 15. Deploying Machine Learning: Sampling Bias • Symptom-Based Sampling • Truncate Selection • Caveman Effect
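As a small illustration of truncate selection, the sketch below compares the mean of a full population with the mean of a sample that only admits values above a cut-off. The population (the integers 1 to 100) and the cut-off of 50 are made-up values chosen to make the distortion easy to see.

```python
from statistics import mean

# Hypothetical population: any numeric attribute would do.
population = list(range(1, 101))


def truncated_sample(values, threshold):
    """Keep only values above the threshold, mimicking truncate selection."""
    return [v for v in values if v > threshold]


full_mean = mean(population)                           # 50.5
biased_mean = mean(truncated_sample(population, 50))   # 75.5
```

A model trained on the truncated sample would see a distribution centred well above the real one, without any individual record being wrong.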
  • 16. Deploying Machine Learning: Data Leakage • Use Tags and Labels to organize structured data • Unstructured Data – How do we organize? • How can we prevent data leakage in our machine learning model?
  • 17. Deploying Machine Learning: Unknown Unknowns • What are the unknown unknowns in Machine Learning? • Why are unknown unknowns a problem for Machine Learning? • How can we address unknown unknowns in our machine learning model?
  • 18. Deploying Machine Learning: Scaling and Normalization • What is the impact of Scaling in Machine Learning and how can it hurt our model? • What is normalization and why should we consider it when working with Machine Learning?
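A minimal sketch of two common rescaling schemes, min-max normalization and z-score standardization, is given below. The function names are illustrative. Both scalers are fitted on training values only and then reused on new data, which is one standard way to keep test-set statistics from leaking into the model.

```python
from statistics import mean, pstdev


def fit_min_max(train):
    """Return a function mapping the training range onto [0, 1]."""
    lo, hi = min(train), max(train)
    span = hi - lo

    def scale(x):
        return (x - lo) / span if span else 0.0

    return scale


def fit_standardizer(train):
    """Return a function giving z-scores relative to the training data."""
    mu, sigma = mean(train), pstdev(train)

    def standardize(x):
        return (x - mu) / sigma if sigma else 0.0

    return standardize


scale = fit_min_max([10, 20, 30])
print(scale(20))   # inside the training range
print(scale(40))   # outside the training range, so the result exceeds 1
```

Values seen in production can fall outside the range observed in training, so a scaled feature is not guaranteed to stay within [0, 1].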
  • 19. Deploying Machine Learning: Outliers • Univariate Method • Multivariate Method • Minkowski Error
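Two of these ideas can be sketched briefly. The first function is one common univariate rule (Tukey's box-plot fences), assumed here as a stand-in since the slide does not say which rule was used. The second is a Minkowski-style error: the mean absolute error raised to a power below 2, so that large residuals weigh less than under squared error. The exponent 1.5 is an assumed default.

```python
from statistics import quantiles


def univariate_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (box-plot fences)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def minkowski_error(targets, predictions, p=1.5):
    """Mean of |target - prediction| ** p; p=2 gives mean squared error."""
    return sum(abs(t - y) ** p for t, y in zip(targets, predictions)) / len(targets)


data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(univariate_outliers(data))   # [95]
```

With residuals of 1 and 4, the squared error contributions are 1 and 16, while the 1.5-power contributions are 1 and 8, so the larger residual dominates the loss less.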
  • 20. Deploying Machine Learning: Select the Fitting Data
  • 21. Deploying Machine Learning: Select the Fitting Data
  • 22. Deploying Machine Learning: Overfitting • Without enough data, organizations are at risk of overfitting the machine learning model • Using all the data in the world does not mean that the developed model is accurate, or even viable • Complication is impressive, but simplicity is brilliance
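The overfitting point can be illustrated with a toy experiment, sketched below under made-up assumptions: one feature, a true boundary at 0.5, and 30% of labels flipped at random. A 1-nearest-neighbour model memorises the training set, noise included, while a fixed one-threshold rule (given here rather than learned, for brevity) ignores the noise.

```python
import random


def make_data(n, noise, rng):
    """Points x in [0, 1) labelled x > 0.5, with a fraction of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data


def one_nn_predict(train, x):
    """Return the label of the closest training point (pure memorisation)."""
    return min(train, key=lambda row: abs(row[0] - x))[1]


def threshold_predict(x):
    """The simple model: a single fixed threshold."""
    return int(x > 0.5)


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_data(200, 0.3, rng)
test = make_data(200, 0.3, rng)

memoriser_train = accuracy(lambda x: one_nn_predict(train, x), train)
memoriser_test = accuracy(lambda x: one_nn_predict(train, x), test)
simple_test = accuracy(threshold_predict, test)
print(memoriser_train, memoriser_test, simple_test)
```

The memorising model scores perfectly on the data it has already seen, which says nothing about how it does on the held-out set; comparing the three printed numbers shows the gap.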
  • 23. Deploying Machine Learning: Social Engineering • What is the impact of Social Engineering in Machine Learning? • How can models defend against social engineering attacks?
  • 24.
  • 25. Egress: How did we employ Machine Learning? Original Business Problem: • How does an organization handle real-world risks to data as it travels over untrusted networks to potentially untrusted recipients? • Can an organization account for human error and/or malicious behavior with that data? • Ultimately, how can an organization avoid data breaches and demonstrate compliance with rigorous data protection regulations, such as CCPA, in the real world?
  • 26. Risk-Based Protection: What? » Apply protection and rights management on-the-fly based on risk » Protect against the accidental sharing of data » Auto-encrypt messages for other Egress clients » Increase user engagement and adoption
  • 27. Risk-Based Protection: How? » Analyses previous email communications to protect from accidental sends » Calculates a risk score based on domain, user behaviour and system info » Applies protection based on sensitivity of data and risk score » Uses any email protection, including TLS, O365, Voltage, etc.
  • 28. A New Way: Machine Learning to Detect Errors • Use historical behavior to detect anomalies • Parallel processing and cloud AI enable “cognitive” processing of vast quantities of collected data • “Graph” databases: Link relationships and past behaviour to quickly detect anomalies and pattern changes • Outcomes change with learning, time, and data • Analysis of user “cliques” (groups) to detect and prevent accidents
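The idea of using historical behavior to flag a likely accidental send can be sketched as below. This is a deliberately simple, hypothetical illustration and not Egress's algorithm: the class name, the scoring formula, and the example addresses are all assumptions. It scores a recipient as riskier the less often the sender has emailed them before.

```python
from collections import Counter, defaultdict


class RecipientHistory:
    """Hypothetical per-sender history of who has been emailed, and how often."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def record(self, sender, recipients):
        """Store one sent message in the history."""
        self._counts[sender].update(recipients)

    def risk(self, sender, recipient):
        """Score in (0, 1]: 1.0 for a never-seen recipient, lower with more history."""
        seen = self._counts[sender][recipient]
        return 1.0 / (1 + seen)


history = RecipientHistory()
for _ in range(3):
    history.record("alice@a.example", ["bob@b.example"])

print(history.risk("alice@a.example", "bob@b.example"))   # 0.25
print(history.risk("alice@a.example", "bob@c.example"))   # 1.0
```

A lookalike address that the sender has never used scores highest, which is the kind of anomaly a history-based check is meant to surface before the message leaves.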
  • 29. That’s great… but what about all those pitfalls? • Data Leakage • Scaling with Machine Learning • Selecting Appropriate Fitting Data • Social Engineering
  • 30. Egress Data Leakage Resolution • Data Leakage • Identified left-out data • Unsupervised Probabilistic Machine Learning • Historical Behavior with real-time comparison
  • 31. Egress Addresses Scaling with Machine Learning • Scaling with Machine Learning: Serverless Technologies • What is Serverless? • Why use serverless? • Benefits from the serverless architecture in practice with Machine Learning
  • 32. Egress Selection of Appropriate Fitting Data • Fitting Data Problem • Data Selection and Testing application • Build several models to develop the Golden Model • Run parallel models in fitting and in product • Feed the Machine
  • 33. Egress Defending against Social Engineering / Malicious Data Manipulation • Organizational Domain Relationship Model • Behavior-Based Risk Assessment: Why did we use a problematic approach? • How did we mitigate the behavior-based risk assessment model – Eager Update and User-Models
  • 34. Future: What does this mean for our Clients? [Timeline figure spanning 2017, 2018, 2019 and ?2020, with Data Privacy and Data Security tracks. Labels on the figure: NYDFS 23 NYCRR 500*; GDPR; CA AB375; Feb 2018; Phase 2; Transition ends. Full compliance; Sept 2018; Phase 3; NAIC Model; SC H4655; Colorado (3 CCR 704-1); VT 4:4 Vt Code R. 8:8-4; CO House Bill 18-1128; US state Amended Laws]
  • 35. Thank you! Talk to us at the Egress stand. E: info@egress.com T: 1-800-732-0746 W: www.egress.com Twitter: @EgressSoftware "Despite what most SaaS companies are saying, Machine Learning requires time and preparation. Whenever you hear the term AI, you must think about the data behind it." - Alexandre Gonfalonieri, February 2019
  • 36. Appendix E: info@egress.com T: 1-800-732-0746 W: www.egress.com Twitter: @EgressSoftware
  • 37. Sources: • https://www.neuraldesigner.com/blog/3_methods_to_deal_with_outliers • https://cds.nyu.edu/unknown-unknowns-machine-learning/ • https://elitedatascience.com/model-training • https://towardsdatascience.com/machine-learning-general-process-8f1b510bd8af • https://towardsdatascience.com/how-to-build-a-data-set-for-your-machine-learning-project-5b3b871881ac • https://www.kdnuggets.com/2017/08/understanding-overfitting-meme-supervised-learning.html • https://towardsdatascience.com/identifying-and-correcting-label-bias-in-machine-learning-ed177d30349e • https://thenextweb.com/contributors/2018/10/27/4-human-caused-biases-machine-learning/ • https://www.datanami.com/2018/07/18/three-ways-biased-data-can-ruin-your-ml-models/ • https://en.wikipedia.org/wiki/Sampling_bias • https://towardsdatascience.com/security-and-privacy-considerations-in-artificial-intelligence-machine-learning-part-5-when-6d6d9f457734 • https://machinelearningmastery.com/data-leakage-machine-learning/ • https://imarticus.org/what-is-machine-learning-and-does-it-matter/ • “Identifying and Correcting Label Bias in Machine Learning”, Heinrich Jiang and Ofir Nachum, 15 Jan 2019 • https://www.kdnuggets.com/2018/12/essence-machine-learning.html