SlideShare ist ein Scribd-Unternehmen logo
1 von 16
Henry Pak
Solutions Architect
Machine Learning for the Elastic Stack
Anomalies in your data could indicate trouble
1
Spiked 404 errors
Web attack
IT Operational Analytics Security Analytics Business Analytics
Unusual DNS activity
Data exfiltration
Rare log messages
Failing sensor
Operational Analytics
• Is my website seeing unusual traffic volume?
• Are bots or attackers visiting my website?
• Do I worry about the database errors in my logs?
Use Case
Security Analytics
• Has my system been compromised by malware?
• Could one of my users be an insider threat?
• Is there indication of data theft in my DNS logs?
Use Case
Telemetry / Sensors
 Is the unusual latency spike from a ISP outage?
 Which trucks in my fleet show unusual driving pattern?
 Does this rare event type indicate a failing sensor?
Use Case
5
Detecting (noteworthy) anomalies is hard!
• Data is complex, high dimensional, fast moving
• Human inspection is not practical
• Easy to miss things
Visual
inspection is
not practical
Where’s the anomaly?
6
Detecting (noteworthy) anomalies is hard!
• Defining “normal” via static thresholds is hard
• Rules don’t evolve with data / infrastructure
• Rules can be bypassed
Rule-based
alerts are
insufficient
What’s the right threshold ?
X-Pack solves this with automated anomaly detection
• Uses unsupervised machine learning techniques to
 Learn what’s “normal” by modeling historic behavior
 Detect anomalies when data falls outside expected bounds
7
X-Pack solves this with automated anomaly detection
• Unsupervised techniques - no manual training / input needed
• Evolves with the data - “online” model learns continuously
• Influencer detection - accelerates root cause identification
8
Detect anomalies of different types
• Time series - single / multiple
• Outliers in population (using entity profiling)
• Rare / unusual rates in “categories” of events
9
Anomalies in temporal pattern
• Single (univariate) time series
Example: Is there unusual traffic on website ?
10
Time
Metric
Anomalies in temporal pattern
• Multiple time series
 Multiple metrics
 Single metric split by a field;
• Each series modeled
independently
Example:
Is there unusual web activity
from any country?
11
Time
Metric
USAUKFranceChina
Outliers in population (using entity profiling)
• Create a profile for a “typical” entity (server, user, IP, etc.) in a population
• Detects entities (outlier) that deviate from the typical profile
Example:
• Which IP address is not like the others?
(indication of a bot / attacker)
12
Outliers in population (using entity profiling)
• Create a profile for a “typical” entity (server, user, IP, etc.) in a population
• Detects entities (outlier) that deviate from the typical profile
Example:
• Which IP address is not like the others?
(indication of a bot / attacker)
13
Unusual or rare events (via log categorization)
14
• Classify raw messages into groups based on similarity
• Models frequencies of each message category over time
• Spot anomalous in message groups
Example:
• Do my application logs contain unusual messages
DEMO
15

Weitere ähnliche Inhalte

Andere mochten auch

Andere mochten auch (20)

Bde presentatie bakker_bart_20170920
Bde presentatie bakker_bart_20170920Bde presentatie bakker_bart_20170920
Bde presentatie bakker_bart_20170920
 
Notilyze SAS
Notilyze SASNotilyze SAS
Notilyze SAS
 
Incident response on a shoestring budget
Incident response on a shoestring budgetIncident response on a shoestring budget
Incident response on a shoestring budget
 
Building Blocks Big Data Expo
Building Blocks Big Data ExpoBuilding Blocks Big Data Expo
Building Blocks Big Data Expo
 
De groote de man Ingrid de Poorter
De groote de man Ingrid de PoorterDe groote de man Ingrid de Poorter
De groote de man Ingrid de Poorter
 
Bde presentation dv
Bde presentation dvBde presentation dv
Bde presentation dv
 
If-If-If-If
If-If-If-IfIf-If-If-If
If-If-If-If
 
Crossyn
CrossynCrossyn
Crossyn
 
What should I do when my website got hack?
What should I do when my website got hack?What should I do when my website got hack?
What should I do when my website got hack?
 
Dell hans timmerman v1.1
Dell hans timmerman v1.1Dell hans timmerman v1.1
Dell hans timmerman v1.1
 
Presentatie big data expo swarovski
Presentatie big data expo swarovskiPresentatie big data expo swarovski
Presentatie big data expo swarovski
 
Java start01 in 2hours
Java start01 in 2hoursJava start01 in 2hours
Java start01 in 2hours
 
Datasnap web client
Datasnap web clientDatasnap web client
Datasnap web client
 
NUS-ISS Learning Day 2016 - Big Data Analytics
NUS-ISS Learning Day 2016 - Big Data AnalyticsNUS-ISS Learning Day 2016 - Big Data Analytics
NUS-ISS Learning Day 2016 - Big Data Analytics
 
Polar Bears Mario
Polar Bears MarioPolar Bears Mario
Polar Bears Mario
 
Picnic Big Data Expo
Picnic Big Data ExpoPicnic Big Data Expo
Picnic Big Data Expo
 
Mainframe Customer Education Webcast: New Ironstream Facilities for Enhanced ...
Mainframe Customer Education Webcast: New Ironstream Facilities for Enhanced ...Mainframe Customer Education Webcast: New Ironstream Facilities for Enhanced ...
Mainframe Customer Education Webcast: New Ironstream Facilities for Enhanced ...
 
Travelbird
TravelbirdTravelbird
Travelbird
 
ProRail Laurens Koppenol & Paul van der Voort
ProRail Laurens Koppenol & Paul van der VoortProRail Laurens Koppenol & Paul van der Voort
ProRail Laurens Koppenol & Paul van der Voort
 
Elasticsearch 5.0 les nouveautés
Elasticsearch 5.0 les nouveautésElasticsearch 5.0 les nouveautés
Elasticsearch 5.0 les nouveautés
 

Ähnlich wie Anomaly Detection in Time-Series Data using the Elastic Stack by Henry Pak

Vest Forensics presentation owasp benelux days 2012 leuven
Vest Forensics presentation owasp benelux days 2012 leuvenVest Forensics presentation owasp benelux days 2012 leuven
Vest Forensics presentation owasp benelux days 2012 leuven
Marc Hullegie
 
Delivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationDelivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and Visualization
Raffael Marty
 

Ähnlich wie Anomaly Detection in Time-Series Data using the Elastic Stack by Henry Pak (20)

IT Operation Analytic for security- MiSSconf(sp1)
IT Operation Analytic for security- MiSSconf(sp1)IT Operation Analytic for security- MiSSconf(sp1)
IT Operation Analytic for security- MiSSconf(sp1)
 
Technical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvertTechnical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvert
 
CNIT 152 11 Analysis Methodology
CNIT 152 11 Analysis MethodologyCNIT 152 11 Analysis Methodology
CNIT 152 11 Analysis Methodology
 
Building a Next-Generation Security Operations Center (SOC)
Building a Next-Generation Security Operations Center (SOC)Building a Next-Generation Security Operations Center (SOC)
Building a Next-Generation Security Operations Center (SOC)
 
IDS for Security Analysts: How to Get Actionable Insights from your IDS
IDS for Security Analysts: How to Get Actionable Insights from your IDSIDS for Security Analysts: How to Get Actionable Insights from your IDS
IDS for Security Analysts: How to Get Actionable Insights from your IDS
 
11 Analysis Methodology
11 Analysis Methodology11 Analysis Methodology
11 Analysis Methodology
 
intrusion detection system (IDS)
intrusion detection system (IDS)intrusion detection system (IDS)
intrusion detection system (IDS)
 
Vest Forensics presentation owasp benelux days 2012 leuven
Vest Forensics presentation owasp benelux days 2012 leuvenVest Forensics presentation owasp benelux days 2012 leuven
Vest Forensics presentation owasp benelux days 2012 leuven
 
Nicola Pagni - Anomaly Detection in Elasticsearch
Nicola Pagni - Anomaly Detection in ElasticsearchNicola Pagni - Anomaly Detection in Elasticsearch
Nicola Pagni - Anomaly Detection in Elasticsearch
 
Memory forensics and incident response
Memory forensics and incident responseMemory forensics and incident response
Memory forensics and incident response
 
All your logs are belong to you!
All your logs are belong to you!All your logs are belong to you!
All your logs are belong to you!
 
All Your Security Events Are Belong to ... You!
All Your Security Events Are Belong to ... You!All Your Security Events Are Belong to ... You!
All Your Security Events Are Belong to ... You!
 
Delivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationDelivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and Visualization
 
Threat Hunting by Falgun Rathod - Cyber Octet Private Limited
Threat Hunting by Falgun Rathod - Cyber Octet Private LimitedThreat Hunting by Falgun Rathod - Cyber Octet Private Limited
Threat Hunting by Falgun Rathod - Cyber Octet Private Limited
 
Defcon 23 - damon small - beyond the scan
Defcon 23 - damon small - beyond the scanDefcon 23 - damon small - beyond the scan
Defcon 23 - damon small - beyond the scan
 
Splunk live! Customer Presentation – Prelert
Splunk live! Customer Presentation – PrelertSplunk live! Customer Presentation – Prelert
Splunk live! Customer Presentation – Prelert
 
Log Mining: Beyond Log Analysis
Log Mining: Beyond Log AnalysisLog Mining: Beyond Log Analysis
Log Mining: Beyond Log Analysis
 
SpiceWorks Webinar: Whose logs, what logs, why logs
SpiceWorks Webinar: Whose logs, what logs, why logs  SpiceWorks Webinar: Whose logs, what logs, why logs
SpiceWorks Webinar: Whose logs, what logs, why logs
 
CNIT 121: 11 Analysis Methodology
CNIT 121: 11 Analysis MethodologyCNIT 121: 11 Analysis Methodology
CNIT 121: 11 Analysis Methodology
 
Identify and Stop Insider Threats
Identify and Stop Insider ThreatsIdentify and Stop Insider Threats
Identify and Stop Insider Threats
 

Mehr von Data Con LA

Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA
 

Mehr von Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Anomaly Detection in Time-Series Data using the Elastic Stack by Henry Pak

  • 1. Henry Pak Solutions Architect Machine Learning for the Elastic Stack
  • 2. Anomalies in your data could indicate trouble 1 Spiked 404 errors Web attack IT Operational Analytics Security Analytics Business Analytics Unusual DNS activity Data exfiltration Rare log messages Failing sensor
  • 3. Operational Analytics • Is my website seeing unusual traffic volume? • Are bots or attackers visiting my website? • Do I worry about the database errors in my logs? Use Case
  • 4. Security Analytics • Has my system been compromised by malware? • Could one of my users be an insider threat? • Is there indication of data theft in my DNS logs? Use Case
  • 5. Telemetry / Sensors  Is the unusual latency spike from a ISP outage?  Which trucks in my fleet show unusual driving pattern?  Does this rare event type indicate a failing sensor? Use Case
  • 6. 5 Detecting (noteworthy) anomalies is hard! • Data is complex, high dimensional, fast moving • Human inspection is not practical • Easy to miss things Visual inspection is not practical Where’s the anomaly?
  • 7. 6 Detecting (noteworthy) anomalies is hard! • Defining “normal” via static thresholds is hard • Rules don’t evolve with data / infrastructure • Rules can be bypassed Rule-based alerts are insufficient What’s the right threshold ?
  • 8. X-Pack solves this with automated anomaly detection • Uses unsupervised machine learning techniques to  Learn what’s “normal” by modeling historic behavior  Detect anomalies when data falls outside expected bounds 7
  • 9. X-Pack solves this with automated anomaly detection • Unsupervised techniques - no manual training / input needed • Evolves with the data - “online” model learns continuously • Influencer detection - accelerates root cause identification 8
  • 10. Detect anomalies of different types • Time series - single / multiple • Outliers in population (using entity profiling) • Rare / unusual rates in “categories” of events 9
  • 11. Anomalies in temporal pattern • Single (univariate) time series Example: Is there unusual traffic on website ? 10 Time Metric
  • 12. Anomalies in temporal pattern • Multiple time series  Multiple metrics  Single metric split by a field; • Each series modeled independently Example: Is there unusual web activity from any country? 11 Time Metric USAUKFranceChina
  • 13. Outliers in population (using entity profiling) • Create a profile for a “typical” entity (server, user, IP, etc.) in a population • Detects entities (outlier) that deviate from the typical profile Example: • Which IP address is not like the others? (indication of a bot / attacker) 12
  • 14. Outliers in population (using entity profiling) • Create a profile for a “typical” entity (server, user, IP, etc.) in a population • Detects entities (outlier) that deviate from the typical profile Example: • Which IP address is not like the others? (indication of a bot / attacker) 13
  • 15. Unusual or rare events (via log categorization) 14 • Classify raw messages into groups based on similarity • Models frequencies of each message category over time • Spot anomalous in message groups Example: • Do my application logs contain unusual messages