Splunk AI & Machine Learning Roundtable 2019 - Zurich
1. © 2 0 1 9 S P L U N K I N C .
Splunk Artificial Intelligence &
Machine Learning Roundtable
Zurich, November 6, 2019
Philipp Drieger | Staff Machine Learning Architect
2. © 2 0 1 9 S P L U N K I N C .
During the course of this presentation, we may make forward-looking statements regarding future events or
the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC.
The forward-looking statements made in this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, this presentation may not contain current or accurate
information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change
at any time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in
the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2019 Splunk Inc. All rights reserved.
Forward-Looking Statements
3. © 2 0 1 9 S P L U N K I N C .
Agenda
1) Roundtable quick intros
2) Introduction to AI and ML Features in Splunk
3) Customer Use Cases
4) Live Demo of Machine Learning Toolkit, with examples:
Methods for Anomaly Detection
Predictive Analytics and Forecasting
Clustering
5) Custom Machine Learning, including:
Expansion with the ML-SPL API
Advanced Containerization
6) Panel and Q&A
7) Networking Lunch
4. © 2 0 1 9 S P L U N K I N C .
• | where _time @ Splunk > 4.5y
• Previous:
• +15y in research, software development, visual arts
• +3y SE across portfolio & domains in CEMEA & EE
• Specializations
• Anomaly Detection, Data Mining, NLP, Advanced
Analytics and Visualizations
• Applied Data Science, Machine Learning, Graph
Theory and Network Science
• GPU Computing, Deep Learning
• Role @ Splunk
• Staff Machine Learning Architect (Central EMEA)
• Author of DGA App for Splunk
• Author of MLTK Container for Splunk
• Author of Deep Learning Toolkit for Splunk
• Blog posts, conf talks, hackathons etc.
• Ensure Customer and Partner Success with ML
Philipp Drieger
5. © 2 0 1 9 S P L U N K I N C .
Intro
6. © 2 0 1 9 S P L U N K I N C .
Our World
Never Stops
Evolving.
New Ideas. New Devices. New Processes.
7. © 2 0 1 9 S P L U N K I N C .
* IDC, Data Age 2025: The Digitization of the World, November 2018
Every Company Has a
Universe of Real-time Data
Creating More Opportunities and
Threats than Ever Before
New Data
Streams &
Devices
New Apps &
App Logs
Financial
Account &
Operating
Systems
Database
Logs
Network
Logs
New
Technology
ATM
Sensor
Data
Transaction
Data
Proxy
Data
Firewall
Logs
8. © 2 0 1 9 S P L U N K I N C .
Turning
Real-time
Data Into
Action
is Hard
Data
Lakes
Master Data
Management
ETL
Point Data
Management
Solutions
Data
Silos
9. © 2 0 1 9 S P L U N K I N C .
IT
Security
IoT
Biz
Analytics
The
Data-to-Everything
Platform
10. © 2 0 1 9 S P L U N K I N C .
Any Structure
Any Source
Any Time Scale
ACT
INVESTIGATE
ANALYZE
MONITOR
IT
Security
IoT
Biz
Analytics
11. © 2 0 1 9 S P L U N K I N C .
Splunk: The Data-to-Everything Platform
Bring data to every question, decision and action
Cloud Monitoring
Application Lifecycle Analytics
Application Release Analytics
Container Monitoring
Infrastructure Monitoring
Advanced Threat Detection
Insider Threats
Incident Investigation and Forensics
SOC Automation
Compliance
Real-Time Monitoring and Diagnostics
ICS Security
Predictive Analytics
Facilities Management
Business Process Mining
Customer Experience Optimization
Incident Management
Digital Marketing Optimization
IT
Security
IoT
Biz Analytics
12. © 2 0 1 9 S P L U N K I N C .
Intro AI | ML | DL
13. © 2019 SPLUNK INC.
“Humans are good at Learning…
but we get lost in volume and detail.”
14. © 2 0 1 9 S P L U N K I N C .
AI, ML, DL
“A Function that maps features to
an output” = AI
“A Function that learns patterns
in your data without being
explicitly programmed” = ML
Types of ML
Supervised
Unsupervised
Reinforcement
Lots of opinions exist. Myths as well…
15. © 2 0 1 9 S P L U N K I N C .
What ML & AI are not
Machine Learning is not Magic
AI Buzz
Garbage Data = Useless Predictions
• Data Scientists spend 80% of their time
cleaning, munging and collecting data
• Throwing more data at an algorithm will
not result in solving all of your SOC
issues
• Machine Learning requires a solid
understanding of statistics and the
scientific method
ML & AI require you to understand the
fundamental business problem you want
to solve.
16. © 2 0 1 9 S P L U N K I N C .
What ML & AI are not
Machine Learning is not Magic
ML is not a replacement for
expert analysts, or engineers.
ML requires Subject Matter
Experts to enhance security &
IT operations.
Analysts are required to
provide feedback to the models
to adjust thresholding rules and
reduce false positives.
AI Buzz
17. © 2 0 1 9 S P L U N K I N C .
Problem: DGA domains are computer-generated
pseudo-random character strings used by
attackers. Blacklisting an infinite number
of domains is not feasible.
Hypothesis: “Are there patterns in
domain generation algorithms that can
be exploited to identify newly
generated domains as threats in real-
time?”
Example Domains:
Machine Learning & AI
What does the scientific method look like in the IT & Security Space?
http://87hfdredwertyfdvvlkgdrsadm.net/af/GHFbfsalku65
http://87hfdredwertyfdvvlkgdrsadm.net/af/sdgLKJvgh
http://wszystkodokuchni.pl/34f43
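One way to start testing the hypothesis above is to turn each domain name into a few lexical features that a classifier could later learn from. The following is a minimal sketch, not the method used in the DGA App for Splunk: the feature set (length, character entropy, vowel and digit ratios) and the function names are assumptions chosen for illustration.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(url: str) -> dict:
    """Extract simple lexical features from the host name of a URL."""
    host = urlparse(url).hostname or ""
    # Drop the top-level domain and keep the remaining label(s).
    name = host.rsplit(".", 1)[0]
    vowels = sum(ch in "aeiou" for ch in name)
    digits = sum(ch.isdigit() for ch in name)
    return {
        "domain": host,
        "length": len(name),
        "entropy": round(shannon_entropy(name), 3),
        "vowel_ratio": round(vowels / len(name), 3) if name else 0.0,
        "digit_ratio": round(digits / len(name), 3) if name else 0.0,
    }


if __name__ == "__main__":
    # Example domains taken from the slide.
    examples = [
        "http://87hfdredwertyfdvvlkgdrsadm.net/af/GHFbfsalku65",
        "http://wszystkodokuchni.pl/34f43",
    ]
    for url in examples:
        print(domain_features(url))
```

Features like these are only a starting point. Whether they separate generated from legitimate domains has to be validated on labeled data, which is the "scientific method" step the slide refers to.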
18. © 2018 SPLUNK INC.
Why Use Machine Learning? : MTTR
$ Impact
Predictive
Proactive
(add logs and metrics)
Effective
$ Impact
Existing
Events
NEGATIVE
MTTR!!
Predict 30 Minutes
in Advance
Time Return
to Business
Cost of
Impact
Reactively Alerted
MTTR
Automated Resolution
MTTR
MTTR
Splunk ML Alert
Basic Value prop of Splunk
One layer of ML, finding anomalies in real time + ^ Splunk
A 2nd Layer of ML +^ Anomalies +^ Splunk
19. © 2 0 1 9 S P L U N K I N C .
Machine
Learning Tour
20. © 2 0 1 9 S P L U N K I N C .
What Data Scientists Really Do
Data Preparation accounts for about 80% of the work of data scientists
“Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says”, Forbes Mar 23, 2016
21. © 2 0 1 9 S P L U N K I N C .
Splunk Customers Want Answers from their Data
Anomaly detection
► Deviation from past behavior
► Deviation from peers (aka Multivariate AD or Cohesive AD)
► Unusual change in features
Clustering
► Identify peer groups
► Event Correlation
► Reduce alert noise
► Behavioral Analytics
Predictive Analytics
► Predict Service Health Score/Churn
► Predicting Events
► Trend Forecasting
► Detecting influencing entities
► Early warning of failure
22. © 2 0 1 9 S P L U N K I N C .
Skill Areas for Machine Learning @ Splunk
Domain
Expertise
(IT, Security…)
Data
Science
Expertise
Splunk
Expertise
MLTK
Splunk ML Toolkit
facilitates and simplifies
via examples & guidance
Premium solutions provide out
of the box ML capabilities.
ITSI,
UBA
• Statistics/math background
• Algorithm selection
• Model building
• Identify use cases
• Drive decisions
• Understanding of business impact
• Searching
• Reporting
• Alerting
• Workflow
23. © 2 0 1 9 S P L U N K I N C .
Overview of Machine Learning at Splunk
CORE PLATFORM
SEARCH + Smarter
Splunk
PACKAGED PREMIUM
SOLUTIONS
MACHINE LEARNING
TOOLKIT
Platform for Operational Intelligence
24. © 2 0 1 9 S P L U N K I N C .
Machine Learning in ITSI
IT Service Intelligence
Adaptive Thresholds
Anomaly Detection
Cohesion Detection
Predictive Analytics
Clustered Notable
Events
Automated Actions
Assisted Deep Dive Investigation
Application logs
Network logs
Metrics
Server logs
Time Series
in Splunk
INTELLIGENCE
KPIs
MLTK Customization
Machine Learning
Machine Learning
25. © 2 0 1 9 S P L U N K I N C .
Finding Outliers
Adaptive Thresholding:
• Learn baselines & dynamic thresholds
• Alert & act on deviations
• Manage for 1000s of KPIs & entities
• Stdev/Avg, Quartile/Median, Range
Trending/Cohesive Anomaly Detection:
• Find “hiccups” in expected patterns
• Catches deviations beyond thresholds
• Advanced proprietary algorithms
IT Service Intelligence
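The three threshold families named above (Stdev/Avg, Quartile/Median, Range) can be illustrated with plain statistics over a KPI's history. This is a hedged sketch of the general idea only: the multipliers, the percentage-of-span definition of "range", and all function names are assumptions, and ITSI's own implementation may differ.

```python
import statistics


def stdev_thresholds(values, k=2.0):
    """Bounds at mean +/- k sample standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


def quartile_thresholds(values, k=1.5):
    """Bounds at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def range_thresholds(values, low=0.1, high=0.9):
    """Bounds at fixed fractions of the observed min..max span (assumed definition)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return lo + low * span, lo + high * span


def outliers(values, bounds):
    """Return the values that fall outside the (lower, upper) bounds."""
    lower, upper = bounds
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Hypothetical KPI history, e.g. response times learned as a baseline.
    history = [10, 12, 11, 13, 12, 11, 10, 12, 11, 13]
    new_values = [12, 40]
    for name, fn in [("stdev", stdev_thresholds), ("quartile", quartile_thresholds)]:
        bounds = fn(history)
        print(name, bounds, outliers(new_values, bounds))
```

Recomputing the bounds per time-of-day or day-of-week bucket is what makes such thresholds "adaptive". The sketch uses a single bucket for brevity.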
26. © 2 0 1 9 S P L U N K I N C .
Event Analytics
Prioritize event insights with
service context, logs & metrics
Group related events to highlight
the most meaningful ones
Reduce noise and alert on
root causes of issues
Use ML algorithms to group
similar events (Smart Mode)
IT Service Intelligence
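To make "group similar events" concrete, the sketch below groups event messages by token overlap (Jaccard similarity). This is a deliberately simple stand-in technique, not the algorithm behind ITSI Smart Mode, and the threshold and function names are assumptions.

```python
def tokens(message):
    """Lower-cased set of whitespace-separated tokens."""
    return set(message.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_events(messages, min_similarity=0.5):
    """Greedily assign each message to the first group whose representative
    (its first member) is similar enough, else start a new group."""
    groups = []  # each entry: {"tokens": set, "members": list}
    for msg in messages:
        t = tokens(msg)
        for group in groups:
            if jaccard(t, group["tokens"]) >= min_similarity:
                group["members"].append(msg)
                break
        else:
            groups.append({"tokens": t, "members": [msg]})
    return [g["members"] for g in groups]


if __name__ == "__main__":
    events = [
        "disk full on host web01",
        "disk full on host web02",
        "login failed for user alice",
        "login failed for user bob",
    ]
    for members in group_events(events):
        print(len(members), members)
```

Grouping like this reduces alert noise because an analyst sees one group per underlying issue instead of one alert per event.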
27. © 2 0 1 9 S P L U N K I N C .
Machine Learning in Splunk UBA
60+ ANOMALY
CLASSIFICATIONS
20+ THREAT
CLASSIFICATIONS
Machine
Learning
Suspicious Data
Movement
Unusual Machine
Access
Flight Risk User
Unusual Network
Activity
Machine Generated
Beacon
Lateral Movement
Suspicious Behavior
Compromised User
Account
Data Exfiltration
Malware Activity
Endpoint logs
Server logs
Identity logs
Machine
Learning
DATA
SOURCES
28. © 2 0 1 9 S P L U N K I N C .
Sophisticated Security Modeling in UBA
How does it look?
60+ Batch
Models
• 165+ Detections
• 60+ Anomaly Types
• IOCs
• Contextual
Intelligence
• Entity Scoring
Specialized Threat
Models
20+ Threat Types
Raw Events
15+
Streaming
Models
Aggregated
Events
Kill-chain
Analysis
Graph Analysis
Custom Threats
29. © 2 0 1 9 S P L U N K I N C .
Splunk Machine Learning Toolkit (MLTK)
Built for the Citizen Data Scientist
• Experiments and Assistants: Guided model building,
testing, and deployment for common objectives
• Algorithms: 80+ standard algorithms (supervised &
unsupervised)
Extensible to operationalize any use case
• Python for Scientific Computing Library:
Access to 300+ open source algorithms
• Deep Learning Toolkit : Supports NN and GPU
accelerated machine learning
• ML-SPL API: Import any open-source or proprietary
algorithm
Extends Splunk to operationalize Machine Learning
30. © 2 0 1 9 S P L U N K I N C .
Custom ML with the Splunk Platform
Visualize &
Share
Clean &
Munge
Operationalize
Monitor Alert
Search &
Explore
Collect
Data
Build, Test,
Improve Models
Ecosystem MLTK
Choose
Algorithm
Ecosystem
Splunk Splunk
Splunk
Splunk
MLTK
Splunk
Ecosystem
Splunk
Operationalized Data Science Pipeline
Ecosystem
MLTK
Splunk
Splunk’s App Ecosystem contains 1000s of free add-ons for getting data in,
applying structure and visualizing your data, giving you faster time to value.
The Machine Learning Toolkit delivers new SPL commands, custom
visualizations, assistants, and examples to explore a variety of ML concepts.
Splunk Enterprise is the mission-critical platform for indexing, searching,
analyzing, alerting and visualizing machine data.
Pre-processing
Feature Selection
MLTK
Splunk
MLTK
Splunk
Platform for Operational Intelligence
31. © 2 0 1 9 S P L U N K I N C .
Customer
Success
Stories
32. © 2 0 1 9 S P L U N K I N C .
Recent Customer Success Stories @ .conf19
Enhanced Anomaly
Detection: Join T-Mobile
and Splunk as we Deep
Dive an Enterprise-IT
Operational Use Case
Add value to your SIEM:
how Israel's Ministry of
Energy applies Machine
Learning to protect their
Critical Infrastructure and
OT Operations
Augment Your Security
Monitoring Use Cases
with Splunk's Machine
Learning Toolkit
T-Mobile (US)
Ministry of Energy,
State of Israel SIEMENS AG
Learn more at conf.splunk.com with 900+ presentations available online!
33. © 2 0 1 9 S P L U N K I N C .
1) Get help from the Splunk Data Scientists
to solve your business use case with
Machine Learning Toolkit
2) Complimentary support with your
Enterprise or Cloud license
3) Early access to new Machine Learning
features
4) Results in opportunity to tell your success
story with Splunk
5) Contact mlprogram@splunk.com for more
information or your Splunk account team
Splunk
Machine
Learning
Advisory
Program
34. © 2 0 1 9 S P L U N K I N C .
Splunk ML Advisory Customers
35. © 2 0 1 9 S P L U N K I N C .
What’s new in
MLTK 5.0
36. © 2019 SPLUNK INC.
Machine
Learning
Toolkit 5.0
New capabilities continue to
make machine learning easily
accessible by more users and
extensible with connectors
• Easier to navigate with a new, modern
showcase layout
• Smarter with the introduction of the
new Smart Outlier Detection
Assistant for anomaly detection
• Migration to Python 3
• Applicable to more use cases with the
Smart Forecasting Assistant with
Multivariate Forecasts and Special
Days Effects
37. © 2 0 1 9 S P L U N K I N C .
Deploying and
Applying ML
with Splunk
38. © 2 0 1 9 S P L U N K I N C .
Continuous Data Ingest at Scale
Develop
Visualize
Predict
Alert
Search
Engineers Data
Analysts
Security
Analysts
Business
Users
Native Inputs
TCP, UDP, Logs, Scripts, Wire, Mobile
Industrial Data
SCADA, AMI, Meter Reads
Modular Inputs
MQTT, AMQP, COAP, REST, JMS
HTTP Event Collector
Token Authenticated Events
Technology Partnerships
Kepware, AWS IoT, Cisco, Palo Alto
Maintenance
Info
Asset
Info
Data
Stores
External
Lookups/Enrichment
OT
Industrial Assets
IT
Consumer and
Mobile Devices
Real Time
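Of the ingest paths on this slide, the HTTP Event Collector is the easiest to show in code: a client posts JSON to the collector endpoint and authenticates with a token in the `Authorization` header. The sketch below only builds the request. The host name, port and token value are placeholders, and `build_hec_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="_json"):
    """Build (but do not send) a token-authenticated HEC request."""
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HEC expects the scheme name "Splunk" followed by the token.
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder endpoint and token; replace with values from your deployment.
    request = build_hec_request(
        "https://splunk.example.com:8088",
        "00000000-0000-0000-0000-000000000000",
        {"sensor": "pump-7", "temperature_c": 71.5},
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect and test without a running Splunk instance.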
39. © 2 0 1 9 S P L U N K I N C .
Every Search Can Use Machine Learning
Search
Third-Party
Applications
Smartphones
and Devices
Tickets
Email
Send an
email
File a
ticket
Send a text
Flash lights
Trigger
process flow
Alert
Real Time
OT
Industrial Assets
IT
Consumer and
Mobile Devices
40. © 2 0 1 9 S P L U N K I N C .
MLTK + Python for Scientific Computing
persisted model
Search
Real Time
Visualize
Alert
| fit <algorithm> y from x* into "model"
| apply "model"
…
Python for Scientific Computing
OT
Industrial Assets
IT
Consumer and
Mobile Devices
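The fit / apply pattern on this slide amounts to: train on search results, persist the model under a name, then load it in a later search to score new events. The sketch below mimics that flow in plain Python with a one-feature least-squares model saved as JSON. It is an illustration of the pattern only, not how MLTK stores models; the `fit` and `apply` functions, the file format, and the `predicted(...)` output field naming used here are assumptions for the example.

```python
import json
import os
import tempfile


def fit(rows, target, feature, model_path):
    """Fit target = slope * feature + intercept by least squares and persist it."""
    xs = [float(r[feature]) for r in rows]
    ys = [float(r[target]) for r in rows]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    model = {
        "target": target,
        "feature": feature,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }
    with open(model_path, "w", encoding="utf-8") as fh:
        json.dump(model, fh)
    return model


def apply(rows, model_path):
    """Load the persisted model and add a predicted(<target>) field to each row."""
    with open(model_path, encoding="utf-8") as fh:
        model = json.load(fh)
    out_field = f"predicted({model['target']})"
    return [
        {**r, out_field: model["slope"] * float(r[model["feature"]]) + model["intercept"]}
        for r in rows
    ]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "model.json")
    training = [{"x": 1, "y": 3}, {"x": 2, "y": 5}, {"x": 3, "y": 7}]
    fit(training, target="y", feature="x", model_path=path)
    print(apply([{"x": 10}], path))
```

Separating training from scoring is what lets a scheduled search retrain occasionally while real-time searches and alerts reuse the persisted model cheaply.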
41. © 2 0 1 9 S P L U N K I N C .
Deep Learning Toolkit for Splunk
persisted model
Search
Real Time
Visualize
Alert
| fit <algorithm> y from x* into "model"
| apply "model"
…
OT
Industrial Assets
IT
Consumer and
Mobile Devices
42. © 2 0 1 9 S P L U N K I N C .
Live Demo Splunk
Machine Learning
Toolkit (MLTK)
43. Philipp Drieger
Staff Machine Learning Architect, Splunk
Announcing the Deep
Learning Toolkit for Splunk
with TensorFlow 2.0,
PyTorch, NLP and
Jupyter Lab Notebooks
44. © 2 0 1 9 S P L U N K I N C .
Seamlessly Integrate with
Splunk Enterprise and
Machine Learning Toolkit
Workflows
Freedom of Code within
Jupyter Lab Notebooks for
Advanced Modelling with
TensorFlow and PyTorch
GPU accelerated Deep
Learning for Compute
Intensive Training Workloads
Key Benefits of the MLTK Container
45. © 2 0 1 9 S P L U N K I N C .
46. © 2 0 1 9 S P L U N K I N C .
47. © 2 0 1 9 S P L U N K I N C .
48. © 2 0 1 9 S P L U N K I N C .
49. © 2 0 1 9 S P L U N K I N C .
50. © 2 0 1 9 S P L U N K I N C .
51. © 2 0 1 9 S P L U N K I N C .
52. © 2 0 1 9 S P L U N K I N C .
53. © 2 0 1 9 S P L U N K I N C .
54. © 2 0 1 9 S P L U N K I N C .
55. © 2019 SPLUNK INC.
1. Extend your Splunk platform with the
Deep Learning Toolkit for Splunk
2. Integrate custom advanced deep learning
and NLP models into Splunk using a
predefined Jupyter Notebook workflow for
rapid model development.
3. Leverage GPUs for compute-intensive
training tasks
Deep Learning Toolkit
for Splunk
Key
Takeaways
56. © 2 0 1 9 S P L U N K I N C .
Outlook: new
products
announced at
.conf19
Data Stream Processor (DSP)
57. © 2 0 1 9 S P L U N K I N C .
Splunk Data Stream Processor
Log Files
Online
Shopping Cart
Cell Phones
and Devices
RFID
Messaging
Patient
Generated
Data
Servers
Web Services
Call Detail
Records
Protect sensitive data
Take action on data in
motion
Turn raw data into high-
value information
Distribute data to
Splunk or other
destinations
Filter
Format
Enrich
Mask Sensitive Data
Detect data patterns or conditions
Aggregate
Normalize Transform
Track and monitor pipeline health
Splunk Data Stream Processor
A real-time stream processing solution that collects, processes and delivers data to
Splunk and other destinations in milliseconds
Data Warehouse
Public Cloud
Message Bus
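The pipeline functions listed on this slide (filter, mask sensitive data, enrich) compose naturally as stages over a stream of events. The sketch below shows that composition with plain Python generators. It is not DSP's API or pipeline language: the stage names, the event fields and the card-number pattern are assumptions for illustration.

```python
import re

# Assumed pattern for 16-digit card numbers, optionally separated by space or dash.
CARD_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")


def filter_noise(events):
    """Drop noisy DEBUG events before they reach any destination."""
    for event in events:
        if event.get("level") != "DEBUG":
            yield event


def mask_sensitive(events):
    """Replace anything that looks like a card number in the message."""
    for event in events:
        masked = CARD_PATTERN.sub("****-****-****-****", event["message"])
        yield {**event, "message": masked}


def enrich(events, lookup):
    """Add an owner field from a host-to-owner lookup table."""
    for event in events:
        yield {**event, "owner": lookup.get(event["host"], "unknown")}


def run_pipeline(events, lookup):
    """Chain the stages; events flow through one at a time."""
    return enrich(mask_sensitive(filter_noise(events)), lookup)


if __name__ == "__main__":
    stream = [
        {"host": "web01", "level": "DEBUG", "message": "cache miss"},
        {"host": "web01", "level": "INFO", "message": "paid with 4111 1111 1111 1111"},
    ]
    for event in run_pipeline(stream, {"web01": "shop-team"}):
        print(event)
```

Because each stage is a generator, events are processed one by one while still "in motion", without first collecting the whole stream.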
58. © 2 0 1 9 S P L U N K I N C .
Use Cases
Filter out or route
noisy data to
specific destinations
Data
Routing
Filtering/
Noise
Removal
Data
Formatting
Guarantee delivery of
high-volume, high-
velocity data to multiple
destinations
Format or organize data
using various functions
based on specified
conditions
Aggregate data based on
specific conditions and
identify abnormal patterns
in data
Data
Aggregation
DATA IN MOTION
59. © 2 0 1 9 S P L U N K I N C .
Introducing Unbounded ML in DSP
Streaming Analytics : Derive insights while data is still in motion
● Automatic Detection of
patterns and anomalies in
raw logs
● Advanced pattern matching
● Sequential Outlier detection
● Multi-source correlation
Derive insights on
data in motion
Continuous Intelligence
● Algorithms that learn
continuously
● No downtime machine
learning systems
● Unbounded in cardinality of
models and data volume
Advanced Analytics
● Online classification,
clustering, time series
forecasting, changepoint
detection, etc. baked in
● Self-tuning algorithms, no
manual hyperparameter
tuning needed
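"Algorithms that learn continuously" can be illustrated with a sequential outlier detector that updates its statistics one event at a time and never needs a separate retraining step. The sketch below uses Welford's online mean and variance update with a z-score test. It is a generic textbook technique, not the algorithm shipped in DSP, and the class name, threshold and warm-up length are assumptions.

```python
import math


class OnlineZScoreDetector:
    """Flags values far from the running mean, learning from every value seen."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is an outlier
        self.warmup = warmup        # values to observe before scoring starts
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Score the value against past data, then fold it into the statistics."""
        is_outlier = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's update: constant memory, single pass.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_outlier


if __name__ == "__main__":
    detector = OnlineZScoreDetector()
    for value in [10, 11, 9, 10, 12, 10, 11, 50, 10]:
        print(value, detector.update(value))
```

The detector keeps only three numbers of state per series, which is what makes it practical to run one instance per entity on an unbounded stream.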
60. © 2 0 1 9 S P L U N K I N C .
Anomaly
Detection on
Stream.
General
Questions:
DSP-SplunkNext@splunk.com