In most security data science talks that describe a specific algorithm used to solve a security problem, the audience is left wondering: how did they perform system testing when there is no labeled attack data? What metrics do they monitor? And what do these systems actually look like in production? Academia and industry both focus largely on security detection, but the emphasis is almost always on the algorithmic machinery powering the systems. Prior art on productizing these solutions is sparse: the problem has been studied from a machine-learning angle or from a security angle, but not jointly. Yet the intersection of operationalizing security and machine-learning solutions matters, not only because security data science solutions inherit complexities from both fields but also because each brings unique challenges. For instance, compliance restrictions that dictate data cannot be exported from specific geographic locations (a security constraint) have a downstream effect on model design, deployment, evaluation, and management strategies (a data science constraint). This talk explores this intersection!
3. Choosing the Learning Task
• Binary Classification
• Anomaly Detection
• Ranking
Defining Data Input
• Data loaders (text, binary, SVM-light, transpose loader)
• Data type
Applying Data Transforms
• Cleaning missing data
• Dealing with categorical data
• Dealing with text data
• Data normalization
Choosing the Learner
• Binary Classification
• Regression
• Multi-class
• Unsupervised
• Ranking
• Anomaly Detection
• Collaborative Filtering
• Sequence Prediction
Choosing Output
• Save the features of the model?
• Save the model as text?
• Save the model as binary?
• Save the per-instance results?
Choosing Run Options
• Run locally?
• Run distributed on an HPC cluster?
• Are all paths in the experiment node-accessible?
• Priority?
• Max concurrent processes?
View Results
• Too large? View a sample
• Right size? Load the data
• Histogram per feature
• Sampled instances
Debug and Visualize Errors
• Error in data
• Error in learner
• Error in optimizer
• Error in experimentation setup
Analyze Model Predictions
• Root-cause analysis
• Grading
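This experimentation workflow maps naturally onto common ML tooling. As a minimal, hedged sketch (using scikit-learn, with an entirely hypothetical input file and feature names for a binary detection task), the stages above might look like:

```python
# Minimal sketch of the workflow above using scikit-learn.
# The input file and feature names are hypothetical placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Defining data input
df = pd.read_csv("activity_logs.csv")            # hypothetical input
X, y = df.drop(columns=["label"]), df["label"]   # label: 1 = malicious, 0 = benign

# Applying data transforms: impute missing values, encode categoricals, normalize
transforms = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer()), ("scale", StandardScaler())]),
     ["bytes_uploaded", "login_hour"]),          # hypothetical numeric features
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["country", "client_app"]),
])

# Choosing the learner: binary classification
model = Pipeline([("transforms", transforms),
                  ("learner", LogisticRegression(max_iter=1000))])

# Run options, output, and viewing results (local run, per-instance results)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```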
6. Security Data Science Projects are different
• Traditional Programming Projects: spec/prototype → implement → ship
• Data Science Projects: at each stage, relabel, refeaturize, retrain
• With data-driven features, all components drift:
  • Learner: more accurate / faster / lower memory footprint / …
  • Features: there are always better ones
  • Data: all distributions drift
• Security Projects: at each stage, assess the threat, build detections, respond
• All components drift here too:
  • Threat: new attacks constantly come out
  • Detection: newer log sources
  • Response: better tooling, newer TSGs (troubleshooting guides)
So wait… when do we ship?
7. You ship when your solution is operational
Security Experts • Engineers • Legal • Service Engineers • Product Managers • Machine Learning Experts
8. Operational is more than your “model is working”…
• Vague goal: “Detect unusual user activity to prevent data exfiltration”
• Operational goal: “Detect unusual user activity using application logs, with a false positive rate < 1%, for all Azure customers, in near real-time”
9. Operationalize Security Data Science: Components
• Detect unusual user activity => The Problem
• using Application logs => Data
• with false positive rate < 1% => Model Evaluation
• for all Azure customers => Model Deployment
• in near real-time => Model Scale-out
13. Model Evaluation
Customer Metrics
• E.g.: false positive rate
• Makes your customer (and ergo, your business) happy
• How do you measure this?
Model Usage Metrics
• E.g.: call rate
• How much is the model in use?
• Makes your division happy
• Collected by your pipeline after deployment
Model Validation Metrics
• E.g.: MSE, reconstruction error, …
• How well does the model generalize?
• Makes the data scientist happy
• Comes pre-built with ML frameworks (scikit-learn, CNTK)
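As a small illustration of the validation end of this spectrum, here is a hedged sketch (hypothetical labels and scores) computing the false positive rate and MSE with scikit-learn; usage and customer metrics, by contrast, have to come from your deployment pipeline:

```python
# Hedged sketch: validation-style metrics from held-out labels (hypothetical data).
import numpy as np
from sklearn.metrics import confusion_matrix, mean_squared_error

y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1])           # 1 = confirmed malicious
y_pred = np.array([0, 1, 1, 0, 0, 0, 0, 1])           # model decisions
y_score = np.array([.1, .7, .9, .2, .4, .1, .3, .8])  # model probabilities

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("False positive rate:", fp / (fp + tn))                  # the metric customers feel
print("MSE of scores:", mean_squared_error(y_true, y_score))   # the data-scientist metric
```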
14. Model Evaluation: How to gather an evaluation dataset?
• Good: Use Benchmark datasets
• List of curated datasets - www.secrepo.com
• Con: Remember – attackers have ‘em too!
• Better: Use previous Indicators of Compromise
• Honeypots, commercial IOC feeds
• Steps:
• Gather confirmed IOCs
• “Backprop” them through the generated alerts
• This will help you calculate FP and FN
• Best: Curate your own dataset
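For the “Better” option, a minimal, hedged sketch of the “backprop through alerts” step might look like the following (pandas, with hypothetical column names for the alert and IOC feeds):

```python
# Hedged sketch: score generated alerts against confirmed IOCs (hypothetical schemas).
import pandas as pd

alerts = pd.read_csv("generated_alerts.csv")   # columns assumed: entity, alert_time
iocs = pd.read_csv("confirmed_iocs.csv")       # columns assumed: entity

ioc_entities = set(iocs["entity"])
alerted_entities = set(alerts["entity"])

true_positives = alerted_entities & ioc_entities      # alerts that match a confirmed IOC
false_positives = alerted_entities - ioc_entities     # alerts with no confirmed IOC
false_negatives = ioc_entities - alerted_entities     # IOCs the system never alerted on

print(f"TP={len(true_positives)} FP={len(false_positives)} FN={len(false_negatives)}")
```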
15. Curating your own dataset options
1. Inject Fake Malicious data
[Diagram: synthetic data → storage → model → alerting system]
How: Label data as “eviluser” and check if “eviluser” pops to the top of the reports every day
Pro: Low overhead; you don’t have to depend on a red team to test your detection
Con: The injected data may not be representative of true attacker activity
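As a hedged sketch of this check (the scoring function and report format below are hypothetical), injecting a synthetic “eviluser” and verifying it surfaces in the daily report could look like:

```python
# Hedged sketch: inject a synthetic "eviluser" and verify it tops the daily report.
# score_activity() and the column names are hypothetical placeholders.
import pandas as pd

def score_activity(df: pd.DataFrame) -> pd.DataFrame:
    # Placeholder for the real model; here, more uploaded bytes = more suspicious.
    return df.assign(score=df["bytes_uploaded"].rank(pct=True))

todays_logs = pd.read_csv("todays_activity.csv")        # hypothetical input

synthetic = pd.DataFrame([{"user": "eviluser", "bytes_uploaded": 10**9}])
report = (score_activity(pd.concat([todays_logs, synthetic], ignore_index=True))
          .sort_values("score", ascending=False)
          .head(20))

assert "eviluser" in set(report["user"]), "Injected attacker did not surface in today's report"
```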
16. Curating your own dataset options
2. Employ Commonly Used Attacker Tools
How: Spin up a malicious process in your environment using Metasploit, PowerSploit, or Veil, then look for traces in your logs
Pro: Easy to implement; your development team can run the tools with little training and generate attack data in the logs
Con: The machine learning system will only learn to detect known attacker toolkits and will not generalize over the attack methodology
[Diagram: tainted data → storage → model → alerting system]
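A hedged sketch of the “look for traces in your logs” step (the process-creation log schema and the indicator list are illustrative, not exhaustive):

```python
# Hedged sketch: search process-creation logs for traces of known attacker toolkits.
# Column names and indicator strings are illustrative placeholders.
import pandas as pd

proc_logs = pd.read_csv("process_creation_logs.csv")   # columns assumed: host, user, command_line

indicators = ["metasploit", "meterpreter", "powersploit", "invoke-mimikatz", "veil"]
pattern = "|".join(indicators)

hits = proc_logs[proc_logs["command_line"].str.lower().str.contains(pattern, na=False)]
print(hits[["host", "user", "command_line"]].head(20))
```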
17. Curating your own dataset options
3. Red Team pentests your environment
How: A red team attacks the system, and the logs from the attack are collected as tainted data
Pro: Closest technique to real-world attacks
Con: Red team engagements are point-in-time exercises and are expensive
[Diagram: tainted data → storage → model → alerting system]
18. Growing your dataset: Generative Adversarial Networks
Sources: https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f#.djcfc6eo0 and http://www.evolvingai.org/ppgn
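As a rough, hedged sketch of the idea (PyTorch, toy dimensions, and randomly generated stand-in data rather than real featurized security events), a GAN that grows a dataset of synthetic feature vectors might look like:

```python
# Hedged sketch: a tiny GAN over feature vectors (stand-in data, toy dimensions).
import torch
import torch.nn as nn

FEATURE_DIM, NOISE_DIM = 16, 8

G = nn.Sequential(nn.Linear(NOISE_DIM, 32), nn.ReLU(), nn.Linear(32, FEATURE_DIM))
D = nn.Sequential(nn.Linear(FEATURE_DIM, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

real_data = torch.randn(1024, FEATURE_DIM)  # stand-in for real featurized events

for step in range(1000):
    real = real_data[torch.randint(0, 1024, (64,))]
    fake = G(torch.randn(64, NOISE_DIM))

    # Discriminator step: real -> 1, fake -> 0
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to fool the discriminator
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

synthetic_events = G(torch.randn(100, NOISE_DIM)).detach()  # grown dataset
```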
20. Azure has data centers all around the world!
21. Localization affects Model Building
• Privacy laws vary across the board
  • An IP address is treated as EII in some regions but not in others
• “Anyone logging into the corporate network at midnight during the weekend is anomalous”
  • Weekend in the Middle East != weekend in the Americas
• Seasonality varies
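A tiny, hedged sketch of what this means for feature engineering (the region-to-weekend mapping below is illustrative, not a complete or current table):

```python
# Hedged sketch: the "weekend login" feature must be region-aware.
# The mapping is illustrative; real deployments need a full, maintained table.
from datetime import datetime

WEEKEND_BY_REGION = {
    "americas": {5, 6},      # Saturday, Sunday (Monday == 0)
    "middle_east": {4, 5},   # Friday, Saturday
}

def is_weekend_login(ts: datetime, region: str) -> bool:
    return ts.weekday() in WEEKEND_BY_REGION.get(region, {5, 6})

# The same timestamp is anomalous in one region and routine in another.
ts = datetime(2017, 6, 23, 0, 30)            # a Friday, just after midnight
print(is_weekend_login(ts, "americas"))      # False
print(is_weekend_login(ts, "middle_east"))   # True
```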
22. Option 1: Shotgun Deployment
• How: Deploy the same model code across different regions
• Pros:
  • Easy deployment
  • Uniform metrics
  • Single TSG to debug all service incidents
• Cons:
  • Lose macro trends in favor of micro trends
  • Model-region incompatibility
[Diagram: one model deployed identically to Region 1, Region 2, and Region 3]
23. Option 2: Tiered Modeling
• How:
  • Federated models
    • Each region is modeled separately
    • Results are scrubbed according to compliance laws and privacy agreements
    • Scrubbed results are used as input to “Model Prime”
  • Model Prime
    • Results are collated to search for global trends
• Pros:
  • Bespoke modeling for every region
  • Balance between micro and macro modeling
• Cons:
  • Complicated deployment
  • Depending on the agreements, Model Prime may not be possible
[Diagram: Region 1/2/3 → Model 1/2/3 → scrubbed results → Model Prime]
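A hedged sketch of the tiered pattern (the per-region scoring, scrubbing rules, and Model Prime aggregation below are placeholders for whatever your compliance agreements actually allow):

```python
# Hedged sketch of tiered modeling: per-region models, scrubbed outputs, a global "Model Prime".
# score_region() and the scrubbing rule are hypothetical placeholders.
import hashlib
import pandas as pd

def score_region(region_df: pd.DataFrame) -> pd.DataFrame:
    # Placeholder for the region-local model; emits one anomaly score per user.
    return (region_df.groupby("user", as_index=False)["bytes_uploaded"].sum()
                     .rename(columns={"bytes_uploaded": "anomaly_score"}))

def scrub(scores: pd.DataFrame, region: str) -> pd.DataFrame:
    # Scrub according to the region's compliance rules: here, pseudonymize the user id.
    scores = scores.copy()
    scores["user"] = scores["user"].map(
        lambda u: hashlib.sha256(f"{region}:{u}".encode()).hexdigest()[:12])
    scores["region"] = region
    return scores

regions = {r: pd.read_csv(f"{r}_activity.csv") for r in ["region1", "region2", "region3"]}

# Federated tier: model each region locally, then scrub before anything leaves the region.
scrubbed = [scrub(score_region(df), region) for region, df in regions.items()]

# Model Prime: collate scrubbed results and look for global trends.
model_prime_input = pd.concat(scrubbed, ignore_index=True)
print(model_prime_input.nlargest(10, "anomaly_score"))
```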
25. Detecting Malicious Activities
• Detect risky or malicious activity => The Problem
• in SharePoint Online activity logs => Data
• with precision > 90% => Model Evaluation
• for all SPO users => Model Deployment
• in near real-time => Model Scale-out
26. Exploratory Analysis
• Typical data science work:
• Sample data
• Script for preprocessing data
• Summary statistics
• Script for evaluating approaches
• All done locally on the dev machine using R/Python
• Facilitates quick turnaround
• Avoids having to debug at scale
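As a hedged illustration of that local loop (the sample file and column names are hypothetical):

```python
# Hedged sketch of the local exploratory loop: sample, preprocess, summarize.
# The sample file and column names are hypothetical.
import pandas as pd

sample = pd.read_csv("spo_activity_sample.csv", parse_dates=["timestamp"])

# Light preprocessing
sample["hour"] = sample["timestamp"].dt.hour
sample = sample.dropna(subset=["user", "operation"])

# Summary statistics that guide featurization
print(sample.describe(include="all"))
print(sample["operation"].value_counts().head(10))
print(sample.groupby("user")["operation"].count().describe())
```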
27. Model Evaluation
• Labels from known incidents and investigations
• Inject labels by mimicking malicious activity
• SPO team helps us understand the malicious activity
• Red team helps us simulate the malicious activity
• > 90% precision
28. Model: Bayesian Network
• Probabilistic graphical model
• Related to GMM, CRF, MRF
• Represents variables and conditional independence assertions in a directed acyclic graph
• Directed edges encode conditional dependencies
• Conditional probability distributions for each variable
[Diagram: classic example network — Burglary and Earthquake → Alarm → JohnCalls and MaryCalls]
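A hedged sketch of the classic network in the diagram using the pgmpy library (toy probabilities; the production SPO model is, of course, a different network):

```python
# Hedged sketch: the textbook Burglary/Earthquake/Alarm network with pgmpy (toy CPDs).
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("Burglary", "Alarm"), ("Earthquake", "Alarm"),
                         ("Alarm", "JohnCalls"), ("Alarm", "MaryCalls")])

cpd_b = TabularCPD("Burglary", 2, [[0.999], [0.001]])
cpd_e = TabularCPD("Earthquake", 2, [[0.998], [0.002]])
cpd_a = TabularCPD("Alarm", 2,
                   [[0.999, 0.71, 0.06, 0.05],   # P(Alarm=0 | Burglary, Earthquake)
                    [0.001, 0.29, 0.94, 0.95]],  # P(Alarm=1 | Burglary, Earthquake)
                   evidence=["Burglary", "Earthquake"], evidence_card=[2, 2])
cpd_j = TabularCPD("JohnCalls", 2, [[0.95, 0.10], [0.05, 0.90]],
                   evidence=["Alarm"], evidence_card=[2])
cpd_m = TabularCPD("MaryCalls", 2, [[0.99, 0.30], [0.01, 0.70]],
                   evidence=["Alarm"], evidence_card=[2])

model.add_cpds(cpd_b, cpd_e, cpd_a, cpd_j, cpd_m)
assert model.check_model()

# P(Burglary | both John and Mary called)
print(VariableElimination(model).query(["Burglary"],
                                       evidence={"JohnCalls": 1, "MaryCalls": 1}))
```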
29. Initial Prototype – v0.1
• One activity model for all users
• Run the model in a cloud environment with an Azure Worker Role
• Storage accounts for input data and output scores
• Pros:
  • Easy to manage
  • Small memory footprint
• Cons:
  • Does not scale
  • Low throughput
[Diagram: Users 1–3 → Data storage account → Azure Worker Role running the single Activity Model → Scores storage account]
30. Improved Approach
• One model for each user
• Personalized activity suspiciousness
• Cluster low-activity users for better model results
• Replace storage accounts with Azure Event Hubs
  • Low-latency, cloud-scale “queues”
[Diagram: Users 1–3 → Event Hub → Azure Worker Role hosting per-user Models 1…n → Event Hub → Scores]
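A hedged sketch of the per-user routing on the worker (using the azure-eventhub Python SDK; the connection strings, hub names, and model interface are placeholders, not our production code):

```python
# Hedged sketch: consume activity events and route each one to its per-user model.
# Connection strings, hub names, and get_model()/score() are hypothetical placeholders.
import json
from azure.eventhub import EventHubConsumerClient, EventHubProducerClient, EventData

consumer = EventHubConsumerClient.from_connection_string(
    "<input-hub-connection-string>", consumer_group="$Default", eventhub_name="activity")
producer = EventHubProducerClient.from_connection_string(
    "<output-hub-connection-string>", eventhub_name="scores")

def get_model(user_id):
    # Placeholder: fetch the per-user model (caching strategy sketched under Model Scale-Out below).
    ...

def on_event(partition_context, event):
    activity = json.loads(event.body_as_str())
    model = get_model(activity["user_id"])
    score = model.score(activity)                      # hypothetical model interface
    batch = producer.create_batch()
    batch.add(EventData(json.dumps({"user_id": activity["user_id"], "score": score})))
    producer.send_batch(batch)
    partition_context.update_checkpoint(event)

with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")
```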
31. Model Scale-Out: Memory
[Diagram: same per-user pipeline as above, with the models persisted in a Model Storage account]
• Millions of per-user models
• More than can fit in worker role memory
• Store models in a storage account
• Load as needed
32. Model Scale-Out: Latency
[Diagram: same pipeline, with a Redis cache between the worker role and Model Storage]
• The model storage account adds too much latency
• A Redis cache minimizes model-loading latency
• LRU policy as we process user activity events
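A hedged sketch of that lookup path (redis-py plus a placeholder loader for the storage account; Redis itself can be configured with maxmemory-policy allkeys-lru so the cache evicts least-recently-used models):

```python
# Hedged sketch: Redis-cached per-user models with a storage-account fallback.
# load_model_from_storage() is a hypothetical placeholder for the blob download + deserialize step.
import pickle
import redis

cache = redis.Redis(host="localhost", port=6379)
MODEL_TTL_SECONDS = 3600

def load_model_from_storage(user_id: str):
    # Placeholder: download the serialized per-user model blob and unpickle it.
    ...

def get_model(user_id: str):
    key = f"model:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return pickle.loads(cached)            # cache hit: cheap deserialization
    model = load_model_from_storage(user_id)   # cache miss: slow path to Model Storage
    cache.set(key, pickle.dumps(model), ex=MODEL_TTL_SECONDS)
    return model
```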
33. Data Compliance
• Models cannot use certain PII
• Balkanized cloud environments
• Tiered model development
• Resolve user information for UX
• UserID -> User Name
34. Data Compliance
[Diagram: same pipeline, with a second Redis cache in front of the User Account DB for UserID → user name resolution]
39. Operationalize Security Data Science: Components
=> Model Evaluation
=> Model Deployment
=> Model Scale-out
40. The Rand Test
A test to see whether your security data science solution is operational.
Answer Yes/No to the following:
1) Do you have an established pipeline to collect relevant security data?
2) Do you have established SLAs/data contracts with partner teams?
3) Can you seamlessly update the model with new features and re-train?
4) Did you evaluate the model with real attack data?
5) Does your model respect the different privacy laws across all regions?
6) Do you account for model localization?
7) Is your model scalable, end to end?
8) Do you hold live-site meetings about your solution?
9) Can security responders leverage the model for insights during an investigation?
10) Do you have a framework to collect feedback on the results from security analysts?
By @ram_ssk, Andrew Wicker
Scoring: each Yes = 1 point.
10: All systems operational!
5: One small step…
0: Houston, we have a problem.
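A trivially small, hedged sketch of the scoring (the question labels and thresholds simply restate the checklist above; the answers shown are made up):

```python
# Hedged sketch: score the Rand Test (one point per "yes"; answers are illustrative).
answers = {
    "data pipeline": True, "SLAs/data contracts": True, "retrainable": True,
    "evaluated on real attack data": False, "respects privacy laws": True,
    "localized": False, "scalable end to end": True, "live-site meetings": True,
    "useful to responders": True, "analyst feedback loop": False,
}

score = sum(answers.values())
if score == 10:
    verdict = "All systems operational!"
elif score >= 5:
    verdict = "One small step..."
else:
    verdict = "Houston, we have a problem"
print(f"{score}/10: {verdict}")
```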