Self-learning Systems for Cyber Security¹
Kim Hammar & Rolf Stadler
kimham@kth.se & stadler@kth.se
Division of Network and Systems Engineering
KTH Royal Institute of Technology
October 15, 2020
¹ Kim Hammar and Rolf Stadler. “Finding Effective Security Strategies through Reinforcement Learning and Self-Play”. In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, Nov. 2020.
Game Learning Programs
Challenges: Evolving and Automated Attacks
Challenges:
Evolving & automated attacks
Complex infrastructures
[Figure: example network infrastructure (172.18.1.0/24)]
Goal: Automation and Learning
Challenges
Evolving & automated attacks
Complex infrastructures
Our Goal:
Automate security tasks
Adapt to changing attack methods
[Figure: security agent (controller) observing and acting on the infrastructure (172.18.1.0/24)]
Approach: Game Model & Reinforcement Learning
Challenges:
Evolving & automated attacks
Complex infrastructures
Our Goal:
Automate security tasks
Adapt to changing attack methods
Our Approach:
Model network as Markov Game
MG = ⟨S, A_1, …, A_N, T, R_1, …, R_N⟩ (sketched in Python below)
Compute policies π for MG
Incorporate π in self-learning systems
[Figure: security agent with policy π(a|s; θ) acting on the infrastructure (172.18.1.0/24): action a_t, next state s_{t+1}, reward r_{t+1}]
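As a concrete illustration of the Markov game tuple above, here is a minimal Python sketch; the type and field names are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed names) of the Markov game tuple
# MG = <S, A_1, ..., A_N, T, R_1, ..., R_N>.
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]        # a flattened assignment of attack/defense values
JointAction = Tuple[int, ...]  # one action index per player

@dataclass
class MarkovGame:
    num_players: int
    action_space_sizes: Sequence[int]                         # |A_1|, ..., |A_N|
    transition: Callable[[State, JointAction], State]         # T (may be stochastic)
    rewards: Sequence[Callable[[State, JointAction], float]]  # R_1, ..., R_N
    initial_state: State
```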
Related Work
Game-Learning Programs:
TD-Gammon², AlphaGo Zero³, OpenAI Five, etc.
=⇒ Impressive empirical results of RL and self-play
Network Security:
Automated threat modeling⁴, automated intrusion detection, etc.
=⇒ Need for automation and better security tooling
Game Theory:
Network Security: A Decision and Game-Theoretic Approach⁵.
=⇒ Many security operations involve strategic decision making
² Gerald Tesauro. “TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play”. In: Neural Comput. 6.2 (Mar. 1994), pp. 215–219. ISSN: 0899-7667. DOI: 10.1162/neco.1994.6.2.215. URL: https://doi.org/10.1162/neco.1994.6.2.215.
³ David Silver et al. “Mastering the game of Go without human knowledge”. In: Nature 550 (Oct. 2017), pp. 354–. URL: http://dx.doi.org/10.1038/nature24270.
⁴ Pontus Johnson, Robert Lagerström, and Mathias Ekstedt. “A Meta Language for Threat Modeling and Attack Simulations”. In: Proceedings of the 13th International Conference on Availability, Reliability and Security (ARES 2018). Hamburg, Germany: Association for Computing Machinery, 2018. ISBN: 9781450364485. DOI: 10.1145/3230833.3232799. URL: https://doi.org/10.1145/3230833.3232799.
⁵ Tansu Alpcan and Tamer Basar. Network Security: A Decision and Game-Theoretic Approach. 1st ed. Cambridge University Press, 2010. ISBN: 0521119324.
Outline
Use Case
Markov Game Model for Intrusion Prevention
Reinforcement Learning Problem
Method
Results
Conclusions
Use Case: Intrusion Prevention
A Defender owns a network infrastructure
Consists of connected components
Components run network services
Defends by monitoring and patching
An Attacker seeks to intrude on the infrastructure
Has a partial view of the infrastructure
Wants to compromise a specific component
Attacks by reconnaissance and exploitation
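To make the use case concrete, the following minimal Python sketch (assumed names, not the authors' code) represents the infrastructure as a graph with per-node attack and defense attribute vectors, matching the model on the next slide.

```python
# Illustrative sketch: infrastructure as a graph G = <N, E>; each node carries
# m attack attributes (attacker's view) and m + 1 defense attributes
# (defender's view, e.g. one extra value for detection -- an assumption here).
from typing import Dict, List, Tuple

Node = str
nodes: List[Node] = ["N_start", "N_1", "N_2", "N_data"]
edges: List[Tuple[Node, Node]] = [("N_start", "N_1"), ("N_1", "N_2"), ("N_2", "N_data")]

m = 4  # number of attack/defense types per node
attack_values:  Dict[Node, List[int]] = {n: [0] * m for n in nodes}
defense_values: Dict[Node, List[int]] = {n: [0] * (m + 1) for n in nodes}
```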
Markov Game Model for Intrusion Prevention
(1) Network Infrastructure
(2) Graph G = ⟨N, E⟩ with designated nodes N_start (the attacker's starting position) and N_data (the attacker's target)
(3) State space |S| = (w + 1)^{|N| · m · (m+1)}, e.g. per-node attack/defense states:
S^A_0 = [0, 0, 0, 0], S^D_0 = [0, 0, 0, 0, 0]
S^A_1 = [0, 0, 0, 0], S^D_1 = [1, 9, 9, 8, 1]
S^A_2 = [0, 0, 0, 0], S^D_2 = [9, 9, 9, 8, 1]
S^A_3 = [0, 0, 0, 0], S^D_3 = [9, 1, 5, 8, 1]
(4) Action space |A| = |N| · (m + 1)
[Figure: graph with per-node attack/defense annotations, all shown as ?/0]
(5) Game dynamics T, R_1, R_2, ρ_0
[Figure: graph with partially revealed annotations, e.g. 4/0, 0/1, 8/0]
∴
- Markov game
- Zero-sum
- 2 players
- Partially observed
- Stochastic elements
- Round-based
MG = ⟨S, A_1, A_2, T, R_1, R_2, γ, ρ_0⟩ (an environment sketch follows below)
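A minimal sketch of how this Markov game could be exposed as a step-based environment with a joint attacker/defender action per round; the class and method names are illustrative assumptions and do not reflect the actual gym-idsgame API.

```python
# Sketch (assumed interface): intrusion-prevention Markov game as a round-based
# environment. Reward and termination logic are omitted for brevity.
import numpy as np

class IntrusionGameSketch:
    def __init__(self, num_nodes: int, m: int, w: int):
        self.num_nodes, self.m, self.w = num_nodes, m, w
        self.reset()

    def reset(self):
        # attack values: |N| x m, defense values: |N| x (m + 1); all start at 0
        self.attack_state = np.zeros((self.num_nodes, self.m), dtype=int)
        self.defense_state = np.zeros((self.num_nodes, self.m + 1), dtype=int)
        return self._observations()

    def _observations(self):
        # each player only sees its own view (partial observability)
        return self.attack_state.copy(), self.defense_state.copy()

    def step(self, attacker_action: int, defender_action: int):
        # decode an action index into (node, attribute); |A| = |N| * (m + 1)
        a_node, a_attr = divmod(attacker_action, self.m + 1)
        d_node, d_attr = divmod(defender_action, self.m + 1)
        # defender strengthens one defense attribute of one node
        self.defense_state[d_node, d_attr] = min(self.defense_state[d_node, d_attr] + 1, self.w)
        # attacker advances one attack attribute; the last index could be
        # reserved, e.g. for reconnaissance (an assumption in this sketch)
        if a_attr < self.m:
            self.attack_state[a_node, a_attr] = min(self.attack_state[a_node, a_attr] + 1, self.w)
        r_attacker, r_defender, done = 0.0, 0.0, False
        return self._observations(), (r_attacker, r_defender), done
```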
Automatic Learning of Security Strategies
[Figure: example extensive-form game tree with quantities q_1, q_2, q_3 ∈ {3, 4, 6} and payoff pairs, e.g. (18,18), (15,20), (9,18)]
Finding strategies for the Markov game model:
Evolutionary methods
Computational game theory
Self-Play Reinforcement learning
Attacker vs Defender
Strategies evolve without human intervention
Motivation for Reinforcement Learning:
Strong empirical results in related work
Can adapt to new attack methods and threats
Can be used for complex domains that are hard to model exactly
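A minimal sketch, under assumed helper names, of the self-play loop in which attacker and defender strategies co-evolve by training in alternation:

```python
# Illustrative self-play loop (not the authors' code): attacker and defender are
# trained in alternation against the current version of the opponent, so both
# strategies evolve without human intervention.
# `train_one_iteration(env, learner, opponent)` is an assumed callback that runs
# episodes in the game and updates the learner's policy.
def self_play(env, attacker, defender, train_one_iteration, iterations: int = 10_000):
    for i in range(iterations):
        if i % 2 == 0:
            train_one_iteration(env, learner=attacker, opponent=defender)  # update attacker
        else:
            train_one_iteration(env, learner=defender, opponent=attacker)  # update defender
    return attacker, defender
```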
The Reinforcement Learning Problem
Goal:
Approximate π* = arg max_π E[ Σ_{t=0}^{T} γ^t r_{t+1} ]
Learning Algorithm:
Represent π by π_θ
Define the objective J(θ) = E_{o∼ρ_{π_θ}, a∼π_θ}[R]
Maximize J(θ) by stochastic gradient ascent with gradient
∇_θ J(θ) = E_{o∼ρ_{π_θ}, a∼π_θ}[∇_θ log π_θ(a|o) A_{π_θ}(o, a)] (see the sketch below)
Domain-Specific Challenges:
Partial observability: captured in the model
Large state space |S| = (w + 1)^{|N| · m · (m+1)}
Large action space |A| = |N| · (m + 1)
Non-stationary environment due to the presence of an adversary
[Figure: agent-environment loop: action a_t, next state s_{t+1}, reward r_{t+1}]
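The policy-gradient update above can be illustrated with a short REINFORCE-style sketch in PyTorch; it uses the discounted return in place of the advantage estimate A_{π_θ}(o, a) and is not the authors' implementation.

```python
# REINFORCE-style policy-gradient update (illustrative sketch).
import torch
from torch.distributions import Categorical

def reinforce_update(policy_net, optimizer, episode, gamma: float = 0.99):
    """episode: list of (observation tensor, action index, reward) tuples."""
    returns, G = [], 0.0
    for _, _, r in reversed(episode):       # discounted return G_t
        G = r + gamma * G
        returns.insert(0, G)
    returns = torch.tensor(returns)
    loss = 0.0
    for (obs, action, _), G in zip(episode, returns):
        logits = policy_net(obs)             # π_θ(·|o) as logits
        log_prob = Categorical(logits=logits).log_prob(torch.tensor(action))
        loss = loss - log_prob * G           # ascends E[∇_θ log π_θ(a|o) G_t]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```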
Our Reinforcement Learning Method
Policy Gradient & Function Approximation
To deal with the large state space S
π_θ parameterized by the weights θ ∈ R^d of a neural network
PPO & REINFORCE (stochastic π)
Auto-Regressive Policy Representation
To deal with the large action space A
To minimize interference
π(a, n|o) = π(a|n, o) · π(n|o) (see the sketch below)
Opponent Pool
To avoid overfitting
Want the agent to learn a general strategy
[Figure: attacker π(a|s; θ_2) and defender π(a|s; θ_1) play the Markov game; the joint action ⟨a^1_t, a^2_t⟩ yields next state s_{t+1} and reward r_{t+1}; opponents are sampled from an opponent pool]
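A minimal PyTorch sketch of the auto-regressive factorization π(a, n|o) = π(a|n, o) · π(n|o): the policy first samples a node n, then samples an attack/defense type a conditioned on that node; layer names and sizes are assumptions for illustration.

```python
# Auto-regressive policy sketch (assumed architecture).
import torch
import torch.nn as nn
from torch.distributions import Categorical

class AutoRegressivePolicy(nn.Module):
    def __init__(self, obs_dim: int, num_nodes: int, num_attributes: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.node_head = nn.Linear(hidden, num_nodes)                    # logits for π(n|o)
        self.attr_head = nn.Linear(hidden + num_nodes, num_attributes)   # logits for π(a|n,o)
        self.num_nodes = num_nodes

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)
        node_dist = Categorical(logits=self.node_head(h))
        n = node_dist.sample()                                           # sample node n
        n_onehot = torch.nn.functional.one_hot(n, self.num_nodes).float()
        attr_dist = Categorical(logits=self.attr_head(torch.cat([h, n_onehot], dim=-1)))
        a = attr_dist.sample()                                           # sample attribute a | n
        log_prob = node_dist.log_prob(n) + attr_dist.log_prob(a)         # log π(a, n|o)
        return n, a, log_prob
```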
Experimentation: Learning from Zero Knowledge
[Figure: nine training curves of attacker win probability vs. number of training iterations (0-10000) for REINFORCE, PPO-AR, and PPO: AttackerAgent vs DefendMinimal (scenarios 1-3), DefenderAgent vs AttackMaximal (scenarios 1-3), and Attacker vs Defender self-play training (scenarios 1-3)]
Conclusion & Future Work
Conclusions:
We have proposed a method to automatically find security strategies
=⇒ Model as Markov game & evolve strategies using self-play
reinforcement learning
Addressed domain-specific challenges with Auto-regressive policy,
opponent pool, and function approximation.
Challenges of applied reinforcement learning:
Stable convergence remains a challenge
Sample-efficiency is a problem
Generalization is a challenge
Current & Future Work:
Study techniques for mitigation of identified RL challenges
Learn security strategies through interaction with a cyber range
Thank you
All code for reproducing the results is open source:
https://github.com/Limmen/gym-idsgame