“Policy-Based Reinforcement Learning for Time Series Anomaly Detection”
by Mengran Yu, Shiliang Sun
Presented by Kishor Datta Gupta
Problem
• Anomaly detection is the identification of rare items, events, or patterns that differ significantly from the majority of the data.
Problem..
• A time series is a sequence of numerical data points in successive order.
Problem..
• What is normal?
• How to measure deviation?
• What is higher deviation?
• It can be considered as a Markov Decision Process
problem because the decision of normal and abnormal
pattern at current time step will change the
environment which will affect the next decision.
Applications: intrusion detection, credit card fraud detection, and medical diagnosis
How we detect anomalies
Statistical Methods
Deviations from association rules and frequent item
sets
One-class Support Vector Machine
Clustering-based techniques (k-means)
Density-based techniques (k-nearest neighbor, local
outlier factor)
Autoencoders and replicators (Neural Networks)
Policy-based time series anomaly detector (PTAD)
Based on the asynchronous actor-critic
algorithm.
The optimal stochastic policy acquired from
PTAD can adjust and distinguish between
normal and abnormal behaviors of the same
or different source and target datasets.
The behavior policy is ε-greedy.
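An ε-greedy behavior policy can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `epsilon_greedy`, the `probs` list of per-action scores, and the injected `rng` are all assumptions made for the example.

```python
import random


def epsilon_greedy(probs, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-scoring one.

    probs: per-action scores, index 0 = normal, index 1 = anomaly (assumed layout).
    """
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(probs))
    # Exploit: choose the action with the largest score.
    return max(range(len(probs)), key=lambda a: probs[a])


rng = random.Random(0)
greedy_action = epsilon_greedy([0.2, 0.8], epsilon=0.0, rng=rng)
explored_action = epsilon_greedy([0.2, 0.8], epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the call is purely greedy; with `epsilon=1.0` it always explores.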
PTAD
formulation
State:
• The state includes two parts:
• Sequence of previous actions: $S_a = (a_{t-m+1}, a_{t-m+2}, \ldots, a_t)$
• Current time series: $S_t = (s_{t-m+1}, s_{t-m+2}, \ldots, s_t)$
• The state space S is infinite because real time series vary in many ways.
Action:
• $A = \{0, 1\}$
• 0: normal behavior
• 1: anomalous behavior
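The two-part state above can be assembled from a window of length m. The sketch below is only illustrative: `build_state`, its arguments, and the toy data are hypothetical names and values, and the window is assumed to end at the current index t.

```python
def build_state(values, actions, t, m):
    """Return (S_a, S_t): the last m actions and the last m observations up to time t."""
    if t + 1 < m:
        raise ValueError("need at least m points up to time t")
    window = slice(t - m + 1, t + 1)  # indices t-m+1 .. t inclusive
    return tuple(actions[window]), tuple(values[window])


# Toy data (made up for illustration): a spike at index 3 flagged as anomalous.
values = [0.1, 0.2, 0.1, 5.0, 0.2]
actions = [0, 0, 0, 1, 0]  # 0 = normal, 1 = anomaly
s_a, s_t = build_state(values, actions, t=4, m=3)
```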
PTAD
formulation.
Reward:
• R(s,a) = A if action is TP
• R(s,a) = B if action is FP
• R(s,a) = C if action is FN
• R(s,a) = D if action is TN.
• It utilizes a confusion matrix where positive means
anomaly and negative means normal behavior. The
values of A-D can be altered according to demand.
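The reward table above maps directly onto a small function. This is a sketch under assumptions: the name `reward` and the argument order are hypothetical, and the signed values in the usage lines are illustrative only, since the slides do not state a sign convention for A to D.

```python
def reward(action, label, A, B, C, D):
    """Confusion-matrix reward; 1 = anomaly (positive), 0 = normal (negative)."""
    if action == 1 and label == 1:
        return A  # true positive: anomaly correctly flagged
    if action == 1 and label == 0:
        return B  # false positive: normal point flagged as anomaly
    if action == 0 and label == 1:
        return C  # false negative: anomaly missed
    return D      # true negative: normal point left alone


# Illustrative values only (an assumption, not taken from the slides).
params = dict(A=5, B=-1, C=-5, D=1)
r_tp = reward(1, 1, **params)
r_fn = reward(0, 1, **params)
```

Keeping A to D as parameters reflects the point that they can be altered according to demand.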
PTAD
formulation..
Policy:
• Deterministic policy:
• A value-based detector that outputs the action to take in the current state.
• Stochastic policy:
• Provides the probability of each action in the current state. The criterion for choosing an action can be changed.
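The difference between the two policy types can be sketched as below. The function names and the thresholding rule are assumptions chosen to illustrate how the decision criterion of a stochastic policy can be changed; they are not taken from the paper.

```python
import random


def deterministic_action(scores):
    """Always return the highest-scoring action."""
    return max(range(len(scores)), key=lambda a: scores[a])


def stochastic_action(probs, rng):
    """Sample an action according to the policy's action probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def threshold_action(probs, threshold):
    """Flag an anomaly when P(anomaly) reaches a user-chosen threshold."""
    return 1 if probs[1] >= threshold else 0


probs = [0.7, 0.3]  # assumed layout: [P(normal), P(anomaly)]
a_det = deterministic_action(probs)
a_low = threshold_action(probs, threshold=0.25)  # lower threshold, more alarms
a_high = threshold_action(probs, threshold=0.5)
a_sampled = stochastic_action(probs, random.Random(0))
```

Lowering the threshold raises more alarms and tends to favor recall; raising it tends to favor precision.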
PTAD
formulation…
• Environment:
• Time series repository containing a large population of labeled time series data. The environment can generate specific states for training the agent and check action quality.
The agent can simulate how the anomaly detector will operate and performs the optimization.
• Input: current time stamps and previous decisions
• Output: new decision for the next time stamp.
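An environment of this kind could look roughly like the sketch below. Everything here is hypothetical: the class name `TimeSeriesEnv`, the `reset`/`step` interface, the all-normal initial action history, and the simple agreement reward passed in as `reward_fn`.

```python
class TimeSeriesEnv:
    """Hypothetical environment wrapping one labeled time series."""

    def __init__(self, values, labels, m, reward_fn):
        self.values, self.labels = list(values), list(labels)
        self.m = m                  # window length
        self.reward_fn = reward_fn  # maps (action, label) to a reward

    def reset(self):
        self.t = self.m - 1
        self.actions = [0] * self.m  # assumption: history starts as all "normal"
        return self._state()

    def _state(self):
        lo = self.t - self.m + 1
        return tuple(self.actions[-self.m:]), tuple(self.values[lo:self.t + 1])

    def step(self, action):
        # Check action quality against the label, then advance one time stamp.
        reward = self.reward_fn(action, self.labels[self.t])
        self.actions.append(action)
        self.t += 1
        done = self.t >= len(self.values)
        return (None if done else self._state()), reward, done


env = TimeSeriesEnv([0.1, 0.2, 5.0, 0.1], [0, 0, 1, 0], m=2,
                    reward_fn=lambda a, y: 1 if a == y else -1)
state = env.reset()
next_state, reward, done = env.step(0)
```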
Asynchronous advantage actor-critic algorithm (A3C)
• Uses multiple agents rather than a single agent, each with its own network parameters and its own copy of the environment.
• Each agent is controlled by a global network and contributes to the overall learning.
• Actor-critic combines the value-iteration and policy-gradient methods: it predicts both the value function and the optimal policy function.
• The learning agent uses the value function (critic) to update the policy function (actor).
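One common way the critic informs the actor update is through advantages: discounted returns minus the critic's value estimates. The sketch below shows that arithmetic only; the function names, the discount `gamma`, and the toy numbers are assumptions, and the slides do not specify the exact estimator PTAD uses.

```python
def n_step_returns(rewards, bootstrap_value, gamma):
    """Discounted returns, computed backwards from a bootstrap value estimate."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma):
    """Advantage = return minus the critic's value estimate for that step."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - v for ret, v in zip(returns, values)]


# Toy numbers (made up): three steps, critic predicts 1.0 everywhere.
rets = n_step_returns([1, 0, 2], bootstrap_value=0.0, gamma=0.5)
advs = advantages([1, 0, 2], [1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
```

A positive advantage means the action did better than the critic expected, so the actor would make it more likely.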
PTAD construction:
• The A3C algorithm is used to construct PTAD, which can decrease correlations between successive examples.
• N independent environments contain the whole labeled time series data with inconsistent ranking. Each environment provides time stamps of distinct time series as states and changes itself after an action has been taken.
• PTAD has a global network and n local networks, one per environment (N = n). All local networks have an actor-critic framework.
• Every agent has a different initial environment, which improves anomaly detection performance because agents learn from different situations. This avoids overfitting to abnormal patterns. The global network accumulates the gradients from the workers and optimizes the policy.
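The interaction between local workers and the global network can be sketched without threads as plain gradient accumulation. This is a simplified, sequential illustration: `accumulate`, `apply_to_global`, the flat parameter list, and the plain SGD step are all assumptions, and real A3C workers run asynchronously.

```python
def accumulate(step_grads):
    """Sum a worker's per-step gradients into one gradient per parameter."""
    return [sum(gs) for gs in zip(*step_grads)]


def apply_to_global(global_params, worker_grad, lr):
    """One SGD step on the shared parameters using a worker's accumulated gradient."""
    return [p - lr * g for p, g in zip(global_params, worker_grad)]


# Toy numbers (made up): one worker, two parameters, two local steps.
worker_grad = accumulate([[1.0, 2.0], [3.0, 4.0]])
global_params = apply_to_global([0.0, 0.0], worker_grad, lr=0.5)
```

After the global update, a worker would typically copy the new global parameters back into its local network before continuing.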
PTAD Components:
Experiment: PTAD is trained on a multi-core CPU with 8 threads and no GPU. The local networks deliver their gradients to the global network every 5 steps, and the learning rates of the actor network and critic network are 0.001 and 0.0001, respectively. The total number of training episodes is 20000. The parameters of the reward function are set to A = C = 5, B = D = 1.
Experimental results
Advantage
PTAD achieves the best performance not only on the same but also on different source and target datasets.
It has a stochastic policy, which slightly improves the detection performance and can explore the tradeoff between precision and recall to meet practical requirements.
My thoughts
Using RNN confusion matrix to
calculate RL reward function is
interesting.
They didn’t compare their results with autoencoder-based techniques.
Questions?

Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Policy-Based Reinforcement Learning for Time Series Anomaly Detection

  • 1. “Policy-Based Reinforcement Learning for Time Series Anomaly Detection” by Mengran Yu, Shiliang Sun. Presented by Kishor Datta Gupta.
  • 2. Problem: Anomaly detection is the identification of rare items, events, or patterns that differ significantly from the majority of the data.
  • 3. Problem: A time series is a sequence of numerical data points in successive order.
  • 4. Problem: What is normal? How should deviation be measured? What counts as a high deviation? The task can be treated as a Markov Decision Process because the decision (normal or abnormal) at the current time step changes the environment, which in turn affects the next decision. Applications: intrusion detection, credit card fraud, and medical diagnosis.
  • 5. How we detect anomalies: statistical methods; deviations from association rules and frequent item sets; one-class Support Vector Machine; clustering-based techniques (k-means); density-based techniques (k-nearest neighbor, local outlier factor); autoencoders and replicators (neural networks).
  • 6. Policy-based time series anomaly detector (PTAD): Based on the asynchronous actor-critic algorithm. The optimal stochastic policy learned by PTAD can adapt and distinguish between normal and abnormal behaviors, whether the source and target datasets are the same or different. The behavior policy is ε-greedy.
  • 7. PTAD formulation. State: the state has two parts, the sequence of previous actions $S_a = (a_{t-m+1}, a_{t-m+2}, \dots, a_t)$ and the current time series window $S_t = (s_{t-m+1}, s_{t-m+2}, \dots, s_t)$. The state space S is infinite because real time series vary in many ways. Action: $A = \{0, 1\}$, where 0 means normal behavior and 1 means an anomaly is detected.
  • 8. PTAD formulation. Reward: R(s,a) = A if the action is a true positive (TP), B if it is a false positive (FP), C if it is a false negative (FN), and D if it is a true negative (TN). The reward follows a confusion matrix in which positive means anomaly and negative means normal behavior. The values of A to D can be adjusted as needed.
  • 9. PTAD formulation. Policy: A deterministic policy is a value-based detector that gives the action to take in the current state. A stochastic policy gives the probability of each action in the current state, so the criterion for choosing an action can be changed.
  • 10. PTAD formulation. Environment: a time series repository containing a large population of labeled time series data. The environment can generate specific states for training the agent and check the quality of its actions. The agent can simulate how the anomaly detector will operate and perform optimization. Input: the current time stamps and the previous decisions. Output: a new decision for the next time stamp.
  • 11. Asynchronous advantage actor-critic algorithm (A3C): Multiple agents are used rather than a single one, each with its own network parameters and its own copy of the environment. Each agent is controlled by a global network and contributes to the overall learning. Actor-critic combines value-iteration and policy-gradient methods: it predicts both the value function and the optimal policy function. The learning agent uses the output of the value function (critic) to update the policy function (actor).
  • 12. PTAD construction: The A3C algorithm is used to construct PTAD, which decreases correlations between successive examples. There are n independent environments, each containing the whole labeled time series data in a different order. Each environment provides time stamps of distinct time series as states and changes after an action has been taken. PTAD has one global network and n local networks, one per environment. All local networks use the actor-critic framework. Every agent starts from a different initial environment, which improves anomaly detection performance because the agents learn from different situations. This avoids overfitting to abnormal patterns. The global network accumulates the gradients from the workers and optimizes the policy.
  • 14. Experiment: PTAD is trained on a multi-core CPU with 8 threads and no GPU. Each local network sends its gradients to the global network every 5 steps, and the learning rates of the actor network and critic network are 0.001 and 0.0001, respectively. The total number of training episodes is 20000. The reward function parameters are set to A = C = 5 and B = D = 1.
  • 16. Advantage: PTAD achieves the best performance both when the source and target datasets are the same and when they differ. Its stochastic policy slightly improves detection performance and can explore the tradeoff between precision and recall to meet practical requirements.
  • 17. My thoughts: Using an RNN confusion matrix to calculate the RL reward function is interesting. The authors did not compare their results with autoencoder-based techniques.
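
The state, action, reward, and stochastic-policy pieces from slides 7 to 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the reward magnitudes come from slide 14 (A = C = 5, B = D = 1), but the signs (rewarding TP/TN, penalizing FP/FN) are an assumption, as are the zero-padded initial window, the names `run_episode`, `decide`, and `toy_policy`, and the hand-written policy that stands in for the learned actor network.

```python
from collections import deque

# Magnitudes from slide 14 (A = C = 5, B = D = 1). The signs are an ASSUMPTION:
# correct decisions earn a reward, mistakes earn a penalty.
REWARDS = {"TP": 5.0, "FP": -1.0, "FN": -5.0, "TN": 1.0}


def outcome(action, label):
    """Confusion-matrix cell for one decision (1 = anomaly, 0 = normal)."""
    if action == 1:
        return "TP" if label == 1 else "FP"
    return "FN" if label == 1 else "TN"


def reward(action, label):
    """R(s, a) looked up from the confusion-matrix cell."""
    return REWARDS[outcome(action, label)]


def decide(p_anomaly, threshold=0.5):
    """Turn a stochastic policy's anomaly probability into an action."""
    return 1 if p_anomaly >= threshold else 0


def run_episode(series, labels, policy, m=3, threshold=0.5):
    """Walk a labeled series; the state is (last m values, last m actions)."""
    values = deque([0.0] * m, maxlen=m)   # assumed zero padding at the start
    actions = deque([0] * m, maxlen=m)
    taken, total = [], 0.0
    for x, y in zip(series, labels):
        values.append(x)
        state = (tuple(values), tuple(actions))
        a = decide(policy(state), threshold)
        total += reward(a, y)
        actions.append(a)                 # the decision becomes part of the next state
        taken.append(a)
    return taken, total


def toy_policy(state):
    """Hypothetical stand-in for the learned actor: flags large jumps."""
    values, _previous_actions = state
    return 1.0 if abs(values[-1] - values[-2]) > 2.0 else 0.0


if __name__ == "__main__":
    taken, total = run_episode([1.0, 1.1, 5.0, 1.0], [0, 0, 1, 0], toy_policy)
    print(taken, total)
```

Because the policy returns a probability, lowering `threshold` in `decide` flags more points as anomalies, which is one way to explore the precision/recall tradeoff mentioned on slide 16.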