Bayesian Autoencoders for anomaly detection in industrial environments :
Formulation & design, uncertainty quantification, and explainability
Bang Xiang Yong
outline
1. Introduction & Background
• What is anomaly detection
• What is Autoencoder
• Problems with Autoencoder
2. Contributions
• Probabilistic formulation and design of Bayesian Autoencoder
• Uncertainty quantification in anomaly detection
• Sensor explainability
3. Conclusion
introduction
1. Anomaly / Outlier / Novelty / Out of distribution (OOD) detection, One-class classification
• Abundance of “healthy” or inlier data
• Want to detect data arising from “non-healthy” / anomalous data
2. Easy to determine outliers in low dimensions (e.g. multivariate Gaussian, ± 2 std. dev.)
• But not so in high-dimensional data (e.g. hundreds / thousands of dimensions)
Each data point / snapshot in time contains:
• K sensors
• D measurements
Total = K × D features (e.g. 11 × 2000)
(Figure legend: blue / green = healthy samples, i.e. inliers / in-distribution; orange / red = anomalies / OOD; axes show K sensors × D measurements.)
Quality prediction (STRATH), on the Advanced Forming Research Centre (AFRC) Radial Forge (£2.3m):
• Inputs (measured during radial forging processes): 98 sensors × 5620 features
• Outputs: measurement dimensions of a forged part (within / out of tolerance)
Condition monitoring (ZEMA):
• Inputs: 11 sensors × 2000 features; Outputs: degradation (0-100%), <25% treated as healthy
• Inputs: 17 sensors × (60-6000) features; Outputs: fault diagnosis of subsystems (accumulator, cooler, valve, pump)
background
1. Neural networks are an attractive option:
• Flexible and scalable
• Able to handle any data type (image, text, audio, tabular, graphs, etc.)
• Rapid advancement in specialised hardware and software
2. Unsupervised learning is preferred over supervised learning
• Lack of labelled data
• Data imbalance
3. The autoencoder is an example of an unsupervised neural network
• Requires only data labelled as inliers for training
What is an Autoencoder?
(Diagram: measurements from a process pass through an Encoder, a bottleneck layer (compression), and a Decoder to produce reconstructed measurements.)
• Training distribution: well-reconstructed signal
• Anomalous distribution: poorly reconstructed signal
• Reconstruction error = ‖x − x̂‖² (squared difference between the measurements x and their reconstruction x̂)
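To make the picture concrete, here is a minimal sketch of such an autoencoder in PyTorch; the layer sizes, the training loop, and the placeholder data are illustrative assumptions, not the architecture used in this work.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Minimal fully-connected autoencoder: encoder -> bottleneck -> decoder."""
    def __init__(self, input_dim=2000, latent_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 100), nn.ReLU(),
            nn.Linear(100, latent_dim),          # bottleneck (compression)
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 100), nn.ReLU(),
            nn.Linear(100, input_dim),           # reconstructed measurements
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Train on inlier ("healthy") data only, minimising the reconstruction error.
model = Autoencoder()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
x_healthy = torch.randn(64, 2000)                # placeholder batch of healthy signals

for epoch in range(10):
    x_hat = model(x_healthy)
    loss = ((x_healthy - x_hat) ** 2).mean()     # reconstruction error (MSE)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

# At test time, a high per-sample reconstruction error flags a potential anomaly.
with torch.no_grad():
    recon_error = ((x_healthy - model(x_healthy)) ** 2).mean(dim=1)
```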
Several trust issues with AEs in high-stakes applications:
1. Lack of a sound theoretical ground for analysis
• Overreliance on analogies
• Why should anomalies have a larger reconstruction error?
• Is there a framework which explains what the AE is doing?
• Do we need a bottleneck?
2. Lack of uncertainty quantification
• Given a prediction, how uncertain is it?
3. Lack of explainability
• Given a prediction, which sensors are relevant?
Contribution 1: Formulation and design of BAE for anomaly detection
Training:
1. Likelihood (reconstruction loss):
• Isotropic Gaussian of variance 1 => mean squared error
• Bernoulli likelihood => binary cross-entropy (*recommended in the PyTorch documentation for AE loss)
2. Prior (regulariser):
• Isotropic Gaussian distribution => L2 regularisation
3. Sample from the posterior over parameters (Bayes' rule: p(θ | x) ∝ p(x | θ) p(θ)), which is intractable, via:
• Bayesian ensembling
• MC Dropout
• Bayes by backprop (variational inference)
• Markov Chain Monte Carlo (MCMC)
4. Output of the training procedure:
• M posterior samples of AE parameters {θ₁, …, θ_M}
Prediction:
• Gaussian log-likelihood of a new point x*: log p(x* | θ) = −½ ‖x* − x̂*‖² + const (for variance 1)
• Predictive density of new data ≈ posterior mean log-likelihood, (1/M) Σᵢ log p(x* | θᵢ), with the posterior variance of the log-likelihood as its spread
• BAE is a parametric density estimation model
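As an illustration of the simplest sampling scheme above (Bayesian ensembling), the sketch below trains a small ensemble of autoencoders (reusing the hypothetical Autoencoder class from the earlier sketch) with an L2-regularised Gaussian likelihood, then summarises a new sample by the posterior mean and variance of its negative log-likelihood. The ensemble size and hyperparameters are placeholders, not the exact settings used in the thesis.

```python
import torch

# Reusing the Autoencoder class from the previous sketch.
M = 5  # number of posterior samples (ensemble members)
ensemble = [Autoencoder(input_dim=2000, latent_dim=20) for _ in range(M)]

def train_member(model, x_train, epochs=10, weight_decay=1e-4):
    # weight_decay = L2 regularisation, i.e. the isotropic Gaussian prior.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=weight_decay)
    for _ in range(epochs):
        loss = ((x_train - model(x_train)) ** 2).mean()  # Gaussian (variance 1) likelihood
        opt.zero_grad()
        loss.backward()
        opt.step()

x_train = torch.randn(256, 2000)     # placeholder healthy training data
for member in ensemble:
    train_member(member, x_train)

def bae_predict(ensemble, x):
    """Posterior mean and variance of the per-sample negative log-likelihood."""
    with torch.no_grad():
        # NLL under an isotropic Gaussian with variance 1 (up to constants) = MSE.
        nll = torch.stack([((x - m(x)) ** 2).mean(dim=1) for m in ensemble])
    return nll.mean(dim=0), nll.var(dim=0)   # anomaly score, epistemic uncertainty

score, uncertainty = bae_predict(ensemble, torch.randn(10, 2000))
```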
Evaluation
(1) AUROC: > 0.5 is better than random guessing
(2) GSS: geometric mean of sensitivity and specificity
(Plots: blue = inliers, orange = anomalies)
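Both metrics can be computed directly from the anomaly scores, for example with scikit-learn; the toy labels, scores, and the threshold sweep used for the GSS below are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# y_true: 0 = inlier, 1 = anomaly; scores: higher = more anomalous (e.g. mean NLL).
y_true = np.array([0, 0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.2, 0.15, 0.3, 0.8, 0.7, 0.9])

auroc = roc_auc_score(y_true, scores)            # > 0.5 means better than random

# GSS: geometric mean of sensitivity (TPR) and specificity (1 - FPR),
# taken here at the best threshold along the ROC curve.
fpr, tpr, _ = roc_curve(y_true, scores)
gss = np.max(np.sqrt(tpr * (1 - fpr)))

print(f"AUROC = {auroc:.2f}, GSS = {gss:.2f}")
```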
Recent papers (e.g. Nalisnick et al., "Do Deep Generative Models Know What They Don't Know?", ICLR 2019)
showed that the likelihood of autoencoders may not work for OOD detection (anomaly detection).
Example: FashionMNIST (in-distribution) vs MNIST (anomaly).
B. X. Yong, T. Pearce and A. Brintrup, "Bayesian Autoencoders: Analysing and Fixing the Bernoulli likelihood for
Out-of-Distribution Detection," ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning.
1. Surprising, since the reconstructed outputs look very different from the inputs for anomalies. Is the culprit the choice of likelihood?
2. We find this is due to confoundedness: the Bernoulli likelihood assigns higher maximum values to inputs close to 0 (MNIST).
3. We propose two fixes:
i) Use the uncertainty of the log-likelihood
ii) Use another likelihood (e.g. Gaussian)
does the AE need a bottleneck?
Most prior work describes the need for a bottleneck in AEs only by analogy:
• and did not compare against AEs without a bottleneck.
1. Autoencoders have the objective of dimensionality reduction and do not target anomaly detection directly.
2. The main difficulty of applying autoencoders for anomaly detection is given in choosing the right degree of compression, i.e. dimensionality reduction.
3. If there was no compression, an autoencoder would just learn the identity function.
Ruff et al. "Deep One-Class Classification". ICML (2018)
"This (bottleneck) ensures that only useful features are learned by the autoencoder, instead of merely copying the input data for reconstructing the output."
Chow et al. "Anomaly detection of defects on concrete structures with the convolutional autoencoder". Advanced Engineering Informatics 45 (2020)
"The identity function seems a particularly trivial function to be trying to learn; but by placing constraints on the network..." – http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/
stop strangling the AE!
(Diagram: Encoder → bottleneck layer (undercomplete) → Decoder, versus Encoder → wide latent layer (overcomplete) with skip connections → Decoder.)
Two ways to eliminate the bottleneck:
• Increase the latent dimensions > input dimensions (overcomplete)
• Introduce skip connections (à la U-Net)
Viewing the AE as a density estimation model:
• Will it benefit from higher capacity & a better architecture? (See the sketch below.)
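A sketch of both modifications in PyTorch, an overcomplete latent layer plus a U-Net-style skip connection; the layer widths are placeholder choices rather than the architectures evaluated in the thesis.

```python
import torch
import torch.nn as nn

class SkipAutoencoder(nn.Module):
    """Autoencoder with an overcomplete latent layer and a skip connection (no bottleneck)."""
    def __init__(self, input_dim=2000, latent_dim=4000):  # latent > input: overcomplete
        super().__init__()
        self.enc1 = nn.Linear(input_dim, 1000)
        self.enc2 = nn.Linear(1000, latent_dim)
        self.dec1 = nn.Linear(latent_dim, 1000)
        self.dec2 = nn.Linear(1000, input_dim)
        self.act = nn.ReLU()

    def forward(self, x):
        h1 = self.act(self.enc1(x))
        z = self.act(self.enc2(h1))
        d1 = self.act(self.dec1(z) + h1)   # skip connection (à la U-Net)
        return self.dec2(d1)

x = torch.randn(8, 2000)
x_hat = SkipAutoencoder()(x)   # the same training / NLL machinery as before applies
```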
Visualisation on synthetic toy data
(Brighter region = higher likelihood)
Takeaway:
• The identity function was not learnt, despite not having a bottleneck
Encoder: 2-50-50-50-z, with z = 1 or z = 100
Another visualization example :
• 1D density estimation
• Infinitely wide AE with Gaussian Process
(either NNGP or NTK method)
Takeaways:
1. AUROC >> 0.5 with/without a bottleneck
• The identity function is not learnt!
2. Removing the bottleneck may improve performance
• Allows a wider architecture search
application to condition monitoring & quality inspection
ZEMA tasks
STRATH tasks
Contribution 2: Uncertainty in anomaly detection
Although we have used the BAE for predicting anomalies:
• We still lack a way to express uncertainty as a form of predictive confidence
Input vector → BAE → anomaly score (is it an inlier or an anomaly?) + anomaly uncertainty (can we trust the prediction? does the BAE know?)
What can we do with uncertainty?
1. Uncertainty as an indicator of predictive error -> when ground truth is not available
2. Filter away predictions of high uncertainty -> retain high-quality, accurate predictions
3. Handle uncertain predictions as exceptions -> refer them for further inspection (e.g. manually by a human / equipment)
Evaluate as anomaly detection + reject option:
(1) classify as inlier / (2) classify as anomaly / (3) reject i.e. don’t know
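A minimal sketch of this reject option, assuming each test sample comes with an anomaly score (e.g. the posterior mean NLL) and an uncertainty (e.g. the posterior variance of the NLL); the 40% rejection rate and the score threshold below are illustrative.

```python
import numpy as np

def predict_with_reject(scores, uncertainties, score_threshold, reject_frac=0.4):
    """Return labels: 0 = inlier, 1 = anomaly, -1 = rejected ("don't know")."""
    labels = (scores > score_threshold).astype(int)
    # Reject the fraction of predictions with the highest uncertainty.
    cutoff = np.quantile(uncertainties, 1 - reject_frac)
    labels[uncertainties >= cutoff] = -1
    return labels

scores = np.array([0.2, 0.3, 0.9, 0.8, 0.4, 1.2])
uncertainties = np.array([0.01, 0.02, 0.50, 0.03, 0.40, 0.02])
print(predict_with_reject(scores, uncertainties, score_threshold=0.5))
# Accuracy / AUROC is then evaluated only on the retained (non-rejected) predictions.
```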
(Histograms of anomaly scores, blue = inliers, orange = anomalies: AUROC = 0.93 before rejection; after rejecting the 40% of predictions with the highest uncertainties, AUROC = 1.0.)
Proposed probabilistic workflow to quantify the total anomaly uncertainty with the BAE, yielding an anomaly probability and an anomaly uncertainty for each prediction.
Visualisation on toy data
Result on the ZEMA dataset (condition monitoring):
1. Uncertainty indicates predictive error.
2. Rejecting uncertain predictions leads to higher accuracy.
Takeaways:
• Not all uncertainties are the same!
• U-exceed is prone to overconfidence.
• Combining epistemic and aleatoric uncertainties is better than having either one of them alone.
Contribution 3: Explainable predictions for BAE
Given a BAE prediction of OOD, can we tell which sensors are important?
1. Formulation of sensor attribution methods for the BAE.
• No post-hoc explanation models are needed; the attributions are naturally available from the BAE.
• Under the independent likelihood assumption, we can decompose the BAE predictions (both mean and variance) into importance scores for the sensor inputs, as sketched below.
Bang Xiang Yong, Alexandra Brintrup. "Coalitional Bayesian Autoencoders: Towards explainable unsupervised deep learning." Submitted to Applied Soft Computing, 2021.
Example for condition monitoring
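Since an independent likelihood makes the sample-level NLL a sum over features, the per-sensor importance scores fall out by regrouping that sum; below is a sketch of the decomposition (the tensor shapes follow the K sensors × D features convention from earlier, and the values are placeholders).

```python
import numpy as np

# Per-feature NLL from M posterior samples of the BAE for one test sample,
# reshaped to (M, K sensors, D features per sensor).
M, K, D = 5, 11, 2000
feature_nll = np.random.rand(M, K, D)     # placeholder for -log p(x_kd | theta_m)

# Independent likelihood => the NLL decomposes additively over features/sensors.
sensor_nll = feature_nll.sum(axis=2)      # shape (M, K): NLL contribution per sensor

# Sensor attribution: posterior mean (and variance) of each sensor's contribution.
sensor_importance_mean = sensor_nll.mean(axis=0)   # which sensors drive the OOD score
sensor_importance_var = sensor_nll.var(axis=0)     # how uncertain that attribution is

top_sensors = np.argsort(sensor_importance_mean)[::-1][:3]
print("Most anomalous sensors:", top_sensors)
```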
Contribution 3: Explainable predictions for BAE
2. Development of the "Coalitional BAE" to improve explanation quality.
• Centralised BAE: one BAE for all sensors.
• Coalitional BAE: one BAE for each sensor (enforces sensor independence in the outputs).
(Diagram: M samples of the BAE over K sensors with D features per sensor; assuming an independent likelihood, per-sensor NLL scores are produced by both the Centralised and the Coalitional BAE. A sketch follows below.)
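A sketch contrasting the two designs, reusing the hypothetical Autoencoder, train_member and bae_predict helpers from the earlier sketches: the Coalitional BAE simply fits a separate ensemble per sensor, so each sensor's NLL score cannot be contaminated by the others.

```python
import torch

K, D = 11, 2000                       # K sensors, D features per sensor
x_train = torch.randn(256, K, D)      # placeholder healthy training data

# Centralised BAE: one ensemble over the concatenated K*D features.
centralised = [Autoencoder(input_dim=K * D, latent_dim=100) for _ in range(5)]
for m in centralised:
    train_member(m, x_train.reshape(256, K * D))

# Coalitional BAE: one ensemble per sensor, trained only on that sensor's features.
coalitional = {
    k: [Autoencoder(input_dim=D, latent_dim=100) for _ in range(5)] for k in range(K)
}
for k, ensemble_k in coalitional.items():
    for m in ensemble_k:
        train_member(m, x_train[:, k, :])

# Per-sensor NLL scores from the coalitional BAE (independent by construction).
x_test = torch.randn(10, K, D)
sensor_scores = {k: bae_predict(ensemble_k, x_test[:, k, :])[0]
                 for k, ensemble_k in coalitional.items()}
```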
Contribution 3: Explainable predictions for BAE
3. Finding of misleading explanations due to correlation in the Centralised BAE outputs.
• e.g. non-drifting sensors are falsely explained as covariate-shifting / departing from the training distribution.
• The Coalitional BAE's explanation quality outperforms the Centralised BAE.
Top row: blue = explanation for drifting sensors, orange = explanation for non-drifting sensors; x-axis = machine degradation.
Bottom row: correlation between the explanation scores for drifting and non-drifting sensors.
baetorch: Python package for BAE
https://github.com/bangxiangyong/baetorch
Conclusion
Contributed improvements to the trustworthiness of AEs for anomaly detection:
1. Formulation & design of BAE
• Choice of likelihood: avoid the Bernoulli likelihood!
• Does the AE need a bottleneck? Stop strangling the AE!
2. Quantifying uncertainty in anomaly predictions
• The probabilistic workflow captures epistemic and aleatoric uncertainties
• Rejecting uncertain predictions leads to higher overall accuracy
• There is a need for high-quality uncertainty estimation
3. Sensor explainability
• Due to correlation in the outputs, AEs are prone to misleading explanations
• Proposed the "Coalitional BAE" as a fix
If you are working on anomaly detection / one-class classification, are interested in applying the latest developments of the BAE, or would like to use baetorch, please reach out to me!