● Monitor spec, detector model
● Trace collection: run missions, log detectors
● NCC estimation (sketched below):
  – Vary #traces
  – Vary independence assumptions
  – Compare to BBC on full data
● Metrics: accuracy, computation time
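A minimal sketch of this evaluation loop in Python (the names ncc_estimate and bbc_estimate are hypothetical stand-ins for the assistant's and the baseline's estimators, not an actual API):

    import random
    import time

    def run_evaluation(detector_traces, monitor_traces, formula, assumptions, trace_counts):
        """Compare NCC on subsampled detector traces against BBC on the full monitor data."""
        baseline = bbc_estimate(monitor_traces)  # data-driven estimate on the full data
        results = []
        for n in trace_counts:  # vary #traces
            sample = random.sample(detector_traces, n)
            start = time.perf_counter()
            ncc = ncc_estimate(formula, sample, assumptions)  # assumptions vary per experiment
            elapsed = time.perf_counter() - start
            results.append({"n": n, "abs_error": abs(ncc - baseline), "seconds": elapsed})
        return results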
Results: accuracy vs. #traces [plot]
Results: accuracy vs. assumptions [plot]
Conclusions
● NCC provides accurate error rate estimates with modest trace data
● Accuracy depends on independence assumptions
● Computation time scales linearly with formula size
● Mixed approach is promising for runtime verification
Compositional Probabilistic Analysis
of Temporal Properties
over Stochastic Detectors
Ivan Ruchkin, Oleg Sokolsky, Jim Weimer,
Tushar Hedaoo, Insup Lee
PRECISE Center
Department of Computer and Information Science
University of Pennsylvania
The International Conference on Embedded Software (EMSOFT)
September 22, 2020
Motivating property
“The UUV should eventually re-discover the pipeline after avoiding an obstacle.”
[Diagram: detectors, a temporal relation, and the run-time monitor]
– What are the error rates of this property’s monitor given uncertain, unreliable detectors?
Motivating property
– What are the error rates of this property’s monitor given uncertain, unreliable detectors?

Data-driven
● Develop a monitor
● Sample & label monitor outputs
● Empirically estimate its error rate
Pro: accurate
Con: expensive data, effortful development

Model-based
● Specify a monitor
● Model the system
● Theoretically estimate the error rate
Pro: no data needed
Con: inaccurate, poor scalability

Mixed
● Specify a monitor
● Sample & label detector outputs
● Estimate the error rate using the spec and detector data
Pro: cheap data, accurate, scalable, low effort
Our detector model
[Diagram: the front and side sonar inputs feed the obstacle and pipeline detectors; each detector produces a detection output that is compared against its ground truth (obstacle / pipe), yielding a per-detector error; the detection outputs feed the property monitor, whose violation verdict has its own error. The question: P(monitor error) = ?]
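One way to make the question precise (an illustrative reading, not necessarily the paper's exact definitions): write (DO, GT) for the monitor's detection outcome and ground truth, with T meaning a violation is reported / has actually occurred. Then the error rates of interest are

\[
  \Pr(\text{false alarm}) = \Pr(DO = T \wedge GT = F),
  \qquad
  \Pr(\text{missed violation}) = \Pr(DO = F \wedge GT = T),
\]

with the unknown verdict U handled by the detector model introduced later in the deck.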
Three-valued LTL for detectors
● Syntax of LTL3d
● Semantics is given by detector compositions
● “The UUV should re-discover the pipeline within d seconds after losing it”
● Monitor for this property
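One plausible LTL-style rendering of this property (an illustrative assumption; the paper's exact LTL3d formula may differ), with pipe standing for the pipeline detector's verdict:

\[
  \varphi \;=\; \mathbf{G}\bigl(\neg\, \mathit{pipe} \;\rightarrow\; \mathbf{F}_{\le d}\, \mathit{pipe}\bigr)
\]

i.e., globally, whenever the pipeline is not detected, it must be detected again within d seconds.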
Evaluation of NCC estimates
● How accurate are they?
● How are they affected by the quantity of trace data?
● How are they affected by the independence assumptions?
  – Sensitive to the accuracy of the assumptions
● What are their computational costs?
  – 0-10 seconds, depending on formula size
Evaluation setup
● Setup: UUV Gazebo sim, randomized missions
● Goal: estimate error rates of two monitors
● 73 missions (total 7.7 hrs of sim time), each:
  – Two pairs of traces: (DO, GT) for pipe det & monitor
https://github.com/uuvsimulator/uuv_simulator
Evaluation variables
● Independent:
  – Mission configuration
  – Monitor formula; deadline d for pipe loss (sec)
  – Detector/monitor traces, for NCC/BBC resp.
● Dependent (error rate estimates):
  – ECC: the true value based on exact probabilities
  – NCC: our approach on cheap (detector) data
  – BBC: data-driven on expensive (monitor) data
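Schematically, and under the simplifying assumption that BBC is a plain empirical frequency, the three estimates relate as follows (illustration only):

\[
  \hat{p}_{\mathrm{BBC}} = \frac{\#\{\text{monitor errors in the monitor traces}\}}{\#\{\text{monitor verdicts}\}},
  \qquad
  \hat{p}_{\mathrm{NCC}} = f_{\varphi}\bigl(\widehat{\Pr}(\text{detector events})\bigr),
  \qquad
  p_{\mathrm{ECC}} = f_{\varphi}\bigl(\Pr(\text{detector events})\bigr),
\]

where f_φ is the compositional error-rate expression derived from the monitor formula and the independence assumptions: NCC plugs in detector-event probabilities estimated from the detector traces, while ECC uses the exact probabilities.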
[Plot: ECC estimate (true, exact value) vs. NCC estimate (ours, detector traces) vs. BBC estimate (baseline, monitor traces)]
Interpretation: given enough data, the estimates are close
[Plot: NCC estimate (ours, detector traces) vs. BBC estimate (baseline, monitor traces)]
Interpretation: less data favors NCC; more data favors BBC
Summary
● Logical composition of detectors with LTL3d
● Probabilistic estimation of error rates
  – Rule-based computational assistant
● Accuracy on par with data-driven estimates
  – While using cheaper data & scalable analysis
  – Preferred when little or no data is available
[Diagram: the monitor spec, detector model, and detector data feed the computational assistant, which outputs the monitor’s error rate]
References
● The computational assistant and UUV case study data: https://github.com/bisc/prob-comp-asst
● The original paper: https://dx.doi.org/10.1109/TCAD.2020.3012643
● Supplementary materials: https://www.researchgate.net/publication/342993188_Supplementary_Materials_for_Compositional_Probabilistic_Analysis_of_Temporal_Properties_over_Stochastic_Detectors
Our framework at a glance
● Inputs:
  – Set of detectors with event/error probabilities
  – Logical property over detectors
  – Detector independence assumptions
  – Labeled traces from detectors
● Method: a modeling & analysis framework based on
  – Logical composition model
  – Algebraic probability calculations
  – Axiomatic independence inference
  – Probability estimation from data
● Output: estimate of an error rate for the property monitor (interface sketched below)
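A hypothetical top-level interface matching these inputs and output (the names below, e.g. estimate_monitor_error_rate, are illustrative assumptions and not the prob-comp-asst repository's actual API):

    from dataclasses import dataclass
    from typing import Dict, List, Tuple

    @dataclass
    class Detector:
        name: str
        event_probs: Dict[str, float]  # e.g. {"dot": 0.6, "dof": 0.3, "dou": 0.1, "gtt": 0.55, "gtf": 0.45}

    def estimate_monitor_error_rate(
        detectors: List[Detector],            # set of detectors with event/error probabilities
        formula: str,                         # logical property over detectors (LTL3d-like)
        independence: List[Tuple[str, str]],  # detector independence assumptions
        traces: Dict[str, list],              # labeled (DO, GT) traces per detector
    ) -> float:
        """Contract sketch: compose the detectors per `formula`, use `independence` to
        decide which probabilities factorize, estimate the rest from `traces`, and
        return an estimate of the property monitor's error rate."""
        raise NotImplementedError("illustrative signature only")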
Detector model as variables
● Given: mutually exclusive H1 and H0
● Atomic detector D is a pair (DO, GT) of r.v.s:
  – Detection outcome DO ∈ {T, F, U}
  – Ground truth GT ∈ {T, F}
● A probability space with marginal events:
  dot(D) = (T, *)   gtt(D) = (*, T)
  dof(D) = (F, *)   gtf(D) = (*, F)
  dou(D) = (U, *)
Detector model as probability space
● Given: mutually exclusive H1 and H0
● Stochastic detector D – a triple (Ω, 𝓕, Pr):
  – Ω: set of six possible outcomes for a pair of variables:
    Detection Outcome (DO) ∈ {T, F, U}
    Ground Truth (GT) ∈ {T, F}
  – 𝓕: sigma-algebra of events over Ω (i.e., the powerset of Ω)
    dot = (T, *), dof = (F, *), dou = (U, *)
    gtt = (*, T), gtf = (*, F)
  – Pr: 𝓕 → [0, 1], a probability measure over 𝓕

The six outcomes and the marginal events, laid out by (GT, DO):

            DO = T   DO = F   DO = U
  GT = T      o1       o2       o3     (row event: gtt)
  GT = F      o4       o5       o6     (row event: gtf)
             (dot)    (dof)    (dou)   (column events)
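A minimal sketch of this probability space in Python (the concrete numbers and the reading of false alarms below are illustrative assumptions, not values from the slides):

    # A stochastic detector as a distribution over the six outcomes (DO, GT),
    # with DO in {"T", "F", "U"} and GT in {"T", "F"}; probabilities sum to 1.
    pipe_detector = {
        ("T", "T"): 0.50, ("F", "T"): 0.05, ("U", "T"): 0.05,  # o1, o2, o3
        ("T", "F"): 0.04, ("F", "F"): 0.30, ("U", "F"): 0.06,  # o4, o5, o6
    }

    def pr(detector, event):
        """Probability of a marginal event such as dot = (T, *) or gtf = (*, F)."""
        do, gt = event
        return sum(p for (d, g), p in detector.items()
                   if do in ("*", d) and gt in ("*", g))

    p_dot = pr(pipe_detector, ("T", "*"))  # detection outcome is T
    p_gtt = pr(pipe_detector, ("*", "T"))  # ground truth is T
    # One illustrative error reading: false alarm = (T, F), missed detection = (F, T).
    p_false_alarm = pipe_detector[("T", "F")]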
Composition of detectors (general)
● Operand detectors D1, D2, …, DN
  – A value pair (DO, GT) for each
● Composition operator op
  – 3-value version op3
  – 2-value version op2
● Composite detector is (DO, GT) such that
  – DO = op3(DO1, …, DON)
  – GT = op2(GT1, …, GTN)
Composition of detectors (ours)
● Operand detectors D1, D2, …, DN
  – A trace of value pairs (DO, GT) for each
● LTL-like formula φ
  – 3-value semantics [[[φ]]]
  – 2-value semantics [[φ]]
● Composite detector is (DO, GT) such that (sketched below)
  – DO = [[[φ(DO1, …, DON)]]]
  – GT = [[φ(GT1, …, GTN)]]
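A small sketch of one temporal operator evaluated in both semantics over traces, and of the resulting composite (DO, GT) pair (the Kleene-style three-valued reading of the bounded eventually is an illustrative assumption, not the paper's full LTL3d semantics):

    def gt_eventually_within(gt_trace, i, d):
        # Two-valued [[F<=d p]] at step i over a Boolean ground-truth trace:
        # true iff p holds at some step in the window [i, i+d].
        return any(gt_trace[i:i + d + 1])

    def do_eventually_within(do_trace, i, d):
        # Three-valued [[[F<=d p]]] at step i over a {"T", "F", "U"} output trace:
        #   "T" if some step in the window reports T,
        #   "F" if the whole window is available and every step reports F,
        #   "U" otherwise (unknown verdicts, or the window runs past the trace end).
        window = do_trace[i:i + d + 1]
        if "T" in window:
            return "T"
        if len(window) == d + 1 and all(v == "F" for v in window):
            return "F"
        return "U"

    # Composite detector at step 0: DO from the 3-valued semantics over detector
    # outputs, GT from the 2-valued semantics over the ground truth.
    do_trace = ["F", "U", "F", "T", "F"]
    gt_trace = [False, False, True, True, False]
    composite = (do_eventually_within(do_trace, 0, 3), gt_eventually_within(gt_trace, 0, 3))
    # composite == ("T", True)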
Reasoning rules
● 𝓡log – tautologies of LTL3d
● 𝓡ev – tautologies of event predicates
● 𝓡prob – algebraic probability rules
● 𝓡indep – conditional independence rules
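Illustrative instances of each rule family (chosen for exposition; the assistant's actual rule set may differ):

\begin{align*}
  \mathcal{R}_{\mathrm{log}}:\quad & \varphi \wedge \psi \;\equiv\; \psi \wedge \varphi \\
  \mathcal{R}_{\mathrm{ev}}:\quad & \mathrm{dot}(D) \cup \mathrm{dof}(D) \cup \mathrm{dou}(D) \;=\; \Omega \\
  \mathcal{R}_{\mathrm{prob}}:\quad & \Pr(A \cup B) \;=\; \Pr(A) + \Pr(B) - \Pr(A \cap B) \\
  \mathcal{R}_{\mathrm{indep}}:\quad & A \perp B \mid C \;\Rightarrow\; \Pr(A \cap B \mid C) \;=\; \Pr(A \mid C)\,\Pr(B \mid C)
\end{align*}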
Related work: combining detectors
● Ensembles and cross-validation
  – Good performance, weak guarantees
● Hypothesis tests
  – Either too simple or too complex
● Logics of probability (probability operator)
  – No need for explicit statements about probability
● Probabilistic logics (probability of formulas)
  – No need for uncertain statements
Related work: predicting error rates
● Model-free
  – E.g., estimation from data
  – Requires extensive data collection & labeling
  – Difficulty in estimating rare cases
● Model-based
  – E.g., model checking probabilistic automata
  – Requires comprehensive (hence often inaccurate) modeling
  – Limited scalability