SERENE 2014 - 6th International Workshop on Software Engineering for Resilient Systems
http://serene.disim.univaq.it/
Session 4: Monitoring
Paper 3: Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems
1. 1. Quanopt Ltd.
Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems
Gábor Urbanics, László Gönczy, Balázs Urbán, János Hartwig, Imre Kocsis
2. 2. Quanopt Ltd.
Motivation and our contributions
Approach
Motivational example
Design time analysis
Runtime analysis
Future work and conclusion
3. 3. Quanopt Ltd.
Motivation
Analyse complex IT systems
  o During development
  o During integration
  o At runtime
  o Based on system models
Generate analysis for huge systems
Extendable
4. 4. Quanopt Ltd.
Process modelling
Business process:
  o Directly executed models (e.g. BPMN)
In complex systems there are many supporting resources
  o We present a method for modelling business processes and supporting resources together
  o Only general tools are available:
    • Markov chains, event trees
    • Too general, so modelling can be hard
  o Development tools
    • Basic performance analysis
    • Business activity monitoring
5. 5. Quanopt Ltd.
Contributions
Multi-aspect modelling of complex (IT) systems
  o Custom, general process and resource model
Qualitative error propagation analysis
  o Root cause and sensitivity analysis
  o Using a finite-domain constraint satisfaction problem
Runtime process monitoring
9. 9. Quanopt Ltd.
Motivational example
Design time analysis capabilities
  o SPOF (single point of failure) analysis
  o Process-level effects of resource faults
  o Propagating process errors to the resource layer
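The SPOF analysis listed above can be illustrated with a minimal Python sketch. The element names are borrowed from the case-study diagram, but the dependency links, the "groups of alternatives" encoding and the function names are assumptions made for illustration, not the authors' model or tool.

```python
from typing import Dict, List, Set

# Hypothetical dependency model: each element lists groups of alternatives.
# An element works if, in every group, at least one member works.
DEPENDS_ON: Dict[str, List[List[str]]] = {
    "Form processing": [["AppServ1", "AppServ2"]],   # assumed redundant pair
    "Record transaction": [["DB1", "DB2"]],          # assumed redundant pair
    "Manual laundering check": [["Compliance DB"]],  # assumed: no redundancy
    "AppServ1": [["Blade Server"]],
    "AppServ2": [["Blade Server"]],
    "DB1": [["Backend Server 1"]],
    "DB2": [["Backend Server 2"]],
    "Compliance DB": [["Backend Server 3"]],
}

def works(element: str, failed: Set[str]) -> bool:
    """An element works if it has not failed and every dependency group
    contains at least one working member."""
    if element in failed:
        return False
    return all(
        any(works(member, failed) for member in group)
        for group in DEPENDS_ON.get(element, [])
    )

def single_points_of_failure(activities: List[str]) -> Dict[str, List[str]]:
    """Fail each resource alone and record which activities stop working."""
    resources = {m for groups in DEPENDS_ON.values() for g in groups for m in g}
    spofs: Dict[str, List[str]] = {}
    for resource in sorted(resources):
        broken = [a for a in activities if not works(a, {resource})]
        if broken:
            spofs[resource] = broken
    return spofs

ACTIVITIES = ["Form processing", "Record transaction", "Manual laundering check"]
RESULT = single_points_of_failure(ACTIVITIES)
for resource, broken in RESULT.items():
    print(f"{resource}: breaks {broken}")
```

In this assumed setup the shared Blade Server defeats the redundancy of the two application servers, which is the kind of weak point a SPOF analysis is meant to expose.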
10. 10. Quanopt Ltd.
Case study
[Figure: process diagram with a Client lane in the Business Processes Layer. Activities and decision points: Form processing, Client checked earlier?, Perform full check, Large transaction?, Laundering suspected?, Manual laundering check, Flag & report, Money takeover, Pay to $, Record transaction, Receipt, Timeout. Decision branches are labelled Y and N. Legend: Activity, Execution Path.]
11. 11. Quanopt Ltd.
Process with resources
[Figure: the case-study process extended with supporting resources in three layers: Business Processes Layer, Supporting Applications Layer and Physical Resources Layer. Resource labels: Customer & Account Identification, Cashier Module, Compliance DB, DB, Application Server cluster, AppServ1, AppServ2, AppServ3 VM, AppServ4, DB1, DB2, Backend Server 1, Backend Server 2, Backend Server 3, Hypervisor, Blade Server (marked "Single"). Legend: Activity, Resource, Dependency, Execution Path.]
12. 12. Quanopt Ltd.
Single fault in physical layer
[Figure: the process-with-resources diagram for use case 1, with the Hypervisor and Blade Server marked "Single". Annotations shown on the diagram: Single Fault1, Outage1, Stuck1. Legend: annotation (e.g. Outage1) = Failure Mode followed by Use Case Id; Resource Setup Identifier; Activity; Resource; Dependency; Execution Path.]
13. 13. Quanopt Ltd.
Effects of a single fault
[Figure: the process-with-resources diagram for use case 2, with the resource setup labelled Virtualized HA Cluster and Blade Server Farm. Annotations shown on the diagram: Single Fault2, Failover2, Degraded2, Delayed2, Delay-incurred Cost2. Legend: annotation (e.g. Outage1) = Failure Mode followed by Use Case Id; Resource Setup Identifier; Activity; Resource; Dependency; Execution Path.]
14. 14. Quanopt Ltd.
Backwards error propagation
[Figure: the process-with-resources diagram for use case 3, with the resource setup labelled Virtualized HA Cluster and Blade Server Farm. Annotations shown on the diagram: SQLInjected3, OK3. Legend: annotation (e.g. Outage1) = Failure Mode followed by Use Case Id; Resource Setup Identifier; Activity; Resource; Dependency; Execution Path.]
15. 15. Quanopt Ltd.
Motivational example
Design time analysis capabilities
  o SPOF (single point of failure) analysis
  o Process-level effects of resource faults
  o Propagating process errors to the resource layer
17. 17. Quanopt Ltd.
Design time analysis
Error propagation rules
  o Through the process's execution path
  o Through dependencies
Translate the model to a constraint satisfaction problem (CSP)
The solution of the CSP provides the results of
  o Root cause analysis
  o Sensitivity analysis
[Figure: workflow diagram with the boxes Process model, Resource model, Annotation model, System model, Error Propagation Analysis and Monitoring.]
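A minimal Python sketch of how the two propagation rules can drive a root cause analysis. It replaces a real CSP solver with exhaustive enumeration, and the resources, dependency links, linear execution path and function names are all assumptions for illustration; only the element names come from the case study.

```python
from itertools import product

# Hypothetical model: names follow the case study, the links are assumed.
RESOURCES = ["AppServ1", "DB1", "Compliance DB"]
DEPENDENCIES = {                      # activity -> resources it depends on
    "Form processing": ["AppServ1"],
    "Perform full check": ["Compliance DB"],
    "Record transaction": ["DB1"],
}
EXECUTION_PATH = ["Form processing", "Perform full check", "Record transaction"]

def propagate(faulty_resources):
    """Apply both propagation rules and return the set of failed activities."""
    failed = set()
    upstream_failed = False
    for activity in EXECUTION_PATH:
        # Rule 1: errors propagate through dependencies.
        dep_failed = any(r in faulty_resources for r in DEPENDENCIES[activity])
        # Rule 2: errors propagate along the execution path.
        if dep_failed or upstream_failed:
            failed.add(activity)
            upstream_failed = True
    return failed

def root_causes(observed_failed, observed_ok):
    """Enumerate every resource fault set consistent with the observations."""
    solutions = []
    for bits in product([False, True], repeat=len(RESOURCES)):
        faulty = {r for r, bit in zip(RESOURCES, bits) if bit}
        failed = propagate(faulty)
        if observed_failed <= failed and not (observed_ok & failed):
            solutions.append(faulty)
    # Smallest fault sets first, since they are the simplest explanations.
    return sorted(solutions, key=lambda s: (len(s), sorted(s)))

CANDIDATES = root_causes(
    observed_failed={"Record transaction"},
    observed_ok={"Form processing"},
)
print(CANDIDATES)
```

Sensitivity analysis runs the same model in the other direction: fix a fault set, call `propagate`, and inspect which activities are affected.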
18. 18. Quanopt Ltd.
What is a CSP?
Constraint satisfaction problem
  o A problem defined mathematically by
    • A set of variables
    • Constraints between them
A general-purpose solver can find the solution
  o A single assignment of values to the variables, or a list of such assignments
  o All constraints are satisfied
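The definition above can be made concrete with a small Python sketch of a finite-domain CSP solved by exhaustive search. The `solve` function and the server/service example are illustrative assumptions; a production solver would use constraint propagation and backtracking rather than full enumeration.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List

Assignment = Dict[str, str]

def solve(domains: Dict[str, List[str]],
          constraints: List[Callable[[Assignment], bool]]) -> Iterator[Assignment]:
    """Yield every assignment that satisfies all constraints (exhaustive search)."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            yield assignment

# Illustrative problem: qualitative states of a server and a service on it.
DOMAINS = {
    "server": ["ok", "outage"],
    "service": ["ok", "degraded", "outage"],
}
CONSTRAINTS = [
    # A server outage forces a service outage.
    lambda a: a["server"] != "outage" or a["service"] == "outage",
    # Observation: the service is not ok.
    lambda a: a["service"] != "ok",
]
SOLUTIONS = list(solve(DOMAINS, CONSTRAINTS))
for solution in SOLUTIONS:
    print(solution)
```

Each yielded dictionary is one assignment of values to the variables; the full list is the set of all situations consistent with the constraints.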
19. 19. Quanopt Ltd.
Sample mapping to CSP
[Figure: Business Processes Layer with the activities Customer login and Form processing. Legend: Activity, Execution Path.]
(Customer_login_run)
(Form_processing_run)
20. 20. Quanopt Ltd.
Sample mapping to CSP
[Figure: Business Processes Layer with the activities Customer login and Form processing. Legend: Activity, Execution Path.]
(Customer_login_delay & Customer_login_run)
(Form_processing_delay)
22. 22. Quanopt Ltd.
Runtime process monitoring
Runtime monitoring based on the same model
Rule-based online event processing
  o Events are captured during execution
  o Each time a rule is satisfied
    • A notification can be recorded
    • Rule-specific process metrics are updated
Coverage checks
Annotation-based rule synthesis
[Figure: workflow diagram with the boxes Process model, Resource model, Annotation model, System model, Error Propagation Analysis and Monitoring.]
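A minimal Python sketch of rule-based event processing as outlined above. The event fields (`activity`, `duration_s`, `status`), the two rules and the interpretation of a coverage check as "model activities never seen in the event stream" are assumptions for illustration; the prototype's actual rule language and coverage definition are not given in the slides.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]

@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]

# Hypothetical rules over assumed event fields.
RULES = [
    Rule("slow_activity", lambda e: e["duration_s"] > 30),
    Rule("activity_failed", lambda e: e["status"] == "failed"),
]

def monitor(events: List[Event], rules: List[Rule]) -> Tuple[list, Counter]:
    """Evaluate every rule on every event; record notifications and metrics."""
    notifications = []
    metrics: Counter = Counter()
    for event in events:
        for rule in rules:
            if rule.matches(event):
                notifications.append((rule.name, event["activity"]))
                metrics[rule.name] += 1   # rule-specific process metric
    return notifications, metrics

def uncovered_activities(events: List[Event], model_activities: List[str]) -> List[str]:
    """Assumed coverage check: model activities that never produced an event."""
    seen = {event["activity"] for event in events}
    return sorted(set(model_activities) - seen)

EVENTS = [
    {"activity": "Form processing", "duration_s": 12, "status": "ok"},
    {"activity": "Perform full check", "duration_s": 45, "status": "ok"},
    {"activity": "Record transaction", "duration_s": 3, "status": "failed"},
]
NOTIFICATIONS, METRICS = monitor(EVENTS, RULES)
UNCOVERED = uncovered_activities(
    EVENTS,
    ["Form processing", "Perform full check", "Record transaction", "Receipt"],
)
print(NOTIFICATIONS, dict(METRICS), UNCOVERED)
```

In an annotation-based setup the `RULES` list would be generated from the annotated system model rather than written by hand.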
23. 23. Quanopt Ltd.
Architecture of the prototype
[Figure: architecture diagram with the following labels.]
• Process Model
• Resource Model
• Fault model
• Process Execution Log
• Diagnostic Rules
• Propagation Rules
• Tagging
• Dependability bottleneck
• Process hotspots
• Runtime diagnostic metrics
• Runtime alerts
25. 25. Quanopt Ltd.
Future work
System model and fault model "libraries"
Hierarchical modelling
Hierarchical/incremental CSP evaluation
Uncertain failure modes
Back annotation of monitoring results
  o Qualitative abstraction
Precise modelling frontend
Connection with optimisation methods
26. 26. Quanopt Ltd.
Conclusion
Design time analysis of business processes
  o With the use of a resource model
  o Root cause analysis
  o Determining weak points
Rule-based runtime diagnostics
  o Process monitoring based on event processing
  o Rule synthesis
  o Coverage test