5. Risk and the design process
• Each design step includes the refinement of the
risk analysis
• Each design solution has to be measured against
the risk analysis
• Constant design questions:
– Is the design balanced?
– Can it be made?
– Can it be done simpler?
9. An example…
• Landing gear not extending when commanded,
without error indication
• Spontaneous, irreversible landing gear
extension while travelling overseas
10. Top-down vs. Bottom-Up analysis
• Bottom-up: structured
brainstorm about
everything that could
happen given a specific
scope
• Top-down: think about
your biggest fears first,
then find out what could
cause them.
11. FME(C)A: bottom-up thinking
• Failure Mode and
Effect (Criticality)
Analysis
• Reasoning from
failure of the
components,
thinking about the
consequences
13. Guide words…
Look at every component and
investigate what happens if:
– It doesn’t work
– It is very slow
– Does the wrong thing
– Sends messages spontaneously
– Loses messages/state
– Leaks information
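As a rough illustration, the guide words can be crossed with a component list to produce the questions a bottom-up session has to answer. This is only a sketch: the function name `worksheet_prompts` is hypothetical, and the two components are borrowed from the FMECA table elsewhere in this deck.

```python
from itertools import product

# Guide words taken from the slide above.
GUIDE_WORDS = [
    "doesn't work",
    "is very slow",
    "does the wrong thing",
    "sends messages spontaneously",
    "loses messages/state",
    "leaks information",
]


def worksheet_prompts(components):
    """Return one (component, guide word) pair per question to be answered."""
    return list(product(components, GUIDE_WORDS))


# "Inwin" and "Process" are the functions used in the FMECA table of this deck.
prompts = worksheet_prompts(["Inwin", "Process"])
for component, guide_word in prompts:
    print(f"What happens if {component} {guide_word}?")
```

Every pair becomes one candidate row in the FMECA worksheet; the team then fills in causes, effects and criticality.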
14. Structured FMECA approach
Function | Failure mode       | Causes           | Local effects    | System effects      | Criticality
Inwin    | Wrong output       | Logical error    | Unjustified open | No closure          | Catastrophic
         | Delayed output     | PLC error        | Delayed closure  | Closure delayed     | Limited
         | No output          | Application hang | No closure       | No closure          | Catastrophic
         | Spontaneous output | Switching error  | Unjustified open | Unjustified closure | False positive
Process  | Wrong output       | Logical error    | Unjustified open | No closure          | Catastrophic
         | Delayed output     | PLC error        | Delayed closure  | Closure delayed     | Limited
         | No output          | Application hang | No closure       | No closure          | Catastrophic
         | Spontaneous output | Switching error  | Power failure    | No closure          | Catastrophic
…        | …                  | …                | …                | …                   | …
15. Certainty…
Rank beliefs not according to their
plausibility but by the harm they may
cause.
Nassim Nicholas Taleb
Slide 15, 15 June 2016
17. You can’t mitigate everything…
• You can’t prevent everything
• You can’t plan for everything
• You can’t predict everything
• If you tried, you couldn’t do any business
• But, you can’t ignore
everything either
18. Structured FMECA approach
Function | Failure mode       | Causes           | Local effects    | System effects      | Criticality    | Detection | Mitigating measures
Inwin    | Wrong output       | Logical error    | Unjustified open | No closure          | Catastrophic   | None      | Multiprogramming
         | Delayed output     | PLC error        | Delayed closure  | Closure delayed     | Limited        | None      |
         | No output          | Application hang | No closure       | No closure          | Catastrophic   | None      | Failsafe behaviour Process
         | Spontaneous output | Switching error  | Unjustified open | Unjustified closure | False positive | None      |
Process  | Wrong output       | Logical error    | Unjustified open | No closure          | Catastrophic   | None      | Multiprogramming
         | Delayed output     | PLC error        | Delayed closure  | Closure delayed     | Limited        | None      |
         | No output          | Application hang | No closure       | No closure          | Catastrophic   | None      | Deadlock detection
         | Spontaneous output | Switching error  | Power failure    | No closure          | Catastrophic   | None      | Safety relay
…        | …                  | …                | …                | …                   | …              | …         | …
New functional and
design requirements!
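One way to see how the extended worksheet feeds back into the design is to keep the rows as records and query them. The sketch below is an assumption about how such a worksheet could be held in code, not the tooling used in the presentation: `FmecaRow`, `new_requirements` and `unmitigated` are hypothetical names, and only a few rows from the table are copied in.

```python
from dataclasses import dataclass


@dataclass
class FmecaRow:
    function: str
    failure_mode: str
    cause: str
    local_effect: str
    system_effect: str
    criticality: str
    detection: str = "None"
    mitigation: str = ""  # empty string: no measure defined yet


# A subset of the rows from the table above.
rows = [
    FmecaRow("Inwin", "Wrong output", "Logical error", "Unjustified open",
             "No closure", "Catastrophic", mitigation="Multiprogramming"),
    FmecaRow("Inwin", "Delayed output", "PLC error", "Delayed closure",
             "Closure delayed", "Limited"),
    FmecaRow("Process", "No output", "Application hang", "No closure",
             "No closure", "Catastrophic", mitigation="Deadlock detection"),
    FmecaRow("Process", "Spontaneous output", "Switching error", "Power failure",
             "No closure", "Catastrophic", mitigation="Safety relay"),
]


def new_requirements(rows):
    """Distinct mitigating measures; each one is a requirement on the design."""
    return sorted({row.mitigation for row in rows if row.mitigation})


def unmitigated(rows):
    """Rows for which no mitigating measure has been chosen (yet)."""
    return [row for row in rows if not row.mitigation]


print(new_requirements(rows))
for row in unmitigated(rows):
    print(f"Open: {row.function} / {row.failure_mode} ({row.criticality})")
```

Rows returned by `unmitigated` are either accepted risks or open design questions; which of the two is a decision for the team, not for the script.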
19. Disadvantages FME(C)A
• It is impossible to calculate an overall risk
exposure
• Relation between risks is missing
– Common mode failures usually aren’t modelled
• Complex scenarios are hard to model
– Multiple failures aren’t modelled
– Are there root causes that could trigger multiple failures?
• Usually identifies irrelevant risks
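The question about shared root causes can be approached, at least roughly, by grouping worksheet rows by cause. The sketch below uses the (function, failure mode, cause) entries from the FMECA table in this deck; `shared_causes` is a hypothetical helper, and treating identically named causes as one physical root cause is an assumption that has to be checked per system.

```python
from collections import defaultdict

# (function, failure mode, cause) triples copied from the FMECA table in this deck.
rows = [
    ("Inwin", "Wrong output", "Logical error"),
    ("Inwin", "Delayed output", "PLC error"),
    ("Inwin", "No output", "Application hang"),
    ("Inwin", "Spontaneous output", "Switching error"),
    ("Process", "Wrong output", "Logical error"),
    ("Process", "Delayed output", "PLC error"),
    ("Process", "No output", "Application hang"),
    ("Process", "Spontaneous output", "Switching error"),
]


def shared_causes(rows):
    """Map each cause that appears under more than one function to those functions."""
    by_cause = defaultdict(set)
    for function, _failure_mode, cause in rows:
        by_cause[cause].add(function)
    return {cause: sorted(functions)
            for cause, functions in by_cause.items()
            if len(functions) > 1}


for cause, functions in shared_causes(rows).items():
    print(f"{cause}: affects {', '.join(functions)}")
```

A cause that shows up under several functions is a candidate common mode failure, which the row-by-row worksheet itself does not make visible.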
20. Top-Down Risk analysis
• Start with a dominant
concern
• Identify potential
causes
• Detail further
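The three steps above can be sketched as a small tree: the dominant concern at the top, potential causes below it, detailed further until basic causes remain. The `Event` class is hypothetical, and the tree is only an illustration assembled from entries of the FMECA table in this deck (the rows whose system effect is "No closure"); a real analysis would add gate logic and more levels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    name: str
    causes: List["Event"] = field(default_factory=list)

    def basic_causes(self):
        """Return the leaves of the tree: causes that are not detailed further."""
        if not self.causes:
            return [self.name]
        found = []
        for cause in self.causes:
            found.extend(cause.basic_causes())
        return found

    def outline(self, depth=0):
        """Return the tree as indented lines, top event first."""
        lines = ["  " * depth + self.name]
        for cause in self.causes:
            lines.extend(cause.outline(depth + 1))
        return lines


# Step 1: dominant concern. Step 2: potential causes. Step 3: detail further.
top = Event("No closure", [
    Event("Inwin: wrong output", [Event("Logical error")]),
    Event("Inwin: no output", [Event("Application hang")]),
    Event("Process: spontaneous output", [Event("Switching error")]),
])

print("\n".join(top.outline()))
print(top.basic_causes())
```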
25. Design decisions…
• Every design decision is accompanied by a Risk
analysis focussing on RAMS aspects
• In the end the cost, RAMS effects and other
trade-off aspects will determine which design
option will be used
33. Testing
Function   | Impact wrong/not functioning | Impact spontaneous functioning
Function 1 | Small                        | Medium
Function 2 | Disastrous                   | Huge
Function 3 | Serious                      | Huge
Function 3 | Serious                      | Small
Function 4 | Serious                      | Serious
Function 5 | Serious                      | Small
Function 6 | Huge                         | Huge
…
34. Test depth and acceptable risk
• Level A: Thorough endurance test aiming to
prove function reliability with high accuracy.
• Level B: Thorough endurance test aiming to
prove function reliability with medium
accuracy.
• Level C: Thorough endurance test aiming to
prove function reliability with low accuracy.
• Level D: Test to verify if the function works
once.
• Level E: Function tested alongside other
functions, might leave paths untested.
Test effort
Level   | #Tests | Effort
Level A | 50,000 | 120 hours
Level B | 10,000 | 24 hours
Level C | 1,000  | 4 hours
Level D | 1      | 1 hour
Level E | -      | PM
35. Test depth…
Function   | Not functioning | Spontaneous functioning
Function 1 | Level E         | NOT
Function 2 | Level A         | Level A
Function 3 | Level A         | Level A
Function 3 | Level B         | Level B
Function 4 | Level A         | Level A
Function 5 | Level E         | NOT
Function 6 | Level A         | Level A
…          | …               | …
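Combining this table with the effort per level gives a first estimate of the total test effort. The sketch below rests on two assumptions that the slides do not state: that the effort figure applies once per function and per failure type, and that "NOT" means the failure type is not tested at all. Level E is listed as "PM" in the effort table, so it is counted but not converted to hours. `planned_effort` is a hypothetical helper.

```python
# Effort per level in hours, from the test effort table. Level E is "PM": not quantified.
EFFORT_HOURS = {"A": 120, "B": 24, "C": 4, "D": 1}

# (function, level for "not functioning", level for "spontaneous functioning").
# None stands for "NOT" in the table (assumed to mean: not tested).
TEST_DEPTH = [
    ("Function 1", "E", None),
    ("Function 2", "A", "A"),
    ("Function 3", "A", "A"),
    ("Function 3", "B", "B"),
    ("Function 4", "A", "A"),
    ("Function 5", "E", None),
    ("Function 6", "A", "A"),
]


def planned_effort(test_depth):
    """Return (total quantified hours, number of unquantified Level E items)."""
    hours = 0
    unquantified = 0
    for _function, *levels in test_depth:
        for level in levels:
            if level is None:
                continue  # not tested
            if level in EFFORT_HOURS:
                hours += EFFORT_HOURS[level]
            else:
                unquantified += 1  # Level E: effort not quantified
    return hours, unquantified


hours, unquantified = planned_effort(TEST_DEPTH)
print(f"{hours} hours planned, {unquantified} Level E item(s) not quantified")
```

An estimate like this makes the trade-off visible: lowering one function from Level A to Level B saves close to 100 hours per failure type, at the price of a less accurate reliability claim.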