2016-05-30 risk driven design

Risk Driven Development
J.vanEkris@Delta-Pi.nl

Reliability
Availability
Maintainability
Safety

IEC 61508: Required activities for safety related systems

Risk and the design process
• Each design step includes the refinement of the
risk analysis
• Each design solution has to be measured against
the risk analysis
• Constant design questions:
– Is the design balanced?
– Can it be made?
– Can it be done simpler?

Simplicity is
prerequisite for
reliability
Edsger W. Dijkstra

Risk management process
15 June 2016

Failure definitions
• What can go wrong
exactly?
• When do we
consider the system
to be failed?

An example…
• Landing gear does not extend when commanded, without error indication
• Spontaneous, irreversible landing gear extension while travelling overseas

Top-down vs. Bottom-Up analysis
• Bottom-up: structured
brainstorm about
everything that could
happen given a specific
scope
• Top-down: think about your biggest fears first, then find out what could cause them.

FME(C)A: bottom-up thinking
• Failure Mode and
Effect (Criticality)
Analysis
• Reasoning from
failure of the
components,
thinking about the
consequences

Risk: System does not perform trick?

Guide words…
Look at every component and investigate what happens if:
– It doesn’t work
– It is very slow
– It does the wrong thing
– It sends messages spontaneously
– It loses messages/state
– It leaks information
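The guide words can be applied mechanically: every component is paired with every guide word, and each pair becomes one question for the brainstorm. The following is a minimal sketch of that idea; the component names are hypothetical and only serve as illustration.

```python
from itertools import product

# Guide words taken from the list above.
GUIDE_WORDS = [
    "doesn't work",
    "is very slow",
    "does the wrong thing",
    "sends messages spontaneously",
    "loses messages/state",
    "leaks information",
]


def failure_mode_prompts(components):
    """Return one (component, guide word) pair per combination."""
    return list(product(components, GUIDE_WORDS))


# Hypothetical component names, for illustration only.
prompts = failure_mode_prompts(["sensor", "controller"])
for component, guide_word in prompts:
    print(f"What happens if the {component} {guide_word}?")
```

Each printed question is a candidate row in an FME(C)A worksheet; the analysis itself (causes, effects, criticality) still has to be done by people.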

Structured FMECA approach
Function | Failure Mode | Causes | Local Effects | System Effects | Criticality
Inwin | Wrong output | Logical error | Unjustified open | No closure | Catastrophic
Inwin | Delayed output | PLC error | Delayed closure | Closure delayed | Limited
Inwin | No output | Application hang | No closure | No closure | Catastrophic
Inwin | Spontaneous output | Switching error | Unjustified open | Unjustified closure | False Positive
Process | Wrong output | Logical error | Unjustified open | No closure | Catastrophic
Process | Delayed output | PLC error | Delayed closure | Closure delayed | Limited
Process | No output | Application hang | No closure | No closure | Catastrophic
Process | Spontaneous output | Switching error | Power failure | No closure | Catastrophic
… | … | … | … | … | …

Certainty…
Rank beliefs not according to their
plausibility but by the harm they may
cause.
Nassim Nicholas Taleb

Identifying measures
• Risk = Chance × Impact
• Moments at which measures can be applied:
– Prevention
– Detection
– Repression
– Correction
– Ignore
– Accept
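As a small worked example of Risk = Chance × Impact, the sketch below ranks a few risks by their exposure. All names and figures are hypothetical and chosen only to show that a rare but severe event can outrank a frequent but minor one.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    chance_per_year: float  # expected occurrences per year
    impact: float           # cost per occurrence, in any consistent unit

    @property
    def exposure(self) -> float:
        # Risk = Chance * Impact
        return self.chance_per_year * self.impact


# Hypothetical figures, for illustration only.
risks = [
    Risk("frequent minor glitch", chance_per_year=10.0, impact=100.0),
    Risk("rare catastrophic failure", chance_per_year=0.001, impact=50_000_000.0),
    Risk("occasional outage", chance_per_year=0.5, impact=20_000.0),
]

ranked = sorted(risks, key=lambda r: r.exposure, reverse=True)
for risk in ranked:
    print(f"{risk.name}: {risk.exposure:,.0f} per year")
```

A measure can then be judged by how much it lowers the chance, the impact, or both, compared with what it costs.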

You can’t mitigate everything…
• You can’t prevent everything
• You can’t plan for everything
• You can’t predict everything
• If you tried, you couldn’t do any business
• But, you can’t ignore
everything either

Structured FMECA approach
Function | Failure Mode | Causes | Local Effects | System Effects | Criticality | Detection | Mitigating Measures
Inwin | Wrong output | Logical error | Unjustified open | No closure | Catastrophic | None | Multiprogramming
Inwin | Delayed output | PLC error | Delayed closure | Closure delayed | Limited | None | -
Inwin | No output | Application hang | No closure | No closure | Catastrophic | None | Failsafe behaviour Process
Inwin | Spontaneous output | Switching error | Unjustified open | Unjustified closure | False Positive | None | -
Process | Wrong output | Logical error | Unjustified open | No closure | Catastrophic | None | Multiprogramming
Process | Delayed output | PLC error | Delayed closure | Closure delayed | Limited | None | -
Process | No output | Application hang | No closure | No closure | Catastrophic | None | Deadlock detection
Process | Spontaneous output | Switching error | Power failure | No closure | Catastrophic | None | Safety relay
… | … | … | … | … | … | … | …
New functional and
design requirements!

Disadvantages of FME(C)A
• It is impossible to calculate an overall risk
exposure
• Relation between risks is missing
– Common mode failures usually aren’t modelled
• Complex scenarios are hard to model
– Multiple failures aren’t modelled
– Are there root causes that could trigger multiple failures?
• Usually identifies irrelevant risks

Top-Down Risk analysis
• Start with a dominant
concern
• Identify potential
causes
• Detail further

Typical risks identified
• Components making the wrong decisions
• Power failure
• Hardware failure of PLCs/servers
• Software failures
• Network failure
• External factors
• Human maintenance error

Breaking a cut-set
Alternate component
Alternate service
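In fault tree analysis a cut set is a combination of basic events that together cause the top event; a cut set with a single event is a single point of failure. The sketch below computes minimal cut sets for a small tree and shows how adding an alternate component turns a one-event cut set into a two-event one. The tree layout and the "backup power failure" event are assumptions made for illustration, not taken from the slides.

```python
from itertools import product


def cut_sets(node):
    """Return the minimal cut sets of a fault tree as a set of frozensets.

    A node is either a basic event (str) or a tuple (gate, children)
    with gate "AND" or "OR".
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child causes the gate output.
        combined = set().union(*child_sets)
    elif gate == "AND":
        # One cut set from every child is needed at the same time.
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"unknown gate: {gate}")
    # Keep only minimal sets: drop any set that strictly contains another.
    return {s for s in combined if not any(other < s for other in combined)}


# Before: each basic event alone causes the top event.
before = ("OR", ["power failure", "PLC failure"])

# After: an alternate power source (hypothetical) must fail as well.
after = ("OR", [("AND", ["power failure", "backup power failure"]), "PLC failure"])

print(sorted(sorted(s) for s in cut_sets(before)))
print(sorted(sorted(s) for s in cut_sets(after)))
```

In this example the PLC failure is still a single-event cut set after the change, so it would be the next candidate for an alternate component or service.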

Measures and FTA
[Figure: fault tree before and after applying measures]

Design decisions…
• Every design decision is accompanied by a Risk
analysis focussing on RAMS aspects
• In the end the cost, RAMS effects and other
trade-off aspects will determine which design
option will be used

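Design Option 1
[Diagram components: Info, Height determination (Hoogtebepaling), Control (Aansturing), Height measurement (Hoogtemeting), Flood barrier (Waterkering), Diesels, Measure A, Measure B, Control A, Control B]
Failure chances annotated in the diagram:
• Software failure: 1/1,000 per year
• Measurement error: (1/1,000,000 per year)^3
• Software failure: 1/1,000,000 per year
• Software failure: no chance stated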

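Design Option 2
[Diagram components: Info, Height determination (Hoogtebepaling), Control (Aansturing), Height measurement (Hoogtemeting), Flood barrier (Waterkering), Diesels, Measure A, Measure B, Control A, Control B]
Failure chances annotated in the diagram:
• Software failure: no chance stated
• Measurement error: (1/1,000,000 per year)^3
• Software failure: 1/100 per year
• Software failure: no chance stated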
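The annotated chances can be compared with a rough calculation. The sketch below is not the author's model: it assumes that the cubed measurement error stands for three independent channels that must all fail, and that the annotated failures in each option are independent events of which any single one causes system failure (an OR gate). Failures without a stated chance are left out.

```python
from fractions import Fraction


def all_fail(chance, channels):
    """Chance that all of several independent, identical channels fail."""
    return chance ** channels


def any_fails(chances):
    """Chance that at least one of several independent events occurs."""
    none_occur = Fraction(1)
    for chance in chances:
        none_occur *= 1 - chance
    return 1 - none_occur


# (1/1,000,000 per year)^3, as annotated in both design options.
measurement_error = all_fail(Fraction(1, 1_000_000), 3)

# ASSUMPTION: the annotated failures feed a single OR gate.
option_1 = any_fails([Fraction(1, 1_000), measurement_error, Fraction(1, 1_000_000)])
option_2 = any_fails([measurement_error, Fraction(1, 100)])

print(f"Measurement error: {float(measurement_error):.1e} per year")
print(f"Design Option 1:   {float(option_1):.3e} per year")
print(f"Design Option 2:   {float(option_2):.3e} per year")
```

Under these assumptions the redundant measurement contributes almost nothing to the total, and the largest software failure chance dominates each option.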

Testing
Function | Impact of wrong/not functioning | Impact of spontaneous functioning
Function 1 | Small | Medium
Function 2 | Disastrous | Huge
Function 3 | Serious | Huge
Function 3 | Serious | Small
Function 4 | Serious | Serious
Function 5 | Serious | Small
Function 6 | Huge | Huge
…

Test depth and acceptable risk
• Level A: Thorough endurance test aiming to prove function reliability with high accuracy.
• Level B: Thorough endurance test aiming to prove function reliability with medium accuracy.
• Level C: Thorough endurance test aiming to prove function reliability with low accuracy.
• Level D: Test to verify that the function works once.
• Level E: Function tested alongside other functions; might leave paths untested.
Test effort
Level | #Tests | Effort
Level A | 50,000 | 120 hours
Level B | 10,000 | 24 hours
Level C | 1,000 | 4 hours
Level D | 1 | 1 hour
Level E | - | PM

Test depth…
Function | Not functioning | Spontaneous functioning
Function 1 | Level E | Not tested
Function 2 | Level A | Level A
Function 3 | Level B | Level B
Function 5 | Level E | Not tested
… | … | …
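Combining the test depth per function with the effort per level gives a first estimate of the total test effort. The sketch below uses only the rows listed above; Level E has no separate effort figure ("PM"), so counting it as zero hours is an assumption.

```python
# Effort per test level in hours, from the "Test effort" table.
# ASSUMPTION: Level E ("PM") is counted as 0 hours.
EFFORT_HOURS = {"A": 120, "B": 24, "C": 4, "D": 1, "E": 0}

# Test depth per function: (not functioning, spontaneous functioning).
# None means the failure mode is not tested.
TEST_DEPTH = {
    "Function 1": ("E", None),
    "Function 2": ("A", "A"),
    "Function 3": ("B", "B"),
    "Function 5": ("E", None),
}


def total_effort(depth):
    """Sum the effort in hours over all tested failure modes."""
    return sum(
        EFFORT_HOURS[level]
        for levels in depth.values()
        for level in levels
        if level is not None
    )


print(f"Total test effort: {total_effort(TEST_DEPTH)} hours")
```

Nearly all of the effort goes to the two functions with the highest impact, which is the point of letting risk drive test depth.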
