This talk was given at a seminar of the InfinIT interest group Processer & IT Nord on 5 March 2014. Read more about the interest group here: http://infinit.dk/dk/interessegrupper/processer_og_it/processer_og_it.htm
Risk Based QA by Michael Agerkvist Petersen, Radiometer Medical
1. 07/03/14
Risk Based QA
Michael Agerkvist Petersen
miap@post3.tele.dk
dk.linkedin.com/in/michaelagerkvist
Michael Agerkvist Petersen
• QA @ Radiometer Medical
• 18+ years with Medical Devices
  – HW development
  – SW development
  – Project Management
  – Process Improvement
  – QA
• Owner of MDCA (Medical Device Compliance Assistance): spare-time job :-)
Risk Based QA
• Based on my experience from:
  – Making a class C Medical Device at Novo Nordisk
  – Working with Encrypted Pin Pads (ATM keyboards) at Cryptera
  – Other projects
• This will not be a complete introduction to (safety) risk management or Risk Based Testing.
Agenda
• Introduction
• Risk Based Quality Assurance
  – Regulatory Risks: Process Rigor
  – Risk Based Documentation Rigor
  – Risk Based Testing
  – Proactive QA
• Examples
• Pitfalls
Introduction
Some definitions
• Quality: The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs [ISO]
  – There are many different customers: Users, Organization, Regulatory, ...
• QA: Quality Assurance. There are many ways to implement this in practice:
  – Ensuring that development is in compliance (with the defined process)
  – Doing the actual testing
  – “Anything” which helps ensure Quality for the different customers
Software today
• Today software controls more and more of our daily life.
• Software failures may negatively affect business, human health or even human life.
• This results in increasing public and regulatory demands for better products, and in even more time pressure.
Regulatory Compliance
• More and more industries are being regulated
• Some are more regulated than others
• The bar is raised over time
[Figure: cycle of Accident -> Injury or other loss -> Public reaction -> New laws & regulations -> Regulated Industry. Industries shown: Nuclear, Flight, Medical Device, Military, Payment systems, Public Systems. Not to scale.]
Testing Paradox
• Testing is a structured approach to reducing the number of defects, but it is the last activity before release, so it is often the first to be sacrificed.
• Testing is always a sample. You can never test everything, and you can always find more to test.
• A good test case is one that finds a defect. Running a complete test suite without finding a single defect will not add much value.
• The problem with most systematic test methods, like black box methods (equivalence partitioning, boundary value analysis, cause-effect graphing, etc.), is that they generate too many test cases, many of which will never find a defect.
• However, in a regulated world it adds value to run tests which do not find defects: a “clean sheet” is needed to make a submission without too many questions.
Risk Based Approach
• A way to lessen the workload could be to test more in:
  – bad areas of the product.
  – the most important functional areas and product properties.
• And to test less in the other areas...
• But how do we:
  – find the right areas?
  – prioritize other QA activities?
Risk Based Quality Assurance
Traditional QA
• Consists of:
  – Review
  – Dynamic Test
  – Static analysis
  – Templates
  – Source Code standards
  – Traceability analysis
  – Checklists
  – Etc.
• These activities are typically:
  – Time consuming
  – Passive
  – Always compromised due to schedule pressure
Balancing QA
Too little
• Safety risk
• Poor reliability
• Additional cost
• Project delay
• Regulatory failure
• Customer complaints
Too much
• Additional cost
• Project delay
• Under-verification (in the more important areas)
• Increased maintenance
Result of poor QA
[Figure: consequences ordered by severity, from high to low]
• Business cost: Liability and litigation, Recalls, Loss of company image, In-market updates, Reduction in performance, No regulatory approval, Project delay
• Customer cost: Death, Major injury, Minor injury, Security violation, Reduction in performance, Loss of data
Risk Based QA Approach
• Some QA activities will still be Passive
  – They are well known and needed
• Some QA activities will be more Proactive
  – We know we are going to release with defects, so why not try to mitigate the impact of specific types of defects in the design?
• All QA activities will be prioritized
  – Some defects, or a lack of process/documentation, will do more harm than others. So why use the same effort and activities on every feature or entity?
• Use the risk based QA approach to:
  – Find the most critical defects as early as possible at the lowest effort/cost
  – Find the most critical classes of defects and mitigate their impact in the design
  – Balance the rigor of process, documentation and design
Some risk definitions
• Harm: Physical injury or damage to the health of people, or damage to property or the environment.
  – From project delay (financial loss) to death (e.g. of user)
• Hazard: Potential source of harm
• Probability: Of a harm occurring.
  – Could correspond to how frequently the user uses the functionality.
• Severity: Measure of the possible consequence of a hazard
  – Customer cost: Loss of data -> Security violation -> Injury -> Death
  – Business: Project delay -> Recalls -> Liability & litigation
• Risk = Severity x Probability [ISO 14971]
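As a minimal sketch of the Risk = Severity x Probability definition above, the following Python snippet scores a few items and ranks them. The 1 to 5 rating scales, the class name `Risk`, the helper `rank` and the example entries are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    severity: int      # assumed scale: 1 (lowest) to 5 (highest)
    probability: int   # assumed scale: 1 (lowest) to 5 (highest)

    @property
    def score(self) -> int:
        # Risk = Severity x Probability
        return self.severity * self.probability


def rank(risks):
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical entries, for illustration only.
risks = [
    Risk("patient data mix-up", severity=4, probability=2),
    Risk("loss of old data", severity=1, probability=3),
    Risk("delayed result", severity=2, probability=3),
]

for r in rank(risks):
    print(f"{r.score:2d}  {r.name}")
```

Only the relative order of the scores is used here, so the exact scale matters less than applying it consistently.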
Probability
• Probability of failure
  – Usage frequency (functions used several times a day vs. once in a lifetime)
  – For Medical Device Software, the probability of SW defects = 100%, but the probability of Harm needs to take in the probabilities from the entire chain of events
Probability Levels
Probability of Harm
Rating | Description | Definition
5 | Frequent | Constantly present
4 | Probable | Has been/will be reported, but no more than once per month
3 | Occasional | Has been/will be reported, but no more than once per year
2 | Remote | Has been/will be reported, but no more than once in the product's lifetime (~10 years)
1 | Improbable | Considered unlikely to occur
This is not science... and it is very difficult in real life... so do not spend too much time finding the right value.
Severity Levels
# | Term | Severity of potential Harm | Harm description (examples)
5 | Catastrophic | Results in immediate death of patient or user | Immediate death of person caused by explosion, fire or electrical shock
4 | Serious | Results in permanent impairment or critical injury that would require medical or surgical intervention to preclude irreversible impairment or damage | Incorrect medical treatment of patient due to patient data mix-up; body part damage (e.g. eye or fingers)
3 | Moderate | Results in temporary/reversible injury or temporary/reversible impairment requiring professional medical or surgical intervention | Incorrect or inadequate medical treatment
2 | Minor | Results in temporary injury or impairment not requiring professional medical intervention | Minor incorrect or inadequate medical treatment; delayed medical treatment, loss of sample/new sample required, or no result; equipment damage; privacy violation (e.g. data leak)
1 | Negligible | Inconvenience or temporary discomfort | Unsatisfied user; loss of old data
Quantifying Severity & Probability
• The severity should be quantified by considering the different viewpoints of the system's stakeholders.
• The probability of impact can only be quantified indirectly, e.g. by evaluating:
  – frequency of use,
  – quality indicators like the complexity of the software itself,
  – the quality of the documentation,
  – etc.
Quantifying Severity & Probability
• Do not spend too much time determining the exact values
• The relative scoring is the most important, in order to identify the most critical parts.
• The scoring could be rather informal and based on a brainstorm
Note: For Medical Device Class B and Class C, the Risk Analysis is expected to be more elaborate than stated here
Risk levels
• Four levels (3 should be OK)
• Based on IEC 62304 safety class (A, B, C)
• Classification used for:
  – Requirements (i.e. some functions, e.g. Risk Control Measures, are more critical than others)
  – Structural elements, both design and actual implementation (i.e. elements implementing critical requirements)
[Figure: scale from least critical to most critical: A, B, C1, C2]
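Plotting the risks
[Figure: risk matrix with Severity (low to high) on one axis and Probability (low to high) on the other, divided into Low Risks, Medium Risks and High Risks regions; the risk levels A, B, C1 and C2 are plotted on it.]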
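A rough sketch of how a severity and probability rating could be mapped onto the four risk levels A, B, C1 and C2. The talk does not state where the boundaries between the levels lie, so the thresholds below, the 1 to 5 rating scales and the function name `risk_level` are purely hypothetical.

```python
# Hypothetical thresholds on the score severity x probability (both rated 1 to 5).
# Ordered from the most critical level downwards.
THRESHOLDS = [(15, "C2"), (10, "C1"), (5, "B")]


def risk_level(severity: int, probability: int) -> str:
    """Map a severity and probability rating to one of A, B, C1, C2."""
    if not (1 <= severity <= 5 and 1 <= probability <= 5):
        raise ValueError("ratings must be in the range 1..5")
    score = severity * probability
    for minimum, level in THRESHOLDS:
        if score >= minimum:
            return level
    return "A"  # everything below the lowest threshold is least critical


print(risk_level(5, 5), risk_level(5, 2), risk_level(2, 3), risk_level(1, 3))
```

In practice the boundaries would be agreed by the team and adjusted so that neither too much nor too little ends up in the most critical level.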
Different risk perspectives
For a Medical Device:
• Safety: Freedom from unacceptable risks. Identifying and mitigating safety risks to patients and users.
• Effectiveness: Fulfilling the medical claims and delivering value to the patient and users. Meeting the user's needs. Correctness of the product: meeting its specifications.
• Customer satisfaction: Good user experience, good service, ease of use, free of defects, reliable.
• Regulatory: Being in compliance by meeting the Regulatory Expectations, including Safety, Efficacy and Customer satisfaction.
• Project: Meeting the organisation's expectations, including Quality, Safety, Effectiveness and Regulatory, but also timelines and other traditional project risks.
Process Rigor
Common FDA Warning Letter issues
– Lack of:
  • test specifications and test results
  • comprehensive, up-to-date specifications (design input)
– Inadequate:
  • fault handling and stress testing
  • change and release control
Regulatory Risks versus Productivity & Predictability
• Regulations and standards do not seek benefits in productivity or project predictability
• But they do not preclude productivity and predictability from being important.
• So it is your responsibility, and in your interest, to have development processes focusing on:
  – Productivity
  – Predictability
Without sacrificing Safety, Customer satisfaction or Effectiveness, and without increasing Regulatory risks
Regulatory Risks versus Productivity & Predictability
• Risk Driven Approach:
  – Regulatory interpretation: focus on the intention
  – Continuous process improvement within the intention and frame of the Regulatory expectations
  – Align the process with the different associated risks (e.g. Safety, Effectiveness, and Customer satisfaction): focus on what really matters
• “Too much will always be too much, but maybe not enough”
Regulatory Risk: When not in compliance
• Compliance is not created by:
  – using a checklist
  – copying from the:
    • standards
    • regulations
    • guidances
• Creating compliance is about meeting the “intent” of the standards, not just following “the letter of the law”
Regulatory Risk: Don't climb too high
E.g. Tools validation. Balancing between:
• what is required and
• when value adding stops
[Figure: scale from minimum to optimum]
Risk Based Documentation Rigor
Documentation Rigor
• Too little
  – Difficult to anchor decisions
  – High regulatory risk
  – Project delay
  – Maintenance is difficult
• Too much
  – Additional cost
  – Less time for development (project delay)
  – Maintenance is difficult
  – Risk of inconsistency
Requirements Rigor
[Figure: rigor of requirements (low to high) plotted against risk (low to high), with the areas Safety, Efficacy, UI, User Satisfaction and Service placed on it.]
Design Rigor
• IEC 62304 allows software to be decomposed into software items with different safety classes
  – if they are segregated and a segregation rationale is provided
  – No rationale => the Safety Class is the same for all Items/Units.
  – The type of segregation could vary based on risk and other factors
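The segregation rule above can be sketched as a small function. This is an illustration only: the function name `effective_classes` and the item names are hypothetical, and the shared class used when no rationale exists is assumed here to be the highest class among the items.

```python
# Safety classes ordered from least to most critical.
ORDER = {"A": 0, "B": 1, "C": 2}


def effective_classes(items, has_segregation_rationale):
    """Return the safety class that applies to each software item.

    items maps an item name to its individually assessed class ('A', 'B' or 'C').
    """
    if not items:
        return {}
    if has_segregation_rationale:
        # Items are segregated and the rationale is documented:
        # each item keeps its own class.
        return dict(items)
    # No rationale: every item gets the same class
    # (assumption: the highest one found among the items).
    highest = max(items.values(), key=ORDER.__getitem__)
    return {name: highest for name in items}


# Hypothetical decomposition, for illustration only.
items = {"UI": "A", "Model": "C", "Driver": "B"}
print(effective_classes(items, has_segregation_rationale=True))
print(effective_classes(items, has_segregation_rationale=False))
```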
Design Rigor: Item/Unit Level
A higher Risk Level requires a more detailed and rigorous:
• Architecture description
• Detailed Design
If the entire SW is Safety Class C, detailed design is required, but it is still possible to use the Risk Based Approach and adjust the level of detail and rigor.
Allocate a Risk Level to the different Items/Units based on their responsibility.
[Figure: layered architecture (UI, Function, Model, Driver, HAL, SI) with a Risk Level (A, B, C1 or C2) allocated to each component, from least critical to most critical.]
Risk Based Testing
• Identify the most critical functions
• Consider:
  – Whether the users will identify defects in the function or attribute.
  – Using historical data to identify functional areas with many defects
• Do extra testing in critical areas and areas with many defects
  – Use domain specialists
  – Extend the (automated) regression test when new defects are found
For System Testing
Risk level | SW System test activities
C2, C1 | Functional; Exploratory; consider other test strategies (Stress, Boundary, Stability, State transition, Recovery) depending on the Function under test and the Risk level
B | Functional; Security; Exploratory
A | Functional (all requirements have at least one TC)
• Allocate a Risk Level to the different Functions/requirements based on the possible severity and probability.
Item/Unit Level testing
A higher Risk Level requires more detailed and rigorous QA Activities:
• Reviews,
• Testing,
• Etc.
If appropriate, use historical data to identify Items/Units with many defects in order to adjust the Risk Level.
[Figure: the layered architecture (UI, Function, Model, Driver, HAL, SI) with Risk Levels A, B, C1 and C2 allocated to the components, from least critical to most critical.]
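One possible way to use historical defect data to adjust an allocated Risk Level is sketched below. The rule (raise an item by one level once its defect count reaches a threshold), the threshold value and the function name `adjust_levels` are assumptions for illustration; the talk does not prescribe a specific rule.

```python
LEVELS = ["A", "B", "C1", "C2"]  # least critical to most critical


def adjust_levels(allocated, defect_counts, threshold=10):
    """Raise the Risk Level of items with many historical defects by one step.

    allocated maps item name -> Risk Level; defect_counts maps item name ->
    number of defects found so far. The threshold is a hypothetical value.
    """
    adjusted = {}
    for item, level in allocated.items():
        index = LEVELS.index(level)
        if defect_counts.get(item, 0) >= threshold:
            index = min(index + 1, len(LEVELS) - 1)  # C2 cannot go higher
        adjusted[item] = LEVELS[index]
    return adjusted


# Hypothetical allocation and defect history.
allocated = {"UI": "A", "Driver": "C2", "Model": "B"}
history = {"UI": 12, "Driver": 30, "Model": 2}
print(adjust_levels(allocated, history))
```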
Item/Unit Level testing
Risk level | Unit Testing activities
C2 | Formal Code Review by at least one SW Developer plus the SW Risk Manager; Static Analysis; Software Unit Test, 100% decision/condition coverage; Integration test using decision tables and classification trees
C1 | Formal Code Review by at least one SW Developer; Static Analysis; Software Unit Test, 100% statement coverage; Integration test using decision tables
B | Formal Code Review by at least one SW Developer; Static Analysis; Software Unit Test, 100% function coverage; Integration test
A | Informal Peer Code Review; Integration test as part of System Level Test
Note: Risk Level A is not to be used for Class C Software
Item/Unit Level testing & Complexity
Risk level | Complexity | Reduce
C2 | McCabe <= 2 AND LoC < 10 | Reduce Unit Test to 100% function coverage
C1 | McCabe <= 3 AND LoC < 20 | Reduce Unit Test to 100% function coverage
B | McCabe <= 3 AND LoC < 30 | No Unit Test necessary (not for Class C Software)
A | NA | NA
• Complexity is the root cause of many defects, but low complexity code may also require less testing.
• In source code, use complexity metrics to adjust the QA activities
• Note: Complexity metrics are also part of the code standard
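The unit test coverage targets and the complexity-based reductions from the two tables can be combined in a small lookup, sketched below. The function name `unit_test_coverage` and the encoding of "no unit test" as `None` are assumptions made for illustration.

```python
# Baseline unit test coverage per Risk Level (None = no unit test listed).
BASE_COVERAGE = {"C2": "decision/condition", "C1": "statement", "B": "function", "A": None}

# Per Risk Level: (max McCabe, LoC must be below this, coverage after reduction).
REDUCTION = {
    "C2": (2, 10, "function"),
    "C1": (3, 20, "function"),
    "B": (3, 30, None),  # no unit test necessary
}


def unit_test_coverage(level, mccabe, loc, class_c_software=False):
    """Return the required 100% coverage type for a unit, or None for no unit test."""
    coverage = BASE_COVERAGE[level]
    rule = REDUCTION.get(level)
    if rule is None:
        return coverage  # Risk Level A: no reduction defined
    max_mccabe, loc_limit, reduced = rule
    if mccabe <= max_mccabe and loc < loc_limit:
        if reduced is None and class_c_software:
            return coverage  # dropping the unit test is not allowed for Class C software
        return reduced
    return coverage


print(unit_test_coverage("C2", mccabe=2, loc=9))    # small, simple unit
print(unit_test_coverage("C2", mccabe=5, loc=40))   # larger unit keeps full rigor
```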
Proactive QA
• Uses the same approach as Medical Device Safety Risk Management (ISO 14971, IEC 80002).
• Identify the most critical SW hazards and their causes, e.g.:
  – Loss of configuration could make the SW/Device useless
  – Faulty data could result in faulty functionality and/or results
  – Never-ending waiting loops could make the SW/Device slow or useless
• Implement proper mitigations in the design
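As one example of a design mitigation for the never-ending waiting loop hazard, a wait can be bounded by a timeout so the caller always regains control. This is a generic sketch, not a technique taken from the talk; the function name `wait_until` and its parameters are hypothetical.

```python
import time


def wait_until(condition, timeout_s, poll_interval_s=0.01):
    """Poll condition() until it is true or the timeout expires.

    Returns True if the condition became true, False on timeout, so the
    caller can trigger its own recovery instead of hanging forever.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Usage: a condition that never becomes true no longer blocks the program.
if not wait_until(lambda: False, timeout_s=0.05):
    print("timed out, entering recovery path")
```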
Examples
Example: Keyboard in ATM
[Figure: ATM block diagram with the labels Keyboard, Dispenser, Display, Card Reader, Ext Keyboard, XFS drv (x4), XFS, Win XP, Bank App, PC and ATM; a Master key and several derived keys are shown.]
• <1% of the source code implements keyboard functionality
• The remainder is related to:
  – Security: cryptography, key handling, surveillance
  – Service: configuration, status log, etc.
Reliability-wise, key handling is very important
• Without the Master key, the keyboard needs to go back to the manufacturer
• Without derived keys, the keyboard needs a service technician visit
Risk Based QA for ATM keyboard
• “Spontaneous” loss of keys in the field
• Risk based approach: a new file system with
  – CRC error correction
  – “Black box recorder” (log of field events for debugging)
  – More rigor in requirements and design
  – Unit testing of the new file system
Risk Based QA for ATM keyboard
• In the pilot phase: several incidents where the keyboard is “completely dead”
• In certain situations, defects in both the CRC error correction and the “Black box recorder” end up in: “logging an error results in a new error...”
• Learnings:
  – Adding “mitigations” in software increases complexity, which may end up in more erroneous SW
  – Remember integration and scenario testing
  – Consider some kind of recovery mechanism
Example: Proactive QA for VHF radio
• The VHF Radio stores “vital” data in EEPROM.
• During SW test a HW design flaw is found. The HW does not give “Power Down” in time => “vital” data is corrupted => the VHF Radio is useless.
• Proactive QA:
  – CRC protection of data
  – Shadowing of data
  – Controlled scheme for data update
  – The approach also mitigates SW failures
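The three mitigations can be illustrated with a small sketch: each record carries a CRC, two copies are kept, and the copies are updated one after the other so that an interrupted write leaves at least one valid copy. This is not the radio's actual implementation; the class `ShadowedStore`, the record layout and the in-memory list standing in for the EEPROM are assumptions.

```python
import struct
import zlib


def pack_record(payload: bytes) -> bytes:
    """Append a CRC-32 so corruption can be detected on read."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_record(record: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    if len(record) < 4:
        return None
    payload, stored = record[:-4], struct.unpack("<I", record[-4:])[0]
    return payload if zlib.crc32(payload) == stored else None


class ShadowedStore:
    """Keeps a primary and a shadow copy of one record."""

    def __init__(self):
        self.slots = [b"", b""]  # stand-in for two separate EEPROM areas

    def write(self, payload: bytes) -> None:
        record = pack_record(payload)
        # Controlled update: complete the primary copy before touching the
        # shadow, so a power loss never damages both copies at once.
        self.slots[0] = record
        self.slots[1] = record

    def read(self):
        # Return the first copy whose CRC is valid, or None if both are corrupt.
        for record in self.slots:
            payload = unpack_record(record)
            if payload is not None:
                return payload
        return None


store = ShadowedStore()
store.write(b"channel=16")
store.slots[0] = b"chaXnel=16" + store.slots[0][-4:]  # simulate a corrupted primary copy
print(store.read())  # falls back to the shadow copy
```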
Battery monitor in Medical Device
• A battery powered Medical Device has a battery monitor to inform the user when charging is needed
• Proactive QA
  – Battery power is (also) displayed as the number of measurements left
  – If the number of measurements left is <= 2, then measurement is not possible
  – When the number of measurements left is <= 2, it is always decreased by 1, independently of the battery status
  – The monitoring algorithm adapts when the battery degrades
ICU Monitor in Demo mode
• An ICU Monitor has a demo mode to show realistic waveforms in sales situations.
• Erroneously, an ICU Monitor jumped into Demo mode while monitoring a real patient
• Proactive QA
  – Changing the waveforms to non-realistic waveforms
  – Writing “Demo” where waveforms are displayed
  – Timeout on Demo mode: jumping back to real mode after a period of inactivity
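The timeout and the on-screen “Demo” marking could look roughly like the sketch below. The class `MonitorMode`, its method names and the injected clock are hypothetical; the talk does not describe the actual implementation or the timeout value.

```python
import time


class MonitorMode:
    """Tracks whether the monitor is in demo or real mode."""

    def __init__(self, demo_timeout_s, clock=time.monotonic):
        self.demo_timeout_s = demo_timeout_s
        self.clock = clock  # injectable so the timeout can be tested
        self.demo = False
        self.last_activity = clock()

    def enter_demo(self):
        self.demo = True
        self.last_activity = self.clock()

    def user_activity(self):
        # Any user interaction restarts the inactivity period.
        self.last_activity = self.clock()

    def current_mode(self):
        # Fall back to real mode after a period of inactivity.
        if self.demo and self.clock() - self.last_activity >= self.demo_timeout_s:
            self.demo = False
        return "DEMO" if self.demo else "REAL"

    def waveform_overlay(self):
        # Text written where the waveforms are displayed.
        return "Demo" if self.current_mode() == "DEMO" else ""


# Usage with a fake clock (seconds), for illustration only.
now = [0.0]
monitor = MonitorMode(demo_timeout_s=60.0, clock=lambda: now[0])
monitor.enter_demo()
print(monitor.current_mode(), repr(monitor.waveform_overlay()))
now[0] = 61.0
print(monitor.current_mode(), repr(monitor.waveform_overlay()))
```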
Pitfalls
• Too much is considered critical, or too much is considered non-critical
  – Risk of not finding the critical defects
• Customer and Manufacturer have different views of what is critical
  – Risk of an unsatisfied customer
• Management only buys the cost reduction part of risk based QA
  – Risk of poor quality
Pitfalls
• No use of historical data, including data from within the project
  – If defect trends show something different from your risk evaluation (e.g. unidentified critical defects), then you should adapt your Risk Based Approach.
• Design mitigations add too much complexity
  – resulting in other defects and difficult-to-maintain SW
• The “system” for handling Risk Based QA is too complex.