HALT value FMS Reliability
fmsreliability.com http://www.fmsreliability.com/education/halt-value/
HALT Value
An Estimate of HALT Value
It’s always necessary to estimate the value of specific reliability activities. The estimate is needed to justify the investment
required to accomplish the task. Prototypes, diagnostic equipment and environmental chambers are
expensive. The difficulty is an inability to know what will be found, prior to conducting the experiment. Not
doing the test means the certainty of not finding anything. Yet, that is often not enough motivation to invest,
to learn something about the reliability performance. The following scenario is just one situation, along with a
few ideas to help you estimate the value of investments in reliability work.
HALT and time to market
Consider the development of a new game controller. The product is high volume, with the majority of sales
expected immediately after product launch, during the holiday sales period. It’s a new design, there’s an
emphasis on time to market, the majority of product will be manufactured prior to the start of sales, there are
no repairs and the controller is an enabling part of a larger system. The controller’s reliability goal is 98%
reliable over the first year of ownership when used as part of the game system.
HALT vs ALT Discussion
One of the basic questions facing the team is, “Will the product meet the 98% reliability goal?” An ALT may help
answer this question, if we know which failure mechanism(s) will lead to failure during the first year [1]. This is
a new product without any field history. Other controllers designed for this environment have experienced a
range of failure causes, but are often dominated by shock and vibration damage from dropping.
The design team’s risk analysis suggests that drop damage would be the most significant
contributor to product failures. The new controller is different enough that field data from earlier controllers is unlikely to
apply. Also, it is unknown which specific element of the design would fail first, or at all, over one
year of use. Therefore, it is important to discover the failure mechanisms most likely
to occur.
The initial project plan did not include HALT on the first set of prototypes. Instead, it would sample from
the second set of prototypes, 8 weeks later, just before the transfer of the design to manufacturing, to
conduct design verification testing (DVT), including life testing. The drop testing portion of the DVT is
expected to take a week to accomplish.
The reliability engineer on this program recommends performing HALT on the first available prototypes. The
suggestion is to use high loads of random vibration and high shock loads in the HALT plan, to quickly assess
the design weakness related to product drop damage. The project manager requests more information on
timing, cost and benefits (value).
HALT Cost
There isn’t time to procure a HALT chamber within the development schedule; therefore let’s collect quotes
from HALT labs to conduct the testing. Let’s assume a quote of $10k for one round of testing [2]. Of course, if
there were HALT facilities internally available this cost would be less.
Also consider that first-round prototypes cost about 5 times as much as second-round prototype
units. The first round of prototypes is a small run with specialized tooling and quick-turn production, costing
approximately $1k for each unit. Let’s request five units. At 5 times the roughly $200 cost of a later prototype,
each unit carries an $800 price increase, or $4k for all five.
Rounding out the expected costs of engineering support, testing equipment support, and failure analysis
support, we can estimate an additional cost of approximately $10k. Therefore, the total cost to the program to
add HALT is approximately $24k ($10k lab quote + $4k prototype premium + $10k support).
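The cost roll-up above can be sketched in a few lines of Python. The figures come from the text; the variable names, and the $200 later-prototype cost (derived as one fifth of $1k), are illustrative assumptions rather than part of the original estimate.

```python
# Minimal sketch of the HALT cost estimate; names are illustrative.
lab_quote = 10_000                      # quoted price for one round of HALT at an outside lab
units = 5                               # first-round prototypes requested
early_unit_cost = 1_000                 # approximate cost of each first-round prototype
later_unit_cost = early_unit_cost // 5  # assumption: later prototypes cost one fifth as much

# Only the premium over later prototypes is charged to the HALT decision.
prototype_premium = units * (early_unit_cost - later_unit_cost)

support_cost = 10_000                   # engineering, test equipment and failure analysis support

total_halt_cost = lab_quote + prototype_premium + support_cost

print(f"Prototype premium: ${prototype_premium:,}")
print(f"Total HALT cost:   ${total_halt_cost:,}")
```

Counting only the premium, rather than the full $5k for five units, reflects the article's framing that the prototypes would be built later anyway at the lower price.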
One of the primary benefits of HALT is the potential to uncover new failure mechanisms in the design [3]. By
conducting the HALT on the first available prototypes, the design team increases the time available to resolve
design errors or make design improvements. Designers tend to design away from failures; HALT is a tool to
discover previously unknown (or unsuspected) failure mechanisms.
Let’s assume (for the purpose of this example) the design prior to any testing has a 25% chance of a failure
mechanism that will lead to an unacceptably high first year failure rate. In discussions with the program
manager, the team learns that they would delay the start of production if there were a 10% or higher expected
field failure rate. Moreover, the cost of the delay was estimated at $500k per day in lost sales. With an
assumed 30 days to design and implement an improvement to resolve a major reliability issue, the losses would
amount to $500k/day for 30 days, or $15 million.
There is a good chance that the design is fine and will meet the reliability objectives. Let’s assume 75% of the
time the design has an overall failure rate of less than 10% over the first year. 25% of the time, the underlying
design has at least one major failure mechanism that may be detected and resolved prior to the start of sales.
Also, consider that no testing program will uncover all faults. Let’s assume that 10% of the time neither
HALT nor DVT finds a major (>10% failure rate) issue. HALT may also miss the issue while DVT does
detect the fault, let’s say 50% of the time. And, let’s assume HALT finds the fault only 40% of the time. Note:
this low rate is a pessimistic estimate of the ability of a well-executed HALT; in my experience HALT is
much more effective.
For the value calculation, the 25% chance that an unacceptable failure rate exists in the design, times a 40% chance
of HALT finding the issue, times the cost avoided by having time to solve the issue without a 30 day program
delay, results in an expected savings of 0.25 × 0.40 × $15 million = $1.5 million.
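As a rough check of the arithmetic, the expected savings can be computed directly from the stated assumptions. This is a sketch only: the probabilities and dollar figures are the article's example values, and the variable names are made up for illustration.

```python
# Minimal sketch of the expected-savings calculation; names are illustrative.
p_major_issue = 0.25        # chance the design hides a major (>10% failure rate) mechanism

# Given a major issue exists, the three assumed detection outcomes:
p_halt_finds = 0.40         # HALT finds it early
p_only_dvt_finds = 0.50     # HALT misses it, DVT finds it later
p_neither_finds = 0.10      # neither test finds it

# The three outcomes should cover every case.
assert abs(p_halt_finds + p_only_dvt_finds + p_neither_finds - 1.0) < 1e-9

delay_cost_per_day = 500_000   # lost sales per day of delay
delay_days = 30                # assumed time to design and implement a fix

delay_cost = delay_cost_per_day * delay_days
expected_savings = p_major_issue * p_halt_finds * delay_cost

print(f"Cost of a 30 day delay: ${delay_cost:,}")
print(f"Expected savings:       ${expected_savings:,.0f}")
```

Only the 40% branch contributes to the savings here, because that is the case where HALT gives the team time to fix the issue without slipping the schedule.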
HALT ROI
The ROI is the ratio of the expected return over the cost. $1.5 million divided by $24k: the ROI is over 60.
This is only part of the value, as it only considers the detection of major issues, thus avoiding a schedule slip.
The HALT will also find less significant issues that wouldn’t have resulted in a schedule slip, yet the earlier
detection would reduce the cost of implementing design changes. Plus, HALT may find unique failure
mechanisms beyond what the DVT would find, thus leading to an incremental reduction in achieved field failure
rate.
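The ROI figure, and how sensitive it is to the assumed HALT detection rate, can be explored with a short sketch. The input values are the article's example figures; the function names and the break-even calculation are illustrative additions, not part of the original estimate.

```python
# Minimal sketch of the ROI calculation; function names are hypothetical.

def roi(expected_savings: float, cost: float) -> float:
    """Ratio of expected return to cost, as ROI is defined in the text."""
    return expected_savings / cost


def break_even_detection_probability(p_major_issue: float,
                                     delay_cost: float,
                                     cost: float) -> float:
    """HALT detection probability at which expected savings equal the cost."""
    return cost / (p_major_issue * delay_cost)


halt_cost = 24_000          # total cost of adding HALT
delay_cost = 15_000_000     # cost of a 30 day delay at $500k per day
p_major_issue = 0.25        # chance of a major failure mechanism in the design
p_halt_finds = 0.40         # assumed chance HALT finds it

expected_savings = p_major_issue * p_halt_finds * delay_cost

print(f"ROI: {roi(expected_savings, halt_cost):.1f}")
print("Break-even detection probability: "
      f"{break_even_detection_probability(p_major_issue, delay_cost, halt_cost):.4f}")
```

Under these assumptions the break-even detection probability comes out well under 1%, which suggests the conclusion is not very sensitive to the pessimistic 40% figure.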
See also the Reliability, ALT and Derating Value articles.
1. Silverman, Mike. How Reliable Is Your Product? Cupertino, CA: Super Star Press, December, 2010, pg. 193.
2. Personal Communication with Mike Silverman, June 18th, 2011.
3. Hobbs, Gregg K. Accelerated Reliability Engineering: HALT and HASS. Chichester; New York: Wiley, 2000,
pg. 43.