Are you interested in significant other reliability developments (SORD) that have not been adopted? Combined with adopted developments, they constitute real reliability, just as the product of a complex number and its complex conjugate yields a real number. SORD include nonparametric estimates of age-specific field reliability and failure rate functions (actuarial rates), without life data. These estimates deal with renewal processes, repairable processes, and missing data. SORD also quantify uncertainty, not just sample uncertainty. Privacy is protected by not tracking products or people by serial number or name to obtain ages at failures and survivors’ ages. SORD may help employ reliability people and induce governments, companies, and consumers to make decisions and compare products based on real reliability and risk.
2. ASQ Reliability Division
English Webinar Series
One of the monthly webinars
on topics of interest to
reliability engineers.
To view recorded webinars (available to ASQ Reliability
Division members only), visit asq.org/reliability
To sign up for the live webinars, which are free and open
to anyone, visit reliabilitycalendar.org and select English
Webinars to find links to register for upcoming events
http://reliabilitycalendar.org/The_Reliability_Calendar/Webinars_‐_English/Webinars_‐_English.html
3. Complex Conjugate History of Reliability
• SORD*SOTA = Real Reliability
– SORD = Significant Other Reliability Developments
– SOTA = State Of The (reliability) Art
• Why?
– Profit, save our jobs, and protect privacy
– Do something about reliability, risk, and uncertainty!
• What's in the future? What's needed?
1/10/2011 Problem Solving Tools 1
4. What SORDs?
Risk is present when future events occur with measurable probability. Uncertainty is
present when the likelihood of future events is indefinite or incalculable. Frank Knight
• Nonparametric reliability and failure
rate functions for:
– Grouped, left-and-right-censored, and
truncated data
– Renewal and repairable processes
• Without life data
• Uncertainty: brooms, jackknives and
bootstraps, extrapolations, scenarios,…
5. Examples
• Component D (Weibull vs. nonparametric)
• M88A1 drivetrain parts (Renewal process)
• LED L70 reliability (Black-Scholes)
• Pleasanton O-D matrix and travel times
(multivariate, network tomography)
6. ANCIENT HISTORY
• Discrete failure rate functions, aka actuarial
rates ~220 AD
– Domitius Ulpianus: Roman Legion pension
planning, life table
– John Graunt 1600s life tables
– Edmond Halley ca 1693 annuities
• Insurance
– James Dodson, Equitable Life, casualty (1762)
– Gompertz' Curve (1825) death rate is
• a(t) = e^(αt+β) from a double exponential cdf (Weibull)
7. Gambling and Physics
• Gambling: Pascal, Laplace, Bernoullis, John
Kelly, Ed Thorp, Dr. Z
• Utility, game, risk, credibility: von Neumann,
Morgenstern, Nash, Harsanyi, Hilary Seal,
Bühlmann…
• Financial analysis, hedging, scenarios: Black-
Merton-Scholes, Shannon, Thorp, Ziemba
• Physics: Schrödinger wave function ψ: |ψ(x;t)|² is
probability density; Myron Tribus: statistical
thermodynamics, entropy, and reliability
8. Modern Times (outline)
• Modern histories
• Significant other reliability developments
– RAND and the US AFLC
– Barlow, Proschan, Marshall, Saunders, Block, et al.
– Lajos Takacs, Stephen Vajda
– Kaplan-Meier
– Sir David Cox
– Network tomography
9. Modern Histories
• Barlow and Proschan reviewed reliability in
their first book (1965)
• Nowlan and Heap's RCM appendix D-1
contains more (1978)
• Recent publications about adopted
developments [McLinn, Saleh and Marais]
• Psychologists hijack the meaning of reliability
10. RAND and US AFLC
• RAND adapted actuarial methods for
managing expensive, repairable equipment
such as aircraft engines ~1960
– AFI 21-104 is current version
– Actuarial forecast = n(t)a(t); demand ~Poisson
• MOD-METRIC used to buy $4B of
F100PW100 engines and spares ~1973
• USPO 5287267, Robin Roundy et al. patented
negative binomial demand distribution ~1991
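The actuarial forecast on this slide can be sketched in a few lines. This is a minimal illustration, assuming n(t) is the installed base by age and a(t) is the actuarial rate by age, with demand treated as Poisson around the forecast. The function names, the data, and the 95% service level are hypothetical and are not taken from AFI 21-104.

```python
import math

def actuarial_forecast(installed_by_age, rate_by_age):
    """Expected failures next period: sum over ages t of n(t) * a(t)."""
    return sum(n * a for n, a in zip(installed_by_age, rate_by_age))

def poisson_stock_level(mean_demand, service_level):
    """Smallest stock s with P[Poisson(mean_demand) <= s] >= service_level.

    service_level must be strictly less than 1.
    """
    s = 0
    term = math.exp(-mean_demand)  # P[demand = 0]
    cdf = term
    while cdf < service_level:
        s += 1
        term *= mean_demand / s    # P[demand = s] from P[demand = s - 1]
        cdf += term
    return s

# Hypothetical data: units in service at ages 0, 1, 2 and their failure rates.
n = [100, 80, 50]
a = [0.01, 0.02, 0.05]
forecast = actuarial_forecast(n, a)          # 1.0 + 1.6 + 2.5
stock = poisson_stock_level(forecast, 0.95)  # spares to cover 95% of periods
```

A negative binomial demand distribution, as in the patent cited above, would replace only the second function.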
11. Barlow, Proschan, et al.
• What if failure rate isn't constant?
– Tests and bounds: IFR, IFRA, DMRL…
– Renewal theory, replacement, availability, maintenance
– FTA, Bayes, system vs. parts
• Coherence, redundancy, multivariate,…
• Russians too: Kolmogorov, Gnedenko, Belyayev,
Gertsbakh,…
– Inspection, opportunistic maintenance
12. Hungarians Too
• Asymptotic alternating renewal process
(up-down-up-down-…) statistics are normally
distributed, regardless (Takacs)
– Even with dependence (1960s)
– Improve production throughput and reduce
variance, http://www.fieldreliability.com/Genie.htm
• Gozintos N next-assembly matrix (Vajda)
– Products Vector × (I − N)^(-1) = Parts Vector
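The gozinto calculation can be sketched without a matrix library. This sketch assumes the bill of materials has no cycles, so N is nilpotent and (I − N)^(-1) = I + N + N² + … is a finite sum. The three-item matrix and the function name are hypothetical.

```python
def total_requirements(products, n_matrix):
    """Row vector products * (I - N)^(-1), computed as products * (I + N + N^2 + ...)."""
    size = len(products)
    total = list(products)
    level = list(products)
    for _ in range(size):  # N is nilpotent for an acyclic bill of materials
        # Requirements one assembly level further down: level * N
        level = [sum(level[i] * n_matrix[i][j] for i in range(size))
                 for j in range(size)]
        if not any(level):
            break
        total = [t + x for t, x in zip(total, level)]
    return total

# Hypothetical items: 0 = product, 1 = subassembly, 2 = part.
# N[i][j] = units of item j that go directly into one unit of item i.
N = [[0, 2, 1],
     [0, 0, 3],
     [0, 0, 0]]
parts = total_requirements([10, 0, 0], N)
```

Ten products need 20 subassemblies and 70 parts here: 10 parts used directly plus 60 inside the subassemblies.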
13. Kaplan-Meier npmle
• Nonparametric max. likelihood reliability
function (npmle) estimate from right-censored
ages at failures
– JASA made Ed Kaplan combine his vacuum tube
reliability paper with Paul Meier's biostatistics
paper (1957)
– For dead-forever systems, not repairable
• Odd Aalen did the same for the failure rate
function (Nelson-Aalen estimator)
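Both estimators fit in one short function. This is a minimal sketch assuming right-censored ages and the usual convention that failures precede censorings at tied ages. The data are hypothetical.

```python
def kaplan_meier(ages, failed):
    """Return [(age, R(age), H(age))] at each distinct failure age.

    ages   -- age at failure or at censoring for each unit
    failed -- True if the unit failed at that age, False if right-censored
    R is the Kaplan-Meier reliability; H is the Nelson-Aalen cumulative
    failure rate.
    """
    at_risk = len(ages)
    reliability, cum_hazard = 1.0, 0.0
    result = []
    for age in sorted(set(ages)):
        deaths = sum(1 for a, f in zip(ages, failed) if a == age and f)
        leaving = sum(1 for a in ages if a == age)
        if deaths:
            reliability *= 1.0 - deaths / at_risk  # product-limit step
            cum_hazard += deaths / at_risk         # Nelson-Aalen increment
            result.append((age, reliability, cum_hazard))
        at_risk -= leaving                         # failures and censorings leave
    return result

# Hypothetical ages in months; False marks a survivor (right-censored).
ages = [2, 3, 3, 5, 7, 8]
failed = [True, True, False, True, False, True]
estimate = kaplan_meier(ages, failed)
```

Note that the function needs each unit's age, which is exactly the life data that ships and returns counts do not provide.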
14. Sir David Cox PH Model
• Proportional hazards (aka relative risk) model
is a semiparametric failure rate function of
concomitant factors z (1972)
– a_z(t) = a_0(t)e^(−βz): β is regression coeff. vector
– Easier than multivariate statistics: e.g., calendar
time and miles, operating hours
• Biostatisticians adopt PH model for testing
hypotheses about z
– Clinical trials
15. Finance and Reliability
• Risk and hedging
– Black-Scholes stochastic differential equation for stock price S:
dS = μ dt + σS dW, where W is Brownian motion
• Nobel prize to Merton and Scholes (1997) for option price
model
• Hedging, LTCM, SIVs, CDOs, CDSs, mortgage defaults,
credit crises, deflation, deleveraging, inflation,
unemployment???
– LED deterioration resembles geometric Brownian
motion
– Scenarios include some black swans
16. SORD Reliability (outline)
• Credible Reliability Prediction
– Not just MTBF (ASQ RD monograph advert)
• Parametric vs. nonparametric
– Component D
• LEDs L70
• Help! No life data
• Unforeseen consequences
• Renewal and repair
17. Parametric vs. Nonparametric
Rule 1. Original data should be presented in a way that will preserve the
relevant information derived from evidence in the original data for all
predictions assessed to be useful. Walter A Shewhart
• Parametric distribution if justified
– Normal variation or asymptotic, weakest link,
exponential-Poisson-beta-binomial-Gamma-chi-square,
lognormal (rate changes), inverse Gauss,…
• Nonparametric distribution
– Preserves all information in data
– Avoids opinions and mathematical convenience
• AIC balances overfitting and likelihood
• Entropy quantifies assumed information
18. Component D Weibull vs. nonparametric
• AIC = 2k − 2 ln L: k = # estimated
parameters and L is likelihood function
• Entropy −Σ p(t)ln(p(t)) is uncertainty in a
random variable's pdf; less is better

           Weibull   Npmle
AIC        16.683    16.685
Entropy    0.0127    0.0135
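The two comparison measures can be computed as follows. This is a sketch under the assumption that the distribution is discrete (one probability per age cell); the probabilities and log-likelihood shown are hypothetical and do not reproduce the Component D values.

```python
import math

def aic(num_params, log_likelihood):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * num_params - 2 * log_likelihood

def entropy(pmf):
    """Shannon entropy -sum p ln p of a discrete distribution (0 ln 0 taken as 0)."""
    return -sum(p * math.log(p) for p in pmf if p > 0)

# Hypothetical monthly failure probabilities; the last cell is P[survive].
pmf = [0.001, 0.002, 0.003, 0.994]
h = entropy(pmf)
score = aic(2, -6.5)  # e.g., a two-parameter model with ln L = -6.5
```

Lower AIC favors a model; the 2k term is the penalty for each additional estimated parameter.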
19. Black-Scholes and LEDs
[Figure: Scatter Plot of Data Set 1 Normalized. Vertical axis: normalized value, 0.96 to 1.02. Horizontal axis: age in hours, 0 to 6574.5; each label is one month (730.5 hours).]
20. L70: P[Age at 70% initial lumens > t]?
• Lumens at age t ~N[μt, σt], independent
• Deterioration fits Black-Scholes dS_t = μ dt + σS_t dW_t
where S_t is 1 − (% of initial lumens)
– Estimate μ and σ from geometric Brownian motion
– L70 ~inverse Gauss with parameters as functions of
70%, μ, and σ
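The inverse Gauss connection can be illustrated with the standard first-passage result: the first time a Brownian motion with constant drift and volatility reaches a fixed level is inverse Gaussian. The sketch below is a generic illustration under that assumption, not the author's fitted model or the mixture shown on the next slide. The drift, volatility, and 30% loss level are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def first_passage_reliability(t, level, drift, sigma):
    """P[T > t], T = first time Brownian motion with drift reaches `level`.

    T is inverse Gaussian with mean level/drift and shape level**2/sigma**2.
    """
    if t <= 0:
        return 1.0
    root = sigma * math.sqrt(t)
    cdf = (norm_cdf((drift * t - level) / root)
           + math.exp(2.0 * drift * level / sigma ** 2)
           * norm_cdf(-(drift * t + level) / root))
    return 1.0 - cdf

# Hypothetical: 30% lumen loss, drift 3% per year, sigma 5% per root-year.
curve = [first_passage_reliability(t, 0.30, 0.03, 0.05) for t in range(0, 21)]
```

With these numbers the mean age at L70 is level/drift = 10 years, and the reliability curve falls from 1 toward 0 over the 20-year range.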
21. L70 Weibull vs. Inverse Gauss
[Figure: LED L70 Inverse-Gaussian Mixture and Weibull Reliability Functions. Vertical axis: reliability, 0 to 1. Horizontal axis: age, years, 0 to 20. Curves: IG Mixture, Weibull.]
22. Help! No Life Data?
People's intuition about random sampling appears to satisfy the law of
small numbers, which asserts that the law of large numbers applies to
small numbers as well. Tversky and Kahneman
• You need ages at failures and survivors' ages
• It's too hard to estimate reliability from ships
and returns counts
– Ships are counts of production, sales, installations,
or other installed base
– Returns are counts of complaints, failures, repairs,
or even spares sales
• Follow a sample by S/N? Ships and returns are
population data, required by GAAP!
23. M/G/∞ and npmle
• Npmle of service distribution
from M/G/∞ queue input and
output times (1975 NRLQ)
[Diagram: cases n1, n2, … and deaths R1, R2, … along a time axis]
• Richard Barlow and I overlooked potential for
reliability
• Works for M_t/G/∞ queues under mild
conditions on the nonstationary Poisson M_t
• Extended to renewal processes (recycling)
24. Nplse: Actuarial Forecasts
• Orjan Hallberg (Ericsson ret.) researches
medical problems http://www.hir.nu
• Carl Harris and Ed Rattner used nplse to
forecast AIDS deaths from HIV+ → AIDS
conversions and death counts
– Carl died early of heart attack, and Ed claims he's
fully retired.
• Dick Mensing: SSE = Σ[Expected − Observed]²
– Expected = actuarial forecast (hindcast)
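The hindcast and its sum of squared errors can be sketched from ships and returns counts alone. This is a simplified illustration for dead-forever units, not the nplse itself: it solves for the age-at-failure probabilities that reproduce the returns exactly, which is the zero-SSE special case. All names and counts are hypothetical.

```python
def hindcast(ships, pmf):
    """Expected returns per period: sum over cohorts s of ships[s] * pmf[t - s]."""
    return [sum(ships[s] * pmf[t - s] for s in range(t + 1))
            for t in range(len(ships))]

def sse(ships, returns, pmf):
    """Sum of squared errors between hindcast (expected) and observed returns."""
    return sum((e - o) ** 2 for e, o in zip(hindcast(ships, pmf), returns))

def deconvolve(ships, returns):
    """Age-at-failure pmf that reproduces the returns exactly (SSE = 0).

    Requires ships[0] > 0. Noisy counts can give negative values, in which
    case a constrained least-squares fit is needed instead.
    """
    pmf = []
    for t in range(len(returns)):
        # Returns in period t already explained by cohorts shipped after period 0
        explained = sum(ships[s] * pmf[t - s] for s in range(1, t + 1))
        pmf.append((returns[t] - explained) / ships[0])
    return pmf

# Hypothetical counts for dead-forever units (no renewals).
ships = [1000, 1200, 900, 1100]
true_pmf = [0.010, 0.020, 0.015, 0.005]
returns = hindcast(ships, true_pmf)
estimate = deconvolve(ships, returns)
```

No serial numbers or individual ages enter the calculation; only period totals of ships and returns are used.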
25. Apple: Unforeseen Consequences
• Boss thinks ships and returns
counts are sufficient. Lit. search
=> 1975 NRLQ article
• Estimate all service parts reliability,
forecast failures and recommend stock levels
• Dealers scream! Apple had required dealers
to buy obsolescent spares
• Apple bought back $36M of obsolescent
spares, for $18M, and crushed them. Made
me limit returns to ~$6M per quarter.
26. Repairable Reliability (outline)
• Triad Systems Corp.
• Brie Engineering M88A1
• Larry Ellison, Oracle
27. Triad Systems Corp.
• New Products manager proposes auto parts
demand forecast = n(t)a(t): n(t) = cars by year
– Fails due to autocorrelation, no pun intended
– Auto parts sales might be the second, third, or ???
replacement. Stores don't know
– Derived the nplse failure rate estimates for renewal
processes ~1994. Got job. Forecasts are better.
– Extended to generalized repairable processes (first
TTF differs) and npmle ~1999
• Triad US Patent 5765143 actuarial forecast
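For a renewal process, the forecast replaces the failure rate with the renewal density, because a part sold today may be the second or third replacement on the same car. The sketch below assumes every replacement has the same discrete time-to-failure distribution; the generalized case where the first time to failure differs is not covered. Names and data are hypothetical.

```python
def renewal_density(pmf):
    """Expected renewals per unit at each age: m[t] = f[t] + sum_u f[u] * m[t - u].

    pmf[t] = P[time to failure = t periods], with pmf[0] = 0.
    """
    m = [0.0] * len(pmf)
    for t in range(1, len(pmf)):
        m[t] = pmf[t] + sum(pmf[u] * m[t - u] for u in range(1, t))
    return m

def renewal_forecast(installs, pmf):
    """Expected replacement demand per period: sum_s installs[s] * m[t - s]."""
    m = renewal_density(pmf)
    return [sum(installs[s] * m[t - s] for s in range(t + 1))
            for t in range(len(installs))]

# Hypothetical part that always fails at age 2 periods: renewals at ages 2, 4, ...
fails_at_two = [0.0, 0.0, 1.0, 0.0, 0.0]
m = renewal_density(fails_at_two)
demand = renewal_forecast([100, 0, 0, 0, 0], fails_at_two)
```

The deterministic example makes the recursion easy to check by hand: 100 installations generate 100 replacements in period 2 and 100 more in period 4.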
28. M88A1
• In 2000, Brie engineer
shares M88A1 drivetrain
rebuild counts for 1990s, $186k then. Laid off
– Estimate: ~25% fail in first year. Either problem
wasn't fixed or faulty rebuild. TACOM
uninterested.
– 2005 AVDS 1790 engine backorders. RAND
publishes Velocity Management. RAND
uninterested in actuarial forecasts
– ASQ Quality Progress 2010 publishes article on
greening the engine overhaul process
29. M88A1 Drivetrain Component Reliability
[Figure: M88A1 drivetrain component reliability functions. Vertical axis: reliability, 0 to 1. Horizontal axis: age at replacement, years, 0 to 25. Curves: Engine, Trans, RelayAsm, TransPTO, GenEngAC, Generator, DrvAssy, RtFdAsm, FuelPump, EngPTO, Starter, TurboC, TranCooler.]
30. Oracle and Breast Cancer
• Oracle CMM dbs record ages at system
failures and the parts that failed
– They don't identify parts by serial number,
location: TOAD, AIMS?, Other?
– What if there were duplicate parts?
• Breast cancer recurrences: same side
second time or other side???
31. EM and Hidden Renewals
• EM algorithm (Expectation-Maximization)
gives part reliability npmle
– www.wikipedia.com/EM_algorithm [Dempster,
Laird, and Rubin]
• Nplse failure rate estimates and forecasts
for renewal processes with missing data
(2008)
– Provisional patent pending application is in
procrastination
32. Two-Part System
• Least Sqs is for both parts, EM is for one
[Figure: Alternative Reliability Estimates. Vertical axis: reliability, 0 to 1. Horizontal axis: age, quarters, 0 to 12. Curves: Least Sqs R(t), EM R(t).]
33. You're Being Followed
• It's human nature to doubt statistically significant conclusions based on
a sample that is a small fraction of the population. Tversky and Kahneman
– Pleasanton residents complain about traffic cutting
thru. City adjusts signal timing to back cars onto
freeway. Crash
– City cars follow intruders. Citizens arise (2000)
– Pleasanton gives traffic count data
– Nplse of O-D matrix and travel time distributions
– Traffic manager doesn't understand O-D,
probability distributions, and their use
– City stations cheap labor at major intersections to
record license numbers (2009)
35. Network Tomography
[Diagram: Pleasanton source-sink network. Labels as extracted: Southbound: Foothill, Hopyard-Hacienda-Owens, Santa Rita; Eastbound: Las Positas; Westbound: Stoneridge, Stanley Blvd, Foothill; Northbound: Sunol Blvd.]
36. Pleasanton PM OD matrix
• AKA network tomography
P matrix (slide labels: Pton, Thru; origin Pton). Columns are origins (O from), rows are destinations (D to).

D to \ O from   From 0   From N   From S   From E   From W   Lambda   g0       g1
To 0            0.0000   0.8640   0.0000   0.0000   0.7801   6.5128   0.9924   0.8541
To N            0.2136   0.0000   0.0135   0.0000   0.0721   5.9121   0.0001   0.0802
To S            0.1801   0.0285   0.0000   1.0000   0.0000   0.0000   0.0075   0.0656
To E            0.1755   0.0177   0.2679   0.0000   0.1479   0.0000   0.0000   0.0000
To W            0.4308   0.0899   0.7186   0.0000   0.0000
37. Dealing with Uncertainty
The analyst should provide a measure of the uncertainty that results from the
assumptions underpinning the set of models applied in the analysis and the
deliberate and unconscious simplifications made. Terje Aven
• Randomness (aleatory uncertainty)
– Reliability function, bounds, and stochastic dominance
• Sample uncertainty vs. population
– Why sample if you can get population statistics?
• Epistemic, Knightian, unknown unknowns…
– PRA and Uncertainty in the URC
– Jackknife, bootstrap, broom charts…
– Nonparametric extrapolations
– Scenarios
38. Component D
• Given first year of monthly failure counts, how
many will fail in remainder of 3-year warranty?
– Data are left and right censored. All failure counts were
collected on one calendar date. Monthly ships too
– Some failures are 12 months old, some 11 months….
• I do not think that a nonparametric approach
would work.
– It works: facilitates extrapolation, uncertainty
– Weibull reliability under-forecasts failures
39. Alternative Reliability Estimates
• 12 months of ships and failures
[Figure: alternative reliability estimates. Vertical axis: reliability, 0.998 to 1. Horizontal axis: age, months, 0 to 12. Legend entries as extracted: npmle, Weibull mle, nplse, Naïve, mle Weibull, lse Weibull.]
42. Extrapolation Scenarios
• Nonparametric linear extrapolations
– Jackknife; leave out one month's data
– Broom; all 12 months, first 11, first 10…
• W. Weibull recommends power functions for
simplicity
• Sensitivity and delta method:
– derivatives of actuarial forecasts wrt linear
extrapolation coeffs are n(t) and tn(t)
• Future uncertainty???
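The jackknife and broom scenarios can be sketched as repeated straight-line fits. This is a minimal illustration assuming ordinary least squares on failure rate estimates by age; the rates are hypothetical, and a power-function extrapolation would change only the fitted form.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def extrapolate(xs, ys, target):
    """Value of the fitted line at age `target`."""
    intercept, slope = fit_line(xs, ys)
    return intercept + slope * target

def jackknife(xs, ys, target):
    """Extrapolations with each single observation left out in turn."""
    return [extrapolate(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], target)
            for i in range(len(xs))]

def broom(xs, ys, target, min_points=2):
    """Extrapolations from all points, then the first n-1, first n-2, ..."""
    return [extrapolate(xs[:k], ys[:k], target)
            for k in range(len(xs), min_points - 1, -1)]

# Hypothetical monthly failure rate estimates for ages 1..6 months.
ages = [1, 2, 3, 4, 5, 6]
rates = [0.0010, 0.0012, 0.0011, 0.0015, 0.0016, 0.0018]
jack = jackknife(ages, rates, 36)
sweep = broom(ages, rates, 36)
```

The spread of the values in `jack` and `sweep` gives a rough picture of extrapolation uncertainty at 36 months; plotting the `sweep` lines together produces the broom shape.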
43. Possible Reliability Futures
• MTBF no longer a specification?
• Less Weibull? More inverse Gauss?
• Consumer bills of rights? WikiReliability?
– Do not track by serial number or name (privacy), unless
reduced sample uncertainty is worth the costs
• More uncertainty and risk analysis?
– Risk equity, FMERD…
– Dempster-Shafer Theory of Evidence, belief
– Statisticians work on causal inference and vv
• What do you think? What's needed?
44. REFERENCES
• AFI 21-104, Selective Management of Selected Gas Turbine Engines, Air Force Instruction
21-104, Air Force Materiel Command, June 1994, http://afpubs.hq.af.mil
• McLinn, James, A Short History of Reliability, ASQ Reliability Review, Vol. 30, No. 1, pp.
11-18, March 2010
• Barlow, Richard E. and Frank Proschan, Historical Background of the Mathematical Theory
of Reliability, in chapter 1 of Mathematical Theory of Reliability, John Wiley, SIAM, New
York, 1965
• Geisler, Murray and H. W. Karr, The design of military supply tables for spare parts,
Operations Research, Vol. 4, No. 4, pp. 431-442, 1956
• Kamins, Milton and J. J. McCall, Rules for Planned Replacement of Aircraft and Missile
Parts, RAND RM-2810-PR, Nov. 1961
• Saleh, J. H. and K. Marais, Highlights from the early (and pre-) history of reliability
engineering, Reliability Engineering and System Safety, Vol. 91, No. 2, pp. 249-256, Feb. 2006
• ISO 26000, Guidance on Social Responsibility, Draft International Standard, 2009
• Lee, Miky, Craig Hillman, and Duksoo Kim, How to predict failure mechanisms in LED and laser
diodes, Aug. 2005, http://www.dfrsolutions.com/uploads/publications/2005_MAE_LED_article.pdf
45. References by George
• Estimation of a Hidden Service Distribution of an M/G/∞ Service System, Naval Research
Logistics Quarterly, pp. 549-555, September 1973, Vol. 20, No. 3. co-author A. Agrawal
• A Note on Estimation of a Hidden Service Distribution of an M/G/∞ Service System,
Random Samples, ASQC Santa Clara Valley June 1994
• Origin-Destination Proportions and Travel-Time Distributions Without Surveys, INFORMS
Salt Lake City, May 2000, http://www.fieldreliability.com/OD.ppt
• Biomedical Survival Analysis vs. Reliability: Comparison, Crossover, and Advances, The J.
of the RIAC, pp. 1-5. Q4-2003, http://www.theriac.org/DeskReference/viewDocument.php?id=85&Scope=reg
• Failure Modes and Effects Risk Diagnostics, http://www.fieldreliability.com/FMERD.htm
• Nonparametric Forecasts from Left-Censored Failures,
http://www.fieldreliability.com/QPMeeker.doc, Dec. 2010
• LED Reliability Analysis, ASQ Reliability Review, Vol. 30. No. 4, pp.4-11,
http://www.fieldreliability.com/PhilLEDs.doc, Dec. 2010
• Credible Reliability Prediction, ASQ Reliability Division Monograph,
http://www.asq.org/reliability/quality-information/publications-reliability.html, 2003
• Nonparametric Forecasts From Left-Censored Data,
http://www.fieldreliability.com/QPMeeker.doc, Dec. 2010