2. The importance of reliability
Electrical, electronic and mechanical equipment is used in many
fields: in industry for the control of processes, in computers,
medical electronics, atomic energy, weapon systems, defence
equipment, communications, navigation at sea and in the air, and
many others. It is essential that this equipment operate reliably
under all the conditions in which it is used. In the air navigation,
military and atomic energy fields, for instance, failure could
result in a dangerous situation.
Very complicated systems involving large numbers of separate
units, such as avionic and aerospace electronic systems, are coming
into use more and more. Because each individual part is liable to
failure, overall reliability decreases as the number of component
parts grows unless the reliability of each part can be improved.
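The effect of part count can be illustrated with a short sketch. It assumes a series arrangement (the system works only if every part works), identical parts, and independent failures; none of these assumptions comes from the slides, and the function name is hypothetical.

```python
def series_reliability(part_reliability: float, n_parts: int) -> float:
    """Reliability of a series system of n identical, independent parts.

    Assumption (not from the slides): the system fails as soon as any
    single part fails, so the part reliabilities multiply.
    """
    if not 0.0 <= part_reliability <= 1.0:
        raise ValueError("part_reliability must lie between 0 and 1")
    if n_parts < 0:
        raise ValueError("n_parts must be non-negative")
    return part_reliability ** n_parts


if __name__ == "__main__":
    # Each part is 99.9% reliable; watch the system figure fall as n grows.
    for n in (10, 100, 1000):
        print(n, series_reliability(0.999, n))
```

With parts that are each 99.9% reliable, the system figure falls to roughly 0.90 for 100 parts and roughly 0.37 for 1000 parts, which is why part-level reliability has to improve as systems grow.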
3. Mechanical reliability
Well-reported failures such as the Space Shuttle Challenger disaster, the
Chernobyl nuclear accident and the Bhopal gas escape vividly emphasize the
necessity for mechanical reliability.
Buildings, bridges, transit systems, railways, automotive systems, robots, offshore
structures, oil pipe lines and tanks, steam turbine plates, roller bearings, etc., all
have their particular modes of failure affecting their reliability.
There are a number of common modes of mechanical failures, which are worth
listing, e.g. with structures:
(1) Corrosion failures
(2) Fatigue failures
(3) Wear failures
(4) Fretting failures
(5) Creep failures
(6) Impact failures
These may be considered the main failure modes, but there are of course many
others, such as ductile rupture, thermal shock, galling, brinelling, spalling,
radiation damage, etc.
A "failure" is any inability of a part or equipment to carry out its
specified function.
4. Reliability Engineering
• Reliability engineering is an engineering field that
deals with the study, evaluation, and life-cycle
management of reliability: the ability of a system or
component to perform its required functions under
stated conditions for a specified period of time.
• Reliability engineering is a sub-discipline of
systems engineering. Reliability is often measured as
probability of failure, frequency of failures, or in
terms of availability, a probability derived from
reliability and maintainability. Maintainability and
maintenance are often important parts of reliability
engineering.
5. Well-publicized system failures such as those listed below may
also have contributed to more serious consideration of reliability
in product design:
• Space Shuttle Challenger disaster:
This disaster occurred in 1986, and all crew members lost
their lives. Its main cause was design defects.
• Chernobyl nuclear reactor explosion:
This disaster also occurred in 1986, in the former
Soviet Union, and 31 lives were lost. It was also the
result of design defects.
• Point Pleasant Bridge disaster:
This bridge, located on the West Virginia/Ohio border,
collapsed in 1967. The disaster resulted in the loss
of 46 lives, and its basic cause was the metal fatigue
of a critical eye bar.
6. RELIABILITY SPECIALIZED AND APPLICATION AREAS
• Mechanical reliability.
This is concerned with the reliability of mechanical
items. Many textbooks and other publications have
appeared on this topic.
Examples:
  - Critical mechanical component assessment
  - Shaft strength
  - Selection of flexible couplings and transmission brakes
  - Gear life assessment; screening of belt drives
  - Assessment of bearing life, load ratings of slider bearings and shaft
    sealing devices
  - Bolt loading and lubrication systems
7. • Software reliability.
This is an important emerging area of reliability as
the use of computers is increasing rapidly.
• Human reliability.
In the past, systems have often failed not because of
technical faults but because of human error. The
first book on the topic appeared in 1986.
• Reliability optimization.
This is concerned with the reliability optimization of
engineering systems.
• Reliability growth.
This is concerned with monitoring the reliability
growth of engineering systems during their design
and development.
8. • Structural reliability.
This is concerned with the reliability of
engineering structures, in particular civil
engineering structures.
• Power system reliability.
This is a well-developed area concerned with the
application of reliability principles to conventional
power system problems. Many books and a vast
number of other publications on the subject have
appeared over the years.
9. • Robot reliability and safety.
This is an emerging area concerned with the application
of basic reliability and safety principles to robot-related
problems.
• Life cycle costing.
This is an important subject that is directly
related to reliability. In particular, knowledge of
a product's failure rate is essential when
estimating its ownership cost.
• Maintainability.
This is closely coupled to reliability and is
concerned with how easily the product can be
maintained.
10. RELIABILITY HISTORY
• The history of the reliability discipline goes back to the
early 1930s, when probability concepts were applied to
problems related to electric power generation. During
World War II, Germans applied basic reliability
concepts to improve the reliability of their V1 and V2
rockets.
• In 1947, Aeronautical Radio, Inc. and Cornell University
conducted a reliability study of over 100,000 electronic
tubes. In 1950, an ad hoc committee on reliability was
established by the United States Department of
Defense, and in 1952 it was transformed into a
permanent body: the Advisory Group on the Reliability of
Electronic Equipment (AGREE).
11. RELIABILITY HISTORY
• In 1951, Weibull published a statistical function that subsequently
became known as the Weibull distribution. In 1952, the exponential
distribution gained a distinct edge after Davis published a paper
presenting failure data and the results of various goodness-of-fit
tests for competing failure distributions.
• In 1954, a National Symposium on Reliability and Quality Control was
held for the first time in the United States, and in the following year
the Institute of Electrical and Electronics Engineers (IEEE) formed an
organization called the Reliability and Quality Control Society. During
the following two years, three important documents concerning
reliability appeared:
  - 1956: a book entitled Reliability Factors for Ground Electronic Equipment
  - 1957: the AGREE report
  - 1957: the first military reliability specification, MIL-R-25717 (USAF):
    Reliability Assurance Program for Electronic Equipment
• In 1962, the Air Force Institute of Technology of the United States Air
Force (USAF), Dayton, Ohio, started the first master's degree
program in system reliability engineering. Since the inception of
the reliability field, many individuals have contributed to it and
hundreds of publications on the topic have appeared.
12. TERMS AND DEFINITIONS
• Reliability: This is the probability that an item will
carry out its assigned mission satisfactorily for the
stated time period when used under the specified
conditions.
• Failure: This is the inability of an item to function
within the initially defined guidelines.
• Downtime: This is the time period during which the
item is not in a condition to carry out its stated
mission.
• Maintainability: This is the probability that a failed
item will be repaired to its satisfactory working state.
• Redundancy: This is the existence of more than one
means for accomplishing a defined function.
13. Active redundancy: This is a type of redundancy when all redundant
items are operating simultaneously.
Availability: This is the probability that an item is available for
application or use when needed.
Useful life: This is the length of time an item operates within an
acceptable level of failure rate.
Mission time: This is the time during which the item is performing its
specified mission.
Human error: This is the failure to perform a given task (or the
performance of a forbidden action) that could lead to disruption of
scheduled operations or result in damage to property/equipment.
Human reliability: This is the probability of completing a job/task
successfully by humans at any required stage in the system operation
within a defined minimum time limit (if the time requirement is
specified).
14. MEAN TIME BETWEEN FAILURES (MTBF): The mean exposure
time between consecutive failures of a component. This applies to
repairable items: if an item fails, say, 5 times over a period of use
totaling 1000 hours, the MTBF is 1000/5 = 200 hours.
MEAN TIME BETWEEN MAINTENANCE (MTBM): The average
time between all maintenance events that cause downtime, both
preventive and corrective, including any associated logistics
delay time.
MEAN TIME TO FAILURE (MTTF): The average time that elapses
until a failure occurs. MTTF is commonly used for non-repairable
items such as fuses or bulbs.
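The MTBF arithmetic above, and the analogous MTTF average, can be sketched as follows. The function names and the sample lifetimes are illustrative assumptions, not data from the slides.

```python
from typing import Sequence


def mtbf(total_operating_hours: float, n_failures: int) -> float:
    """Mean time between failures of a repairable item, in hours."""
    if n_failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return total_operating_hours / n_failures


def mttf(times_to_failure: Sequence[float]) -> float:
    """Mean time to failure of non-repairable items, in hours.

    Each entry is the observed lifetime of one item (e.g. one bulb).
    """
    if not times_to_failure:
        raise ValueError("at least one observed lifetime is needed")
    return sum(times_to_failure) / len(times_to_failure)


if __name__ == "__main__":
    # The worked example from the text: 5 failures in 1000 hours of use.
    print(mtbf(1000.0, 5))                 # 200.0
    # Hypothetical lifetimes of three bulbs, for illustration only.
    print(mttf([900.0, 1000.0, 1100.0]))   # 1000.0
```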
15. NEED OF RELIABILITY IN PRODUCT DESIGN
• Many factors have been responsible for the
consideration of reliability in product design, including
product complexity, insertion of reliability-related clauses
in design specifications, competition, awareness of cost
effectiveness, public demand, and past system failures.
Some of these factors are described below in detail.
• Even if we consider the increase in product complexity
with respect to parts alone, some products have grown
phenomenally. For example, today a typical
Boeing 747 jumbo jet airplane is made up of approximately
4.5 million parts, including fasteners. Even for relatively
simple products, there has been a significant increase in
complexity with respect to parts. For example, in 1935 a
farm tractor was made up of 1200 critical parts, and by 1990
the number had increased to around 2900.
16. RELIABILITY IN THE PRODUCT DESIGN PROCESS
• The reliability of a design is, to a large extent, determined by
the reliability tasks performed during product design.
• These reliability tasks include: defining reliability
requirements, using reliability design
standards/guides/checklists, allocating reliability, predicting
reliability, reliability modeling, monitoring
subcontractor/supplier reliability activities, performing
failure modes, effects and criticality analysis, monitoring
reliability growth, assessing software reliability, preparing
a critical items list, and performing electronic parts/circuits
tolerance analysis.
• Reliability tasks such as those listed above, if performed
effectively, will contribute tremendously to the product
design.
17. NEED OF QUALITY IN PRODUCT DESIGN
• The importance of quality in business and
industry is increasing rapidly because of factors
such as competition, growing demand from
customers for better quality, an increasing number
of quality-related lawsuits, and the global
economy. The cost of quality control accounts
for around 7–10% of the total sales revenue of
manufacturers. Today, companies are faced with
reducing this amount while at the same time
improving the quality of products and services
for their survival in the internet economy.
18. Reliability Engineering Department Responsibilities
A reliability engineering department may have various kinds of
responsibilities. The major ones are as follows:
• Establishing reliability policy, plans and procedures
• Reliability allocation
• Reliability prediction
• Specification and design reviews with respect to reliability
• Reliability growth monitoring
• Providing reliability-related inputs to design specifications and proposals
• Reliability demonstration
• Training reliability manpower and performing reliability-related research
and development work
• Monitoring the reliability activities of subcontractors, if any
• Auditing the reliability activities
• Failure data collection and reporting
• Failure data analysis
• Consulting
19. Definition of Reliability
• Reliability is the probability of a device
performing its purpose adequately for the
period intended under the given operating
conditions.
This definition highlights four important factors:
  - the reliability of a device is expressed as a probability
  - the device is required to perform adequately
  - the duration of adequate performance is specified
  - the operating conditions are prescribed
20. Definition of Maintainability
Maintainability is a measure of the speed with which
loss of performance is detected, diagnosed and made
good.
Maintainability is the probability that a unit or system
will be restored to specified conditions within a given
period when maintenance action is taken in accordance
with prescribed procedures and resources.
It is a characteristic of the design and installation of the
unit or system.
The "availability", or time an equipment is functioning
correctly while in use, depends on both reliability and
maintainability.
21. Definition of Availability
Availability is defined as the percentage of
time that a system is available to perform its required
function(s).
It is measured in a variety of ways, but it is principally
a function of downtime.
Availability can be used to describe a component or
system, but it is most useful when describing the nature
of a system of components working together. Because it
is the fraction of time spent in the "available" state, its
value always lies between 0 and 1. Thus,
availability is most often written as a decimal, as
in 0.99999, or as a percentage, as in 99.999%.
22. Availability
• Availability
This is the probability that an item is available for
application or use when needed.
  - Maintainability together with reliability
    determines the availability of a machinery
    system. Availability is influenced by the time
    demands of preventive and corrective
    maintenance measures.
  - Availability (A) is measured by:
    A = MTBF / (MTBF + MTTR)
    where MTTR is the mean time to repair.
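A minimal sketch of the availability calculation, A = MTBF / (MTBF + MTTR), is given here. The function name and the 8-hour repair time are illustrative assumptions; the 200-hour MTBF reuses the worked MTBF example from the definitions.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability A = MTBF / (MTBF + MTTR).

    Note the parentheses: the denominator is the whole sum MTBF + MTTR.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # MTBF of 200 hours with a hypothetical mean repair time of 8 hours.
    a = availability(200.0, 8.0)
    print(f"A = {a:.4f} ({a:.2%})")
```

Shortening the repair time raises availability just as lengthening the MTBF does, which is why maintainability and reliability are treated together.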
23. Quality and reliability
The quality of a device is the degree of conformance to
applicable specification and workmanship standards.
What is the difference between quality and reliability?
Quality means good performance and longevity.
The quality of any manufactured product is determined by its design,
the materials from which it is made and the processes used in its
manufacture.
Quality control measures performance and its variations from
specimen to specimen by statistical methods to determine
whether production satisfies the design requirements.
The quality of a product is determined by conformity and reliability.
For reliability, what matters is how long a product will maintain its
original characteristics when in operation.
24. Reliability activity in system design
For large engineering systems, management of design and reliability becomes
an important issue.
Reliability design begins with the development of a (system) model. Reliability
and availability models use block diagrams and fault trees to provide a
graphical means of evaluating the relationships between different parts of the
system. These models may incorporate predictions based on failure rates
taken from historical data. While the (input data) predictions are often not
accurate in an absolute sense, they are valuable for assessing relative differences
between design alternatives. Maintainability parameters, for example MTTR, are
other inputs to these models.
The most important fundamental initiating causes and failure mechanisms are
to be identified and analyzed with engineering tools.
A diverse set of practical guidance and performance and reliability
requirements should be provided to designers so that they can generate
low-stressed designs and products that protect against, or are protected
against, damage and excessive wear.
25. A Fault Tree Diagram
One of the most important design techniques is redundancy. This means that if
one part of the system fails, there is an alternate success path, such as a backup
system. By creating redundancy, together with a high level of failure monitoring
and the avoidance of common cause failures, even a system with relatively poor
single-channel (part) reliability can be made highly reliable (mission reliability)
at the system level.
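The gain from redundancy can be sketched numerically. The example assumes active redundancy with fully independent channels (no common cause failures) and perfect failure detection, so the system fails only if every channel fails; the function name and the 0.90 channel reliability are illustrative assumptions.

```python
from typing import Iterable


def parallel_reliability(channel_reliabilities: Iterable[float]) -> float:
    """Reliability of a redundant (parallel) system of independent channels.

    The system fails only when all channels fail, so the channel
    unreliabilities (1 - r) multiply.
    """
    prob_all_fail = 1.0
    for r in channel_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must lie between 0 and 1")
        prob_all_fail *= 1.0 - r
    return 1.0 - prob_all_fail


if __name__ == "__main__":
    # A relatively poor single channel (0.90) duplicated and triplicated.
    for n in (1, 2, 3):
        print(n, round(parallel_reliability([0.90] * n), 6))
```

Under these assumptions two 0.90 channels give about 0.99 and three give about 0.999. Common cause failures break the independence assumption and reduce the benefit, which is why the text stresses avoiding them.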
26. Furthermore, by combining redundancy with dissimilar design and
manufacturing processes (different suppliers) for the individual independent
channels, very high levels of reliability can be achieved at all stages of
the development cycle (early life and long term).
Redundancy can also be applied in systems engineering by double-checking
requirements, data, designs, calculations, software and tests to
overcome systematic failures.
Another design technique to prevent failures is called physics of failure.
This technique relies on understanding the physical static and dynamic
failure mechanisms. It accounts, at a high level of detail, for the variation
in load, strength and stress that leads to failure. This is made possible by
modern Finite Element Method (FEM) software programs that can
handle complex geometries and mechanisms such as creep, stress relaxation
and fatigue, and by probabilistic design (Monte Carlo simulations / DOE). The
material or component can be redesigned to reduce the probability of
failure and to make it more robust against variation.
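The probabilistic design idea can be illustrated with a small Monte Carlo sketch of load-strength variation. It assumes, purely for illustration, that load and strength are independent and normally distributed; the numerical values and function names are hypothetical, and a real physics-of-failure study would take its distributions from test data or FEM results.

```python
import math
import random
from statistics import NormalDist


def monte_carlo_failure_probability(mean_strength: float, sd_strength: float,
                                    mean_load: float, sd_load: float,
                                    n_trials: int = 100_000,
                                    seed: int = 0) -> float:
    """Estimate P(load > strength) by random sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    failures = 0
    for _ in range(n_trials):
        strength = rng.gauss(mean_strength, sd_strength)
        load = rng.gauss(mean_load, sd_load)
        if load > strength:
            failures += 1
    return failures / n_trials


def analytic_failure_probability(mean_strength: float, sd_strength: float,
                                 mean_load: float, sd_load: float) -> float:
    """Closed-form check, valid only for independent normal load and strength."""
    z = (mean_strength - mean_load) / math.hypot(sd_strength, sd_load)
    return NormalDist().cdf(-z)


if __name__ == "__main__":
    args = (500.0, 40.0, 400.0, 30.0)  # hypothetical values, e.g. in MPa
    print("simulated:", monte_carlo_failure_probability(*args))
    print("analytic: ", analytic_failure_probability(*args))
```

Increasing the mean strength or reducing either standard deviation lowers the estimated failure probability, which is the sense in which a redesign makes the component more robust against variation.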
27. Another common design technique is component derating: selecting
components whose tolerance significantly exceeds the expected stress, for
example using a heavier gauge wire than the expected electrical current
would normally require.
Another effective way to deal with unreliability is to perform
analyses that predict degradation, so that unscheduled downtime
events/failures can be prevented. Reliability Centered Maintenance (RCM)
programs can be used for this.
Many tasks, techniques and analyses are specific to particular industries
and applications. Commonly these include:
• Built-in test (BIT) (testability analysis)
• Failure mode and effects analysis (FMEA)
• Reliability hazard analysis
• Reliability block diagram analysis
• Fault tree analysis
• Root cause analysis
28. • Accelerated testing
• Reliability growth analysis
• Weibull analysis
• Thermal analysis by finite element analysis (FEA) and/or measurement
• Thermally induced, shock and vibration fatigue analysis by FEA and/or
measurement
• Electromagnetic analysis
• Statistical interference
• Predictive and preventive maintenance: Reliability Centered Maintenance
(RCM) analysis
• Human error analysis
• Operational hazard analysis
Results are presented during the system design reviews and logistics reviews.
Reliability is just one requirement among many system design requirements.
29. Probability Basics
As the basis for reliability theory is probability, this section presents basic
properties of probability. Some of these properties are as follows:
• The probability of occurrence of an event, say A, is
    0 \le P(A) \le 1    (2.11)
• The probability of the sample space S is
    P(S) = 1    (2.12)
• The probability of the negation of the sample space S is
    P(\bar{S}) = 0    (2.13)
where \bar{S} is the negation of the sample space S.
• The probability of the union of n independent events is
    P(A_1 + A_2 + A_3 + \cdots + A_n) = 1 - \prod_{i=1}^{n} (1 - P(A_i))    (2.14)
where A_i is the i-th event, for i = 1, 2, …, n, and
P(A_i) is the probability of occurrence of event A_i, for i = 1, 2, …, n.
For n = 2, Equation (2.14) reduces to
30. P(A_1 + A_2) = P(A_1) + P(A_2) - P(A_1) P(A_2)    (2.15)
• The probability of the union of n mutually exclusive events is
    P(A_1 + A_2 + A_3 + \cdots + A_n) = \sum_{i=1}^{n} P(A_i)    (2.16)
• The probability of an intersection of n independent events is
    P(A_1 A_2 \cdots A_n) = P(A_1) P(A_2) \cdots P(A_n)    (2.17)
• The probability of occurrence and nonoccurrence of an event, say A, is
    P(A) + P(\bar{A}) = 1    (2.18)
where
P(A) is the probability of occurrence of A.
P(\bar{A}) is the probability of nonoccurrence of A.
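The union and intersection rules, Equations (2.14) to (2.18), can be checked numerically with a short sketch. The function names and the probabilities 0.2 and 0.3 are illustrative assumptions.

```python
from math import prod
from typing import Sequence


def union_independent(probs: Sequence[float]) -> float:
    """Eq. (2.14): P(A_1 + ... + A_n) = 1 - prod(1 - P(A_i)), independent events."""
    return 1.0 - prod(1.0 - p for p in probs)


def union_mutually_exclusive(probs: Sequence[float]) -> float:
    """Eq. (2.16): P(A_1 + ... + A_n) = sum(P(A_i)), mutually exclusive events."""
    total = sum(probs)
    if total > 1.0 + 1e-12:
        raise ValueError("mutually exclusive probabilities cannot sum above 1")
    return total


def intersection_independent(probs: Sequence[float]) -> float:
    """Eq. (2.17): P(A_1 A_2 ... A_n) = prod(P(A_i)), independent events."""
    return prod(probs)


def complement(p: float) -> float:
    """Eq. (2.18): probability of nonoccurrence, 1 - P(A)."""
    return 1.0 - p


if __name__ == "__main__":
    p1, p2 = 0.2, 0.3
    # Eq. (2.15): for n = 2 the union reduces to P1 + P2 - P1*P2.
    print(union_independent([p1, p2]), p1 + p2 - p1 * p2)
    print(union_mutually_exclusive([p1, p2]))
    print(intersection_independent([p1, p2]))
    print(complement(p1))
```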
31. DEFINITION OF PROBABILITY
This is expressed as
    P(A) = \lim_{n \to \infty} (N / n)
where
P(A) is the probability of occurrence of event A.
N is the number of times that A occurs in the n repeated experiments.
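The limiting relative frequency N / n can be illustrated by simulation. The sketch below uses a fair six-sided die and takes A to be "a six is rolled"; this experiment is an assumption chosen for illustration and is not in the slides.

```python
import random


def relative_frequency(n_trials: int, seed: int = 0) -> float:
    """Return N / n, where N counts how often a fair die shows a six."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    occurrences = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 6)
    return occurrences / n_trials


if __name__ == "__main__":
    # N / n should settle near the true probability 1/6 as n grows.
    for n in (100, 10_000, 200_000):
        print(n, relative_frequency(n))
```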
Bayes' theorem:
Mathematically, Bayes' theorem gives the relationship between
the probabilities of A and B, P(A) and P(B), and the conditional
probabilities of A given B and B given A, P(A|B) and P(B|A).
In its most common form, it is:
    P(A|B) = P(B|A) P(A) / P(B)
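A small numerical sketch of Bayes' theorem in its standard form, P(A|B) = P(B|A) P(A) / P(B), follows. The inspection scenario, its numbers, and the function name are hypothetical, and P(B) is expanded with the law of total probability, which the slides do not state.

```python
def bayes_posterior(prior_a: float,
                    p_b_given_a: float,
                    p_b_given_not_a: float) -> float:
    """Return P(A|B) = P(B|A) P(A) / P(B).

    P(B) is obtained from the law of total probability:
    P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)).
    """
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0.0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return p_b_given_a * prior_a / p_b


if __name__ == "__main__":
    # Hypothetical inspection: A = part is defective, B = test flags the part.
    # 2% of parts are defective; the test flags 95% of defective parts
    # and wrongly flags 10% of good parts.
    print(bayes_posterior(prior_a=0.02, p_b_given_a=0.95, p_b_given_not_a=0.10))
```

In this made-up case a flagged part is defective only about 16% of the time, because good parts greatly outnumber defective ones.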
32. Distribution functions
A more useful diagram for continuous data is the probability density function.
The y-axis shows the proportion of values measured in a range (shown on the
x-axis) rather than the frequency, as in a histogram. As the ranges (or intervals)
are made narrower, the histogram approaches a curve that describes the
distribution of the measurements or values.
This distribution is the probability density function, or PDF, f(x). Figure 4, below,
shows an example of a PDF. The area under the curve of the distribution is
equal to 1, i.e.
    \int_{-\infty}^{\infty} f(x) \, dx = 1
The probability of a value falling between any two values x_1 and x_2 is the
area bounded by this interval, i.e.
    P(x_1 \le x \le x_2) = \int_{x_1}^{x_2} f(x) \, dx
33. Cumulative Distribution Function
In reliability, since we are usually discussing time, we change x to t, i.e. f(t).
The cumulative distribution function, or CDF, F(t), gives the probability that a
measured value will fall between -∞ and t, i.e.
    F(t) = \int_{-\infty}^{t} f(u) \, du
34. Figure 5, below, shows the CDF; as t tends to ∞, F(t) tends to 1.
35. In reliability engineering we are concerned with the probability that an item
will survive for a stated interval of time (or cycles, distance, etc.), i.e. that there
is no failure in the interval (0 to t). This is known as the survival function or
reliability function and is denoted R(t). From the definition:
    R(t) = 1 - F(t) = \int_{t}^{\infty} f(u) \, du
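The relationship between the PDF f(t), the CDF F(t) and the reliability function R(t) = 1 - F(t) can be sketched for one concrete case. The exponential distribution with a constant failure rate is assumed here only for illustration, and the rate of 1/200 per hour is a hypothetical value.

```python
import math


def pdf(t: float, rate: float) -> float:
    """Exponential PDF f(t) = rate * exp(-rate * t) for t >= 0."""
    return rate * math.exp(-rate * t) if t >= 0 else 0.0


def cdf(t: float, rate: float) -> float:
    """Exponential CDF F(t) = 1 - exp(-rate * t): probability of failure by t."""
    return 1.0 - math.exp(-rate * t) if t >= 0 else 0.0


def reliability(t: float, rate: float) -> float:
    """Survival (reliability) function R(t) = 1 - F(t)."""
    return 1.0 - cdf(t, rate)


def integrate_pdf(a: float, b: float, rate: float, steps: int = 10_000) -> float:
    """Area under the PDF between a and b by the trapezoidal rule."""
    h = (b - a) / steps
    area = 0.5 * (pdf(a, rate) + pdf(b, rate))
    for k in range(1, steps):
        area += pdf(a + k * h, rate)
    return area * h


if __name__ == "__main__":
    rate = 1.0 / 200.0  # hypothetical: one failure per 200 hours on average
    t = 100.0
    print("F(t) from formula: ", cdf(t, rate))
    print("F(t) from PDF area:", integrate_pdf(0.0, t, rate))
    print("R(t) = 1 - F(t):   ", reliability(t, rate))
```

The area under the PDF from 0 to t reproduces F(t), and the remaining area beyond t is R(t), so the two always sum to 1.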
36. KEY POINTS
• Reliability is a measure of uncertainty, and therefore
estimating reliability means using statistics and
probability theory
• Reliability is quality over time
• Reliability must be designed into a product or service
• The most important aspect of reliability is to identify the
causes of failure and eliminate them in design if possible;
otherwise, identify ways of accommodating them
• Reliability is defined as the ability of an item to
perform a required function without failure under stated
conditions for a stated period of time
• The costs of unreliability can be damaging to a
company