2. The Need for Safety
Instrumentation
Managing and equipping industrial plant with the
right components and sub-systems for optimal
operational efficiency and safety is a complex
task. Safety Systems Engineering (SSE)
describes a disciplined, systematic approach,
which encompasses hazard identification, safety
requirements specification, safety systems
design and build, and systems operation and
maintenance over the entire lifetime of plant.
The foregoing activities form what has become
known as the “safety life-cycle” model, which is
at the core of current and emerging safety
related system standards.
3. Risk and Risk Reduction
Methods
Safety methods employed to protect against or mitigate
harm or damage to personnel, plant and the environment,
and to reduce risk, include:
• Changing the process or engineering design
• Increasing mechanical integrity of the system
• Improving the Basic Process Control System (BPCS)
• Developing detailed training and operational procedures
• Increasing the frequency of testing of critical system
components
• Using a Safety Instrumented System (SIS)
• Installing mitigating equipment
5. Other terms used for safety
systems are:
Safety Instrumented Systems (SIS),
Emergency Shutdown System (ESD),
Safety Related System (SRS), or
E/E/PE Safety Related System (E/E/PE =
Electric/Electronic/Programmable
Electronic)
6. Objectives of a shutdown
control system
1- Protection of life
2- Protection of plant equipment
3- Avoidance of environmental pollution
4- Maximizing plant production, i.e. avoiding
unnecessary shutdowns
7. Safety, Reliability, and
Availability
a) Safety
Safety means a sufficient protection from
danger.
• Safety-related controls are needed e.g. for
trains, lifts, escalators, burners, etc. The
safe controls must be designed in such a way
that no component fault or other
foreseeable influence causes
dangerous states in the plant.
8. The safe state
is the state into which a system can be put from
its current operational state and which has a
system-specific lower hazard potential than the
operational state. The absolutely safe state is the one
with the lowest amount of energy involved. Quite often it
is not possible to reach the safe state without
any danger involved just by switching the device
off (e.g. a plane). The plane in the air, taken as a
system, has no safe state. Here the risk can only
be reduced by redundant equipment (e.g. for
propulsion and navigation systems).
9. Safety
is measured primarily by a parameter
called Average Probability of Failure
on Demand (PFDavg). This indicates
the chance that a SIS will not perform
its preprogrammed action during a
specified interval of time (usually the
time between periodic inspections).
10. Reliability
Reliability is the ability of a technical device to fulfill its
function during its operation time.
This is often no longer possible if one component has a
failure. So the MTBF (Mean Time
Between Failures) is often taken as a measure of
reliability. It can be calculated either
statistically from systems in operation or from the failure
rates of the components applied.
Reliability does not say anything about the safety of a
system! Unreliable systems are safe if
each individual failure puts the plant into the safe
state.
11. Availability
Availability is the probability that a system is
functioning. It is expressed in per cent and is defined by
the mean operating time between two failures (MTBF)
and the mean down time (MDT), according to the
following formula:
Availability = MTBF / (MTBF + MDT) × 100 %
The mean down time (MDT) consists of the fault detection time and,
in modular systems, the time it takes to replace defective modules.
The availability of a system is greatly increased by a short fault
detection time. Fast fault detection in modern electronic systems is
obtained via automatic test routines and a detailed diagnostic display.
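The effect of a shorter down time on availability can be sketched numerically. This is only an illustration, assuming the usual steady-state relation Availability = MTBF / (MTBF + MDT); the function name and the figures are hypothetical and not taken from any particular system.

```python
def availability(mtbf_hours: float, mdt_hours: float) -> float:
    """Return steady-state availability as a fraction between 0 and 1.

    Assumes the usual relation A = MTBF / (MTBF + MDT).
    """
    if mtbf_hours <= 0 or mdt_hours < 0:
        raise ValueError("MTBF must be positive and MDT non-negative")
    return mtbf_hours / (mtbf_hours + mdt_hours)


# Hypothetical figures: same MTBF, but faster fault detection shortens the MDT.
slow_detection = availability(mtbf_hours=10_000, mdt_hours=24)
fast_detection = availability(mtbf_hours=10_000, mdt_hours=2)

print(f"MDT 24 h: {slow_detection:.4%}")
print(f"MDT  2 h: {fast_detection:.4%}")
```

With the MTBF held constant, the shorter MDT gives the higher availability, which is the point made about fast fault detection.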
12. The availability can be increased through redundancy, e.g. central devices
working in parallel, IO modules or multiple sensors on the same measuring point.
The redundant components are put up in a way that the function of the system is
not affected by the failure of one component.
Here as well a detailed diagnostic display is an important element of availability.
Measures designed to increase availability have no effect on safety. The
safety of redundant systems is, however, only guaranteed if there are automatic
test routines during operation or if, for example, non-safety-related sensor
circuits in a 2-out-of-3 (2oo3) arrangement are regularly checked. If one
component fails, it must be possible to switch off the defective part in a safe way.
A related measure is called Safety Availability. It is defined as the probability that
a SIS will perform its preprogrammed action when the process is operating. It
can be calculated as
follows:
Safety Availability = 1 – PFDavg
Another parameter is called the Risk Reduction Factor (RRF). It represents the
ratio of the risk without a SIS to the risk with a SIS. It can be calculated
as follows:
RRF = 1/PFDavg
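Both measures follow directly from PFDavg, so they are easy to compute side by side. The sketch below is a minimal illustration; the function names and the example PFDavg value are hypothetical.

```python
def _check_pfd(pfd_avg: float) -> None:
    """PFDavg is a probability and must be strictly positive to invert."""
    if not 0.0 < pfd_avg <= 1.0:
        raise ValueError("PFDavg must be in the range (0, 1]")


def safety_availability(pfd_avg: float) -> float:
    """Safety Availability = 1 - PFDavg."""
    _check_pfd(pfd_avg)
    return 1.0 - pfd_avg


def risk_reduction_factor(pfd_avg: float) -> float:
    """Risk Reduction Factor = 1 / PFDavg."""
    _check_pfd(pfd_avg)
    return 1.0 / pfd_avg


example_pfd = 0.01  # hypothetical value for illustration
print(f"Safety availability: {safety_availability(example_pfd):.2%}")
print(f"Risk reduction factor: {risk_reduction_factor(example_pfd):.0f}")
```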
13. What is hazard and what is
risk?
A hazard is ‘an inherent physical or
chemical characteristic that has the
potential for causing harm to people,
property, or the environment’. In chemical
processes, ‘It is the combination of a
hazardous material, an operating
environment, and certain unplanned
events that could result in an accident’.
14. Hazards Analysis
Generally, the first step in determining the levels of
protective layers required involves conducting a detailed
hazard and risk analysis. In the process industries a
Process Hazards Analysis (PHA) is generally
undertaken, which may range from a screening analysis
through to a complex Hazard and Operability (HAZOP)
study, depending on the complexity of operations and
severity of the risks involved. The latter involves a
rigorous detailed process examination by a multi-
disciplinary team comprising process, instrument,
electrical and mechanical engineers, as well as safety
specialists and management representatives.
15. Risk
‘Risk is usually defined as the combination
of the severity and probability of an event.
In other words, how often can it happen
and how bad is it when it does happen?
Risk can be evaluated qualitatively or
quantitatively.’ Roughly, risk = severity × probability.
16. Risk reduction
Risk reduction can be achieved by reducing either the
frequency of a hazardous event or its consequences or
by reducing both of them. Generally, the most desirable
approach is to first reduce the frequency since all events
are likely to have cost implications, even without dire
consequences.
Safety systems are all about risk reduction. If we can’t
take away the hazard we shall have to reduce the risk.
This means: reduce the frequency and/or reduce the
consequence.
The basic definitions of the safety-related terminology
will be studied in this course; there are three main
examples of the required safety actions, as follows:
17. Emergency Shutdown (ESD)
Typical actions from ESD systems are:
• Shutdown of part systems and equipment;
• Isolate hydrocarbon inventories;
• Isolate electrical equipment;
• Prevent escalation of events;
• Stop hydrocarbon flow;
• Depressurize / Blow down;
• Emergency ventilation control;
• Close watertight doors and fire doors.
18. Process Shutdown (PSD)
A process shutdown is defined as the automatic isolation
and de-activation of all or part of a process. During a
PSD the process remains pressurized. Basically PSD
consists of field-mounted sensors, valves and trip relays,
a system logic unit for processing of incoming signals,
alarm and HMI units. The system is able to process all
input signals and activate outputs in accordance with
the applicable Cause and Effect charts.
Typical actions from PSD systems are:
• Shutdown the whole process;
• Shutdown parts of the process;
• Depressurize / Blowdown parts of the process.
19. Fire and Gas Control (F&G)
This is denoted as the Fire Detection and Protection (FDP)
system in some other definitions. FDP provides early and
reliable detection of fire or gas wherever such events
are likely to occur, alerts personnel, and initiates protective
actions automatically or manually upon operator
activation.
Basically the system consists of field-mounted detection
equipment and manual alarm stations, a system logic
unit for processing of incoming signals, alarm and HMI
units. The system shall be able to process all input
signals in accordance with the applicable Fire Protection
Data Sheets or Cause & Effect charts. FDP SIL
requirements typically range from SIL 2 down to SIL 1, or
the system is defined as having no SIL requirement, depending
on the risk analysis.
20. Typical actions from FDP
systems are:
• Alert personnel;
• Release fire fighting systems;
• Emergency ventilation control;
• Stop flow of minor hydrocarbon sources such as
diesel distribution to consumers;
• Isolate local electrical equipment (may be done
by ESD);
• Initiating ESD and PSD actions;
• Isolate electrical equipment;
• Close watertight doors and fire doors.
21. Emergency Shutdown (ESD)
The Emergency Shutdown System (ESD) shall minimize
the consequences of emergency situations, typically related to
uncontrolled flooding, escape of hydrocarbons,
or outbreak of fire in hydrocarbon-carrying areas or
areas which may otherwise be hazardous. Traditionally,
risk analyses have concluded that the ESD system
needs a high Safety Integrity Level, typically SIL 2 or 3.
Basically the system consists of field-mounted sensors,
valves and trip relays, system logic for processing of
incoming signals, alarm and HMI units. The system is
able to process input signals and activate outputs in
accordance with the Cause & Effect charts defined for
the installation.
22. Typical actions from ESD
systems are:
• Shutdown of part systems and equipment
• Isolate hydrocarbon inventories
• Isolate electrical equipment (*)
• Prevent escalation of events
• Stop hydrocarbon flow
• Depressurize / Blowdown
• Emergency ventilation control (*)
• Close watertight doors and fire doors(*)
23. Process Shutdown (PSD)
The Process Shutdown system ensures a rapid
detection and safe handling of process upsets.
Traditionally risk analyses have concluded that the PSD
system is in need of low to medium Safety Integrity
Level.
The reason for the low to medium requirement is that
PSD systems built in accordance with API RP 14C have
requirements for both primary (the computerized system)
and secondary (mechanical devices) protection.
Basically the system consists of field-mounted sensors,
valves and trip relays, a system logic unit for processing
of incoming signals, alarm and HMI units. The system is
able to process all input signals and activate outputs in
accordance with the applicable Cause & Effect charts.
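A Cause & Effect chart is essentially a lookup from initiating causes to the outputs that must be activated. The sketch below shows one way such a chart could be represented in a logic unit; the tag names (PAHH-101, XV-101 and so on) and the chart contents are hypothetical and are not taken from any real installation.

```python
# Hypothetical Cause & Effect chart: each cause (a tripped input) maps to
# the set of effects (outputs) that must be activated.
CAUSE_AND_EFFECT = {
    "PAHH-101": {"close XV-101", "stop P-201"},  # high-high pressure
    "LALL-205": {"stop P-201"},                  # low-low level
    "TAHH-310": {"close XV-101", "open BDV-301"},  # high-high temperature
}


def effects_for(active_causes):
    """Return the union of all effects required by the active causes."""
    effects = set()
    for cause in active_causes:
        if cause not in CAUSE_AND_EFFECT:
            # An unknown input is treated as an error rather than ignored.
            raise KeyError(f"cause not in chart: {cause}")
        effects |= CAUSE_AND_EFFECT[cause]
    return effects


print(sorted(effects_for({"PAHH-101", "LALL-205"})))
```

Raising an error for an unknown cause, instead of silently ignoring it, is a design choice made for this sketch; a real system would follow its own fault-handling rules.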
24. Typical actions from PSD
systems are:
• Shutdown the whole process
• Shutdown parts of the process
• Depressurize / Blowdown parts of the
process
25. Fire / gas Detection and
Protection (FDP)
Typical actions from FDP systems are:
• Alert personnel
• Release fire fighting systems
• Emergency ventilation control (*)
• Stop flow of minor hydrocarbon sources such as diesel
distribution to consumers. (*)
• Isolate local electrical equipment (may be done by ESD)
• Initiating ESD and PSD actions
• Isolate electrical equipment (*)
• Close watertight doors and fire doors(*)
(*) - May alternatively form a part of the Emergency
ShutDown system
26. Safety Process General
Overview
Safety by definition is the “absence of
risk”. There is risk in everything we do, so
the safety
process model is designed to effectively
identify & reduce risk. This includes:
• Physical plant risk;
• Human factor-related risk;
• Attitudinal Risk.
27. Sustained improvements in accident prevention can only
come from changes to the overall mix of the above
factors.
The model defines workplace risk by the formula:
RISK = Employee Exposure × Probability of the Accident
Sequence Taking Place × Potential Consequence of the
Accident
Note that Risk = Consequence × Frequency, and
Frequency = Demand rate × Probability of failure of the
safety function.
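The two relations (Risk = Consequence × Frequency, and Frequency = Demand rate × Probability of failure of the safety function) can be chained into a short calculation. The sketch below is illustrative only; the function names, the demand rate, the failure probability and the monetary consequence are all hypothetical.

```python
def hazardous_event_frequency(demand_rate_per_year: float, pfd: float) -> float:
    """Frequency = demand rate x probability of failure of the safety function."""
    if demand_rate_per_year < 0:
        raise ValueError("demand rate must be non-negative")
    if not 0.0 <= pfd <= 1.0:
        raise ValueError("probability of failure must be between 0 and 1")
    return demand_rate_per_year * pfd


def risk(consequence: float, frequency_per_year: float) -> float:
    """Risk = consequence x frequency, in consequence units per year."""
    return consequence * frequency_per_year


# Hypothetical example: one demand every two years, safety function fails
# on 1 demand in 100, and the event costs 2,000,000 (any currency).
frequency = hazardous_event_frequency(demand_rate_per_year=0.5, pfd=0.01)
print(f"Hazardous event frequency: {frequency} per year")
print(f"Risk: {risk(2_000_000, frequency):.0f} per year")
```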
We can define the Five-Step Safety Process Model as
follows:
29. • Step 1: Identification of risks that are
producing accidents and injuries.
• Step 2: Perform accident / incident
problem-solving on each identified risk:
Process includes:
1. Definition of problem
2. Contributing factors
3. Root causes
• Step 3: Develop a schedule for
implementation of each preventive action
Preventive actions should all have:
1. Responsible party
2. Resources to support actions
3. Timetable for completion
30. • Step 4: Continuously measure to ensure
preventive actions are working as expected.
Measure timetable to ensure each action is
enabled.
• Step 5: Employees involved in work
environment must be given feedback on a
continuous basis
(i.e. positive reinforcement).
32. Risk Evaluation
There is no such thing as zero risk. This is
because no physical item has a zero
failure rate, no human being makes zero
errors and no piece of software design can
foresee every possibility.
33. Key Questions to Ask
A process control engineer implementing a
Safety Instrumented System must answer
several
questions:
1. What level of risk is acceptable?
2. How many layers of protection are
needed?
3. When is a Safety Instrumented System
required?
4. Which architecture should be chosen?
34. Risk assessment
The measurement of risk
Quantitative scale:
• Minor – Injury to one person involving less than 3 days
absence from work
• Major – Injury to one person involving more than 3 days
absence from work
• Fatal consequences for one person
• Catastrophic – Multiple fatalities and injuries.
Qualitative scale:
• Unlikely
• Possible
• Occasionally
• Frequently
• Regularly
35. Alternatively
• One hazardous event occurring on the
average once every 10 years will have an
event frequency of 0.1 per year.
• A rate of 10^-4 events per year means that
an average interval of 10 000 years can
be expected between events.
36. Another alternative is to use a semi-quantitative
scale or band of frequencies to match up words
to frequencies. For example:
• Possible = Less than once in 30 years
• Occasionally = More than once in 30 years but less
than once in 3 years
• Frequently = More than once in 3 years
• Regularly = Several times per year.
Once we have these types of scales agreed, the
assessment of risk requires that for each hazard
we are able to estimate both the likelihood and the
consequence. For example:
• Risk item no. 1 – ‘Major’ injury likely to occur
‘Occasionally’
• Risk item no. 2 – ‘Minor’ injury likely to occur
‘Frequently’.
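The semi-quantitative bands above can be expressed as a small classifier. This is a sketch under stated assumptions: the source does not say how boundary values are treated or what "several times per year" means numerically, so the cut-off of more than once per year for "Regularly" and the handling of exact boundaries are assumptions made here. The function names are hypothetical.

```python
def mean_interval_years(events_per_year: float) -> float:
    """Average interval between events for a given event frequency."""
    if events_per_year <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / events_per_year


def frequency_band(events_per_year: float) -> str:
    """Map an event frequency to the word scale used above.

    Assumptions: boundary values fall into the higher band, and
    'several times per year' is taken as more than once per year.
    """
    if events_per_year < 0:
        raise ValueError("frequency must be non-negative")
    if events_per_year < 1 / 30:
        return "Possible"
    if events_per_year < 1 / 3:
        return "Occasionally"
    if events_per_year <= 1.0:
        return "Frequently"
    return "Regularly"


for rate in (1e-4, 0.1, 0.5, 4):
    print(rate, frequency_band(rate), f"{mean_interval_years(rate):g} years")
```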
42. Concepts of Alarp and tolerable risk
The Alarp (as low as reasonably practicable) principle recognizes
that there are three broad categories of risks:
• Negligible risk: Broadly accepted by most people as they go about
their everyday lives, these would include the risk of being struck by
lightning or of having brake failure in a car.
• Tolerable risk: We would rather not have the risk but it is tolerable in
view of the benefits obtained by accepting it. The cost in inconvenience
or in money is balanced against the scale of risk, and a compromise is
accepted.
• Unacceptable risk: The risk level is so high that we are not prepared to
tolerate it. The losses far outweigh any possible benefits in the situation.
44. Step 1
The estimated level of risk must first be
reduced to below the maximum level of
the Alarp region at all costs.
This assumes that the maximum acceptable
risk line has been set as the maximum
tolerable risk for the society or industry
concerned. This line is hard to find, as we
shall see in a moment.
45. Step 2
Further reduction of risk in the Alarp region requires cost
benefit analysis to see if it is justified. This step is a bit
easier and many companies define cost benefit formulae
to support cost justification decisions on risk-reduction
projects.
The principle is simple: ‘If the cost of the unwanted scenario
is more than the cost of improvement, the risk reduction
measure is justified’.
The tolerable risk region remains the problem for us. How
do we work out what is tolerable in
terms of harm to people, property and environment?
46. Establishing tolerable risk criteria
Examples are:
• Probable Loss of Life (PLL): Number of
fatalities × frequency of event
• Fatal accident rate (FAR): Number of
fatalities per 10^8 h worked at the site
where the hazard is present.
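Both criteria are simple ratios and can be sketched as follows. The function names and the example figures are hypothetical and serve only to show the arithmetic.

```python
def probable_loss_of_life(fatalities_per_event: float, events_per_year: float) -> float:
    """PLL = number of fatalities x frequency of the event (fatalities per year)."""
    return fatalities_per_event * events_per_year


def fatal_accident_rate(fatalities: float, exposed_hours: float) -> float:
    """FAR = fatalities per 10^8 hours worked where the hazard is present."""
    if exposed_hours <= 0:
        raise ValueError("exposed hours must be positive")
    return fatalities / exposed_hours * 1e8


# Hypothetical figures for illustration only.
print(f"PLL: {probable_loss_of_life(3, 1e-4):.1e} fatalities per year")
print(f"FAR: {fatal_accident_rate(2, 50_000_000):.1f} per 10^8 h")
```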
48. Tolerable risk conclusion
The indications are that many companies determine
tolerable risk targets using consensus from the types of
statistics we have been looking at. Marzal concluded that
the range of PLL values in industry is still a wide one,
from 10^-3 to 10^-6 for the upper level.
We must also remember to allow for the effect of
multiple hazard sources. It appears that financial cost
benefit analysis often justifies greater risk reduction
factors than the personal or environmental risk criteria.
We shall revisit this issue when we come to safety
integrity level (SIL) determination practices later in this
course.
49. Practical exercise
Now is a good time to try practical Exercise
No. 1, which is set out towards the back of
the manual in module 12. This exercise
demonstrates the calculation of individual
risk and FAR, and uses these parameters
to determine the minimum risk reduction
requirements.
50. Hazard analysis techniques
In the European Standard EN 1050 Annex B
there are descriptions of several techniques for
hazard analysis.
The notes there make an important distinction
between two basic approaches. These are
called deductive and inductive. This is how the
standard describes them:
‘In the deductive method the final event is
assumed and the events that could cause this
final event are then sought.’
51. Summary of hazard-identification
methods
Here is a summary of the hazard-identification methods.
It is useful to have this list because many companies will
have preferences for certain methods or will present
situations that require a particular approach. We need to
have a choice of tools for the job and to be aware of their
pros and cons. It is also apparent that similar methods
will have a variety of names.
All guides agree that Hazop provides the most
comprehensive and auditable method for identification of
hazards in process plants but that some types of
equipment will be better served by the alternatives
listed here.
52. Deductive method
A good example of a deductive method is Fault tree
analysis or FTA. The technique begins with a top event
that would normally be a hazardous event. Then all
combinations of individual failures or actions that can
lead to the event are mapped out in a fault tree. This
provides a valuable method of showing all possibilities in
one diagram and allows the probabilities of the event to
be estimated.
Deductive methods are useful for identifying hazards at
earlier stages of a design project where major hazards
such as fire or explosion can be tested for feasibility at
each section of plant. It’s like a cause and effect diagram
where you start with the effect and search for causes.
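The probability estimate from a fault tree can be sketched with two gate functions. This is a minimal illustration that assumes all basic events are statistically independent, which real studies must verify; the example tree, event names and probabilities are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must occur: multiply the probabilities (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Any input suffices: 1 minus the probability that none occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical top event: vessel overpressure.
# It needs an initiating cause (control loop failure OR operator error)
# AND a failure of the relief valve to open.
p_control_loop_fails = 0.1
p_operator_error = 0.05
p_relief_valve_fails = 0.01

p_initiating_cause = or_gate([p_control_loop_fails, p_operator_error])
p_top_event = and_gate([p_initiating_cause, p_relief_valve_fails])

print(f"Initiating cause: {p_initiating_cause:.3f}")
print(f"Top event:        {p_top_event:.5f}")
```

Working from the top event down to the gates mirrors the deductive approach: the effect is fixed first and the combinations of causes are then enumerated.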
53. Inductive method
So-called ‘what if’ methods are inductive
because the questions are formulated and
answered to evaluate the effects of component
failures or procedural errors on the operability
and safety of the plant or a machine. For
example, ‘What if the flow in the pipe stops?’
This category includes:
• Failure Mode and Effects Analysis or FMEA
• Hazop studies
• Machinery concept hazard analysis (MCHA).
60. Rating for Safety
The following expression defines the
relationship between Safety Availability
and PFD:
Safety Availability = 1 - PFD
It is often desirable to express the
SIL in terms of the hazard reduction
factor (HRF), which is defined as:
HRF = 1 / PFD
63. Linking Risks to SIL
To determine the application of a SIS for
an actual installation, the control engineer
should use a qualitative classification of
risk assessment.
A qualitative evaluation of safety integrity
level weighs the severity and likelihood of
the hazardous event. It also considers the
number of independent protection layers
addressing the same cause of a
hazardous event.
64. Safety Integrity Level (SIL)
During the 1990s the concept of safety-integrity
levels (known as SILs) evolved and is used in
the majority of documents in this area. The
concept is to divide the ‘spectrum’ of integrity
into a number of discrete levels (usually four)
and then to lay down requirements for each
level.
Clearly, the higher the SIL then the more
stringent become the requirements.
66. To further understand these important terms, let us ask a fundamental
question: how frequently will failures of either type of function
lead to accidents? The answer is different for the two types.
For functions with a low demand rate, the accident rate is a combination
of two parameters:
i) the frequency of demands, and
ii) the probability that the function fails on demand (PFD).
In this case, therefore, the appropriate measure of performance of the
function is PFD, or its reciprocal, the Risk Reduction Factor (RRF).
For functions which have a high demand rate or operate continuously,
the accident rate is the failure rate, λ, which is the
appropriate measure of performance. An alternative measure is mean
time to failure (MTTF) of the function. Provided failures are exponentially
distributed, MTTF is the reciprocal of λ.
These performance measures are, of course, related. At its simplest,
provided the function can be proof-tested at a frequency which is greater
than the demand rate, the relationship can be expressed as follows,
where T is the proof-test interval:
PFD = λT/2 = T/(2 × MTTF), or
RRF = 2/(λT) = (2 × MTTF)/T
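The simplified relationship between failure rate, proof-test interval and PFD can be worked through numerically. The sketch below assumes T is the proof-test interval in the same time units as 1/λ, and that λT is much smaller than 1, which is when the λT/2 approximation holds. The function names and the example failure rate are hypothetical.

```python
def pfd_from_failure_rate(failure_rate_per_hour: float, test_interval_hours: float) -> float:
    """Simplified average PFD = lambda * T / 2 (valid when lambda * T << 1)."""
    if failure_rate_per_hour < 0 or test_interval_hours <= 0:
        raise ValueError("failure rate must be >= 0 and test interval > 0")
    return failure_rate_per_hour * test_interval_hours / 2.0


def pfd_from_mttf(mttf_hours: float, test_interval_hours: float) -> float:
    """Equivalent form PFD = T / (2 * MTTF), using MTTF = 1 / lambda."""
    if mttf_hours <= 0 or test_interval_hours <= 0:
        raise ValueError("MTTF and test interval must be positive")
    return test_interval_hours / (2.0 * mttf_hours)


# Hypothetical example: lambda = 1e-6 per hour, proof test once a year.
hours_per_year = 8760
pfd = pfd_from_failure_rate(1e-6, hours_per_year)
print(f"PFD = {pfd:.5f}")
print(f"RRF = {1.0 / pfd:.1f}")
```

Halving the proof-test interval halves the PFD in this approximation, which is why test frequency appears among the risk reduction methods.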
67. Definitions of SILs for Low Demand Mode from BS
EN 61508
Definitions of SILs for High Demand / Continuous
Mode from BS EN 61508
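The two definitions can be read as lookup tables from a performance measure to a SIL. The sketch below is an illustration only: the band limits are the commonly published IEC 61508 figures, entered here as an assumption, and should be checked against the standard itself before any use. The function names are hypothetical.

```python
from typing import Optional

# (SIL, lower limit inclusive, upper limit exclusive)
# Low demand mode: average probability of failure on demand (PFDavg).
LOW_DEMAND_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]

# High demand / continuous mode: dangerous failures per hour.
HIGH_DEMAND_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def _lookup(value: float, bands) -> Optional[int]:
    """Return the SIL whose band contains value, or None if outside all bands."""
    for sil, lower, upper in bands:
        if lower <= value < upper:
            return sil
    return None


def sil_from_pfd(pfd_avg: float) -> Optional[int]:
    """SIL band for a low demand function, given its PFDavg."""
    return _lookup(pfd_avg, LOW_DEMAND_BANDS)


def sil_from_failure_rate(failures_per_hour: float) -> Optional[int]:
    """SIL band for a high demand or continuous function."""
    return _lookup(failures_per_hour, HIGH_DEMAND_BANDS)


print(sil_from_pfd(0.005), sil_from_failure_rate(5e-8))
```

Values outside every band return None rather than being forced into the nearest SIL; that is a choice made for this sketch.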
68. So what is the SIL achieved by the function? Clearly it is
not unique, but depends on the hazard and in particular
whether the demand rate for the hazard implies low or high
demand mode.
SIL is a measure of the SIS performance related only to the
devices that comprise the SIS. This measure is limited to
device integrity, architecture, testing, diagnostics, and
common mode faults inherent to the specific SIS design. It
is not explicitly related to a cause-and-effect matrix, but it is
related to the devices used to prevent a specific incident.
Further, SIL is not a property of a specific device. It is a
system property; input devices through logic solver to
output devices.
Finally, SIL is not a measure of incident frequency. It is
defined as the probability (of the SIS) to fail on demand
(PFD). A demand occurs whenever the process reaches
the trip condition and causes the SIS to take action.
69. The new ANSI/ISA S84.01 standard requires that a
target safety integrity level (SIL) be assigned for all safety
instrumented system (SIS) applications.
The assignment of the target SIL is a decision requiring the
extension of the process hazards analysis (PHA)
process to include the balance of risk likelihood and
severity with risk tolerance.
Since SIL 4 is rarely used, SIL 3 is typically the highest
specified safety level. Of the three commonly used
levels, SIL 3 has the greatest required safety availability (RSA),
and therefore the lowest average probability of failure on
demand (PFD). Required Safety Availability (RSA) is the
fraction of time that a safety system is able to perform its
designated safety function when the process is
operating.
70. A determination of the target safety
integrity level requires:
1. An identification of the hazards involved.
2. An assessment of the risk of each of the
identified hazards. In other words, how
bad is each hazard and how often is it
expected to occur?
3. An assessment of other Independent
Protection Layers (IPLs) that may be in
place.
71. Risk Level Factors Based On Frequency
Risk Level Factors Based On Severity
72. Safety Architectures
Several system architectures are applied in
process safety applications, ranging from
single-channel systems to triple-redundant
configurations. Control engineers must
match the architecture to the process
safety requirements, accounting
for failure in the safety system.
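The difference between single-channel and redundant configurations comes down to how many channels must demand a trip before the output acts. The sketch below shows generic M-out-of-N voting; the function name and the channel readings are hypothetical, and real voting logic also has to handle diagnosed channel faults, which is omitted here.

```python
def vote(channel_trips, required: int) -> bool:
    """Return True if at least `required` channels demand a trip (MooN voting)."""
    trips = [bool(t) for t in channel_trips]
    if not 1 <= required <= len(trips):
        raise ValueError("required must be between 1 and the number of channels")
    return sum(trips) >= required


# Single channel (1oo1): one sensor decides alone.
print(vote([True], required=1))

# Triple redundant (2oo3): one spurious or one failed channel is tolerated.
print(vote([True, False, True], required=2))
print(vote([True, False, False], required=2))
```

In a 2oo3 arrangement a single channel that fails to trip does not block the shutdown, and a single spurious trip does not cause one, which is why it is used where both safety and availability matter.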
73. One concern is that many safety systems in
operation, or under construction, do not follow
basic protection principles. Unsafe practices
include:
• Performing the safety shutdown within the basic
process control systems (BPCS) or distributed
control systems (DCS).
• Using conventional programmable logic
controllers (PLCs) in safety-critical applications.
(Safety PLCs are certified to meet safety-critical
applications to SIL 2 and SIL 3.)
• Implementing single-element (non-redundant)
microprocessor-based systems on critical
processes.
74. The conventional PLC architecture
provides only a single electrical path.
Sensors send process signals to the input modules.
The logic solver evaluates these inputs, determines
if a potentially hazardous condition exists, and
energizes or de-energizes the solid-state output.
(Fire and gas detection systems, for
example, use the “energized to trip” philosophy.)
Suppose the safety system de-energizes the output to
move the process to a safe state. Suppose also that one
of the components in the single path fails so that the
output cannot be de-energized. Then the conventional
PLC won’t provide its desired safety protection function.
75. A special class of programmable logic controllers,
called safety PLCs, represents an alternative.
Safety PLCs provide high reliability and high
safety via special electronics, special software,
pre-engineered redundancy, and independent
certification.
The safety PLC has input/output circuits designed
to be fail-safe, using built-in diagnostics. The
central processing unit (CPU) of a safety PLC
has built-in diagnostics for memory, CPU
operation, watchdog timer, and communication
systems.
76. • Accurately evaluating the safety level for a specific
control device in the context of a potential hazardous
event poses a major and difficult problem for many
control engineers. Associations and agencies
worldwide have made considerable progress toward
establishing standards and implementation guidelines
for safety instrumented systems. These standards
attempt to match the risk inherent in a given situation
to the required integrity level of the safety system.
• Unfortunately, many of these guidelines and
standards are not specific to a particular type of
process and deal only with a qualitative level of risk.
Control engineers must use considerable judgment in
evaluating risk and applying instrumentation that
properly balances established design procedures
against budget constraints.
77. Typical Applications
A fault-tolerant control system identifies and compensates
for failed control system elements and allows repair
while continuing its assigned task without process
interruption. A high-integrity control system is used in
critical process applications that require a significant
degree of safety and availability. Some typical
applications are:
1- Emergency Shutdown
2- Boiler Flame Safety
3- Turbine Control Systems
4- Offshore Fire and Gas Protection
78. 1- Emergency Shutdown
A safety instrumented system provides continuous protection for
safety-critical units in refineries, petrochemical/chemical plants and other
industrial processes. For example, in reactor and compressor units,
plant trip signals (for pressure, product feed rates, expander
pressure equalization and temperature) are monitored and
shutdown actions are taken if an upset condition occurs.
Traditional shutdown systems implemented with mechanical or
electronic relays provide shutdown protection but can also cause
dangerous nuisance trips. Safety instruments provide automatic
detection and verification of field sensor integrity, integrated
shutdown and control functionality, and direct connection to the
supervisory data highway for continuous monitoring of
safety-critical functions.
79. 2- Boiler Flame Safety
Process steam boilers function as a critical component in
most refinery applications. Protection of the boiler from
upset conditions, safety interlocks for normal startup and
shutdown, and flame-safety applications are combined in
one integrated safety instrumented system.
In traditional applications, these functions had to be
provided by separate, non-integrated components. But
with a fault-tolerant, fail-safe integrated controller, the
boiler operations staff can use a critical resource more
productively while maintaining safety at or above the
level of electromechanical protection systems.
80. 3- Turbine Control Systems
The control and protection of gas or steam turbines
requires high integrity as well as safety. The continuous
operation of the fault-tolerant integrated controller
provides the turbine operator with maximum availability
while maintaining equivalent levels of safety.
Speed control as well as start-up and shutdown sequencing
are implemented in a single integrated system.
Unscheduled outages are avoided by using hot spares
for the I/O modules. If a fault occurs in a module, a
replacement module is automatically activated without
operator intervention.
81. 4- Offshore Fire and Gas
Protection
The protection of offshore platforms from fire and gas
threats requires continuous availability as well as
reliability. The safety instrument system provides this
availability through online replacement of faulty modules;
field wiring and sensors are managed automatically by
built-in diagnostics.
Analog fire and gas detectors are connected directly to the
controller, eliminating the need for trip amps. An operator
interface monitors fire and gas systems as well as
diagnostics for the controller and its attached sensors.
Traditional fire and gas panels can be replaced with a
single integrated system, saving costly floor space while
maintaining high levels of safety and availability.