5. Apollo 1 Fire
• On January 27, 1967, the crew of Apollo 1
climbed into the crew module for a plugs-out
test, which was not expected to be
hazardous.
• The module was pressurized to 16 psia,
higher than ambient, and was 100%
oxygen, which the contractor recommended
against.
6. Apollo 1 Fire cont.
• The crew module had a number of known but
uncorrected flaws and the crew had expressed
concern about fire hazards.
• The astronauts had also lobbied successfully
for an outward-opening door, but that design
change was not incorporated here.
• Still, flawed or not, the hope was to
pass the test that day and launch
three weeks later, in February.
7. Apollo 1 Fire cont.
• At 6:31:07, *before the test had even started*,
the first cry of fire came from the cabin.
• For about 10 seconds, one could hear frantic
movements followed by Chaffee yelling,
“We’ve got a bad fire! Let’s get out! We’re
burning up! We’re on fire! Get us out of here!”
Then, a scream of pain and the end of the
transmission, seventeen seconds after the
first report of fire.
• The crew module ruptured from the pressure
and toxic black smoke poured from the
module.
8. Apollo 1 Fire cont.
• It took another eight minutes before they could
open the hatch, by which time the fire had gone
out. It took 7.5 hours to remove the crew's
remains, as they were fused in place by the
melted nylon of their suits. It was not a fun way
to die.
• In the end, a number of key factors were called
out as potential causes and contributors. The
high-pressure oxygen environment was very
dangerous from a flammability standpoint (“in
which a bar of aluminum can burn like wood”).
9. Apollo 1 Fire cont.
• There was a wealth of off-gassing flammable
nonmetallics like nylon and velcro. Wiring and
plumbing were substandard (note that 1407
wiring *design* problems were corrected after
Apollo 1), with a stripped and abraded wire
near a leaky coolant line (a potential
exothermic explosion). But just the static
electricity from their suits was found sufficient
to have started a fire in that atmosphere. We
were not short of smoking guns, and no single
cause was ever determined as *the* cause.
10. Apollo 1 Fire cont.
• We were reckless, we were sloppy, and
we thought that the success with Mercury
and Gemini at 100% oxygen made us
bulletproof. Astronauts Edward H. White II,
Virgil I. Grissom, and Roger B. Chaffee
paid the price.
11. Apollo 1 Fire cont.
• Nonmetallics are given careful consideration
before flight, requiring both toxicity and
flammability off-gassing tests (if not a previously
flown material).
• Even the simplest ground tests are done with
emergency personnel on site, with procedures
for rescuing test subjects practiced and in hand,
and with a thorough safety review before proceeding.
• We fly with an air mixture (except in the suits)
and wiring and materials are held to very high
standards. Materials used, particularly those “on” the
crewmembers, must be self-extinguishing.
14. Challenger
• The first time it blew up, it was such a
shock, because most people thought it
would never ever happen. But once you
get the idea that spacecraft sometimes
have catastrophic events, then it becomes
less of a shock.
15. • January 28, 1986, the
shuttle Challenger
explodes 73 seconds
into its launch, killing
all seven crew
members
• Investigation reveals
that a solid rocket
booster (SRB) joint
failed, allowing flames
to impinge on the
external fuel tank
16. Challenger…
• Liquid hydrogen tank explodes, ruptures
liquid oxygen tank
• Resulting massive explosion destroys
the shuttle
17. The Legacy of Challenger
• The Rogers Commission,
which investigated the incident,
determined:
– The SRB joint failed when jet flames burned
through both o-rings in the joint
– NASA had long known about recurrent
damage to o-rings
– Increasing levels of o-ring damage had been
tolerated over time
• Based upon the rationale that “nothing bad
has happened yet”
18. The Legacy… continued
• The Commission also determined that:
– SRB experts had expressed concerns about the
safety of the Challenger launch
– NASA’s culture prevented these concerns from
reaching top decision-makers
– Past successes had created an environment of
over-confidence within NASA
– Extreme pressures to maintain launch schedules
may have prompted flawed decision-making
• The Commission’s recommendations
addressed a number of organizational,
communications, and safety oversight issues
19. Columbia FEB 1, 2003 8:59 EST
Space shuttle Columbia,
re-entering Earth’s
atmosphere at 10,000
mph, disintegrates
– All 7 astronauts are killed
– $4 billion spacecraft is
destroyed
– Debris scattered over 2000
sq-miles of Texas
– NASA grounds shuttle fleet
for 2-1/2 years
20. Columbia- The Physical Cause
• Insulating foam separates
from external tank 81
seconds after lift-off
• Foam strikes underside of
left wing, breaches thermal
protection system (TPS) tiles
• Superheated air enters wing
during re-entry, melting
aluminum struts
• Aerodynamic stresses
destroy weakened wing
21. A Flawed Decision Process
• Foam strike detected in
launch videos on Day 2
• Engineers requested
inspection by crew or
remote photo imagery
to check for damage
• Mission managers
discounted foam strike
significance
• No actions were taken to
confirm shuttle integrity or
prepare contingency plans
22. Columbia- The Organizational Causes
• NASA had received painful
lessons about its culture from the
Challenger incident
• CAIB found disturbing parallels
remaining at the time of the
Columbia incident… these are
the topic of this presentation
“In our view, the NASA
organizational culture had as
much to do with this accident as
the foam.”
CAIB Report, Vol. 1, p. 97
23. Columbia Key Issues
• With little corroboration, management had become
convinced that a foam strike was not, and could not be,
a concern.
• Why were serious concerns about the integrity of the
shuttle, raised by experts within one day after the
launch, not acted upon in the two weeks prior to return?
• Why had NASA not learned from the lessons of
Challenger?
24. Key Organizational Culture Findings
– What NASA Did Not Do
1. Maintain Sense Of Vulnerability
2. Combat Normalization Of Deviance
3. Establish an Imperative for Safety
4. Perform Valid/Timely Hazard/Risk Assessments
5. Ensure Open and Frank Communications
6. Learn and Advance the Culture
25. Maintaining a Sense of
Vulnerability
“Let me assure you that, as of
yesterday afternoon, the Shuttle
was in excellent shape, … there
were no major debris system
problems identified….”
NASA official on Day 8
“The Shuttle has become a
mature and reliable system …
about as safe as today’s
technology will provide.”
NASA official in 1995
26. Maintaining a Sense of
Vulnerability
• NASA’s successes (Apollo program, et al.)
had created a “can do” attitude that
minimized the consideration of failure
• Near-misses were regarded as successes
of a robust system rather than near-failures
– No disasters had resulted from prior foam strikes,
so strikes were no longer a safety-of-flight issue
– Challenger parallel… failure of the primary o-ring
demonstrated the adequacy of the secondary o-ring
to seal the joint
27. Combating Normalization of
Deviance
• After 113 shuttle missions,
foam shedding, debris
impacts, and TPS tile
damage came to be
regarded as only a routine
maintenance concern
“…No debris shall emanate
from the critical zone of the
External Tank on the launch
pad or during ascent…”
Ground System Specification
Book – Shuttle Design
Requirements
28. Combating Normalization of
Deviance
• Each successful mission reinforced the perception that foam
shedding was unavoidable…either unlikely to jeopardize safety
or an acceptable risk
Foam shedding, which violated the shuttle
design basis, had been normalized
Challenger parallel… tolerance of damage to
the primary o-ring… led to tolerance of failure
of the primary o-ring… which led to the
tolerance of damage to the secondary o-
ring… which led to disaster
29. Establish An Imperative for Safety
• The shuttle safety organization, funded by
the programs it was to oversee, was not
positioned to provide independent safety
analysis
• The technical staff for both Challenger and
Columbia were put in the position of having
to prove that management’s intentions
were unsafe
– This reversed their normal role of having to
prove mission safety
“When I ask for the budget to be cut, I’m told
it’s going to impact safety on the Space
Shuttle … I think that’s a bunch of crap.”
30. Establish An Imperative for
Safety
As with Challenger, future
NASA funding required
meeting an ambitious launch
schedule
– Conditions/checks, once
“critical,” were now waived
– A significant foam strike on
a recent mission was not
resolved prior to
Columbia’s launch
– Priorities conflicted… and
production won over safety
Desktop screensaver at NASA: International
Space Station deadline 19 Feb 04
31. Perform Valid/Timely
Hazard/Risk Assessments
• NASA lacked consistent, structured
approaches for identifying hazards and
assessing risks
• Many analyses were subjective, and
many action items from studies were not
addressed
• In lieu of proper risk assessments, many
identified concerns were simply labeled
as “acceptable”
• Invalid computer modeling of the foam
“Any more activity today on the tile damage or are people just
relegated to crossing their fingers and hoping for the best?”
Email Exchange at NASA
“… hazard analysis processes are applied inconsistently across
systems, subsystems, assemblies, and components.”
CAIB Report, Vol. 1, p. 188
32. Ensure Open and Frank Communications
• Management adopted a uniform mindset
that foam strikes were not a concern and
was not open to contrary opinions.
• The organizational culture
– Did not encourage “bad news”
– Encouraged 100% consensus
– Emphasized only “chain of command” communications
– Allowed rank and status to trump expertise
I must emphasize (again) that severe enough
damage… could present potentially grave
hazards… Remember the NASA safety posters
everywhere around stating, “If it’s not safe, say
so”? Yes, it’s that serious.
Memo that was composed but never sent
33. Ensure Open and Frank Communications
• Lateral communications between some
NASA sites were also dysfunctional
– Technical experts conducted considerable
analysis of the situation, sharing opinions
within their own groups, but this information
was not shared between organizations
within NASA
– A similar point was addressed by the
Rogers Commission on the Challenger
34. Learn and Advance the
Culture
• CAIB determined that NASA had not
learned from the lessons of Challenger
• Communications problems still existed
– Experts with divergent opinions still had difficulty getting heard
• Normalization of deviance was still
occurring
• Schedules often still dominated over
safety concerns
• Hazard/risk assessments were still
35. … An Epilog
• Shuttle Discovery was
launched on 7/26/05
• NASA had formed an
independent Return To
Flight (RTF) panel to
monitor its preparations
• 7 of the 26 RTF panel
members issued a
minority report prior to
the launch
36. … An Epilog
• During launch, a large piece of foam
separated from the external fuel tank, but
fortunately did not strike the shuttle, which
landed safely 14 days later
• The shuttle fleet was once again
grounded, pending resolution of the
problem with the external fuel tank
insulating foam
37. …NOT Ensuring Open and
Frank Communications
• The bearer of “bad news” is viewed as
“not a team player”
• Safety-related questioning is “rewarded” by
requiring the suggester to prove he / she
is correct
• Communications get altered, with the
message softened, as they move up or
down the management chain
• Safety-critical information is not moving
laterally between work groups
38. …NOT Learning and Advancing
the Culture
• Recurrent problems are not investigated,
trended, and resolved
• Investigations reveal the same causes
recurring time and again
• Staff expresses concerns that standards of
performance are eroding
• Concepts, once regarded as
organizational values, are now subject to
expedient reconsideration
39. “Engineering By View Graph”
• The CAIB faulted shuttle project staff
for trying to summarize too much
important information on too few
PowerPoint slides
• We risk the same criticism here
• This presentation introduces the
concept of organizational
effectiveness and safety culture, as
exemplified by the case studies
presented
“When engineering analyses and risk assessments are
condensed to fit on a standard form or overhead slide,
information is inevitably lost… the priority assigned to
information can be easily misrepresented by its placement on a
chart and the language that is used.”