"Blameless postmortems" and "learning from failure" are very en vogue in the technology industry right now. Both fall into that less-discussed category of "CI": Continuous Improvement. But for as much as we all talk about them, in many organizations and teams, the outcome of continual organizational learning and improvement remains elusive. Why is this? In this talk, we'll look at five "dirty words"* that are often thrown around during postmortems, retrospectives, and other learning exercises that not only make it difficult for teams to discuss learning, but promote activities and behaviors that are actually counterproductive to continuous improvement. We'll dig into the existing research on why this is — it turns out we're not the only industry struggling with this! — and look at some different language we can start using that can more ably facilitate sustainable Continuous Improvement in our work environments. *Not actually dirty words.
5. J. Paul Reed
@jpaulreed
Alum of The Ship Show
15+ Years in Build/Release Engineering
Now, a DevOps™ Consultant™
Master of Science candidate in Human Factors & System Safety
@jpaulreed #PuppetConf
8. Root Cause Analysis
A method of problem solving used for identifying the root causes of faults or problems.
A factor is considered a root cause if removal thereof from the problem-fault-sequence prevents the final undesirable event from recurring.
— Wikipedia
12. “Root” “Cause”
Cause is something you construct.
What you call “root cause” is simply the place where you stop looking any further.
— Sidney Dekker
13. A Better Choice: “Root Cause Analysis”
Proximate Cause(s)
But…
15. The “Five Whys”
Five Whys is an iterative interrogative technique used to explore the cause-and-effect relationships underlying a particular problem.
The primary goal of the technique is to determine the root cause of a defect or problem by repeating the question "Why?"
— Wikipedia
22. Human Error
Human error has been cited as a primary cause or contributing factor in disasters and accidents in industries as diverse as nuclear power, aviation, space exploration, and medicine.
Prevention of human error is generally seen as a major contributor to reliability and safety of (complex) systems.
— Wikipedia
23. But Really: What Is Human Error?
— James Reason’s conception
24. Issues with “Human” “Error”
Who gets to draw “the line”?
What incentives/interests do they have in putting that “line” where it is?
It ignores other stories or even the possibility of entertaining other explanations…
25. “Human” “Error”
Human error is not the cause of failure, but the effect.
So, human error… can never be the conclusion of your investigation.
It is the starting point.
— Sidney Dekker
26. Human Error
often a prelude to a constraint on learning:
“Well, just fire the dumb, bad apples… problem solved!”
30. A Huge Opportunity to Learn
Other operational tools with no input sanity checks
The Service Health Dashboard’s real dependencies
Indexing Subsystem’s insufficient partitioning
Indexing Subsystem hadn’t been fully restarted for years
Had Amazon “just fired” this engineer, they would never have learned these critical details about their system or how to operate it.
31. A Better Choice: Human Error
Stop Saying It.
Then, Keep Not-Saying It.
34. Counterfactuals
Counterfactual thinking is a concept in psychology that involves the human tendency to create possible alternatives to life events that have already occurred. …
Counterfactual thinking is, as it states, "counter to the facts." These thoughts consist of the "What if?" and the "If I had only..." that occur when thinking of how things could have turned out differently.
— Wikipedia
40. Best Practice
A best practice is a method or technique that has been generally accepted as superior to any alternatives because it produces results that are superior to those achieved by other means or because it has become a standard way of doing things.
— Wikipedia
42. Only the Best, Artisanal of Practices
“Best” is superlative
Best practices in complex systems often ignore context
Best practices are often not completely defined (especially in complex systems)
45. Only the Best, Artisanal of Practices
“Best practice,” by definition, leaves no space for innovation / discovery
Relatedly (and maybe worst of all) it discourages experimentation
Best practice applies to a domain little of our work exists in
55. Go Forth and Continuously Improve
J. Paul Reed
www.jpaulreed.com
@jpaulreed
www.release-approaches.com
Simply Ship. Every Time.