PuppetConf 2017: The Five Dirty Words of CI - J. Paul Reed, Release Engineering Approaches

The Five Dirty Words of CI
J. Paul Reed
Release Engineering Approaches
PuppetConf, 2017

“CI”
@jpaulreed #PuppetConf

J. Paul Reed
@jpaulreed on
Alum of The Ship Show
15+ Years in Build/Release Engineering
Now, a DevOps™ Consultant™
Master of Science candidate in Human Factors & System Safety

Root
Cause Analysis
Dirty Word #1

Root Cause Analysis
A method of problem solving used for
identifying the root causes of
faults or problems.
A factor is considered a root cause if
removal thereof from the problem-fault-sequence
prevents the final undesirable
event from recurring.
— Wikipedia

Our
Perception

Our Reality

“We found the Root Cause!”

“Root” “Cause”
Cause is something
you construct.
What you call “root cause”
is simply the place where you
stop looking any further.
— Sidney Dekker

A Better Choice: “Root Cause Analysis”
Proximate
Cause(s)
But…

The
“Five Whys”
Dirty Word #2

The “Five Whys”
Five Whys is an iterative interrogative
technique used to explore the
cause-and-effect relationships
underlying a particular problem.
The primary goal of the technique is to
determine the root cause of a defect or
problem by repeating the question "Why?"
— Wikipedia

What “Five Whys” Always Feels Like to Me

The Perception:
Incidents are
deterministic,
like code:
same inputs,
same outputs.
Every time.

A Crash Where?!
int foo(object& r)
{
    r.Blah();
    return 1;
}
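
One way to read the snippet above, as a minimal sketch rather than the speaker's own code: `foo` only ever sees a reference, so whether that reference is valid is decided upstream, at a call site `foo` cannot inspect. The slide does not define `object` or `Blah()`, so a stand-in type is assumed here, and `call_foo_checked` is a hypothetical caller added for illustration.

```cpp
#include <cassert>

// Stand-in for the slide's undefined `object` type (an assumption).
struct object {
    int calls = 0;
    void Blah() { ++calls; }
};

// The function from the slide, unchanged in behavior.
int foo(object& r)
{
    r.Blah();
    return 1;
}

// Hypothetical caller. If it wrote `foo(*maybe)` with a null pointer,
// the behavior would be undefined, and a crash would typically be
// reported inside foo() even though the decision was made here.
int call_foo_checked(object* maybe)
{
    if (maybe == nullptr) {
        return -1;  // refuse instead of dereferencing a null pointer
    }
    return foo(*maybe);
}
```

The line where a crash is reported and the place where the conditions for it were set up can be far apart, which is why the same function can look fine in one context and fail in another.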

The Operational Reality

A Better Choice: “Five Whys”
Just… no.
“Swiss Cheese” model
Systemic model

Human
Error
Dirty Word #3

Human Error
Human error has been cited as a primary cause
or contributing factor in disasters and accidents
in industries as diverse as nuclear power,
aviation, space exploration, and medicine.
Prevention of human error is generally seen as a
major contributor to reliability and safety of
(complex) systems.
— Wikipedia

But Really: What Is Human Error?
— James Reason’s conception

Who gets to draw “the
line?”
What incentives/interests
do they have in putting
that “line” where it is?
It ignores other stories or
even the possibility of
entertaining other
explanations…
Issues with
“Human” “Error”

“Human” “Error”
Human error is not the cause of
failure, but the effect.
So, human error… can never be the
conclusion of your investigation.
It is the starting point.
— Sidney Dekker

Human Error
often a prelude to
a constraint on
learning:
“Well, just fire the
dumb,
bad apples…
problem solved!”

Were the World So Simple…

A Tiny Problem in “The Cloud”

A Different Take on Failure

A Huge Opportunity to Learn
Other operational tools with no input sanity checks
The Service Health Dashboard’s real dependencies
Indexing Subsystem’s insufficient partitioning
Indexing Subsystem hadn’t been fully restarted for years
Had Amazon “just fired” this engineer,
they would have never learned these critical details
about their system or how to operate it
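
As an illustration of the "input sanity checks" item above, here is a minimal sketch of what such a check could look like in an operational tool. It is not Amazon's tooling: the `RemovalRequest` type, the `removal_is_safe` function, and the minimum-capacity rule are all assumptions made up for this example.

```cpp
#include <cassert>

// Hypothetical request to take servers out of service (illustrative only).
struct RemovalRequest {
    int servers_in_service;  // current capacity
    int servers_to_remove;   // what the operator typed
    int minimum_required;    // assumed floor the subsystem needs to keep running
};

// Returns true only if the request is plausible and leaves enough capacity.
bool removal_is_safe(const RemovalRequest& req)
{
    if (req.servers_to_remove <= 0) {
        return false;  // nothing to do, or a malformed value
    }
    if (req.servers_to_remove > req.servers_in_service) {
        return false;  // asks for more than exists
    }
    // Reject anything that would drop capacity below the assumed floor.
    return req.servers_in_service - req.servers_to_remove >= req.minimum_required;
}
```

A check like this makes the tool push back on a mistyped argument. That is a change to the system, and it only becomes visible if the investigation continues past the person who typed the command.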

A Better Choice: Human Error
Stop Saying It.
Then, Keep
Not-Saying It.

“Why didn’t you?”/
“You should
have…”
Dirty Word #4

Counterfactuals
Dirty Word #4

Counterfactuals
Counterfactual thinking is a concept in psychology that
involves the human tendency to create possible
alternatives to life events that have already occurred.…
Counterfactual thinking is, as it states, "counter to the
facts." These thoughts consist of the "What if?" and
the "If I had only..." that occur when thinking of how
things could have turned out differently.
— Wikipedia

A Waste of Time

Discussing a
reality that
does not exist.

A Better Choice: Counterfactuals
Don’t.

Best
Practice
Dirty Word #5

Best Practice
A best practice is a method or technique
that has been generally accepted as
superior to any alternatives because it
produces results that are superior to
those achieved by other means or
because it has become a standard way of
doing things.
— Wikipedia

Only the Best of Practices

Only the Best, Artisanal of Practices
“Best” is superlative
Best practices in complex systems often
ignore context
Best practices are often not completely
defined (especially in complex systems)

“Best practice,” by definition, leaves no
space for innovation / discovery
Relatedly (and maybe worst of all) it
discourages experimentation

Best Practicin’ Ourselves Outta Business

Best practice applies to a domain little of
our work exists in

Obvious: Rigid Constraints, Best Practice
Complicated: Governing Constraints, Good Practice
Complex: Enabling Constraints, Emergent Practice
Chaotic: Lack of Constraints, Novel Practice
Disorder

A Better Choice: Best Practice
“Good Practice”
Or ensure you apply
“best practice”
in the correct domain

“Yeeahhh... so what kind of…
Continuous Improvement...
would you... say ya do here?”

The Path of
Continuous Improvement
is not “linear”
(Nor is it “one-and-done”)
Takeaway I

Respect
Reality
Takeaway II

Treat People Like
the Professionals
They Are
Takeaway III

Go Forth and
Continuously Improve
J. Paul Reed
www.jpaulreed.com
@jpaulreed
www.release-approaches.com
Simply Ship. Every Time.

Broken
Build
Dirty Word #1

“Flappers”
Dirty Word #2

bobs-mac-mini.local
Dirty Word #3

Merge
Window
Dirty Word #4

Jenkins
Build Number
Dirty Word #5
