Keynote Presentation by Damon Edwards at JAX DevOps London, April 4, 2017.
We all love the aspirational DevOps talks about companies achieving blistering speed and dazzling nimbleness. But improving your own organization’s performance, from where it is now to the level of the industry leaders, seems like a very long and difficult road. What is missing in most organizations? A repeatable system that empowers teams to find and fix their own problems.
This is a prescriptive talk about empowering and transforming organizations using a methodical, and totally reasonable, Kaizen (Continuous Improvement) approach. We’ll look at ways of combining known techniques like value stream mapping, Lean waste analysis, and improvement kata in order to fix organizational, process, and tooling problems. This talk isn’t about mythical silver bullets or vague philosophies. This talk is about taking a fresh look at proven Lean techniques that already work in high-performing IT organizations.
11. Why are so many organizations unable to improve?
1. The work isn’t visible
2. People are working out of context
3. Inertia is pulling your org out of alignment
26. Enterprises naturally trend towards silos
Planning | Dev / Test | Release | Operate
Handoff (!) | Handoff (!) | Handoff (!)
Application Knowledge
Operational Knowledge
Business Intent
Ownership but limited Accountability
Accountability but no Ownership
27. Why are so many organizations unable to improve?
1. The work isn’t visible
2. People are working out of context
3. Inertia is pulling your org out of alignment
28. The only way to fix a sufficiently complex system is
to create the conditions for the system to fix itself.
34. “I know the answer!…”
Finance: Too costly… outsource more!
Change Management: More discipline… tighter process and more approvals
Executive Committee: We need results… re-org until we do!
Engineers: Need better tools… new automation and a new network!!
39. The “Big Bang” Transformation Reality
Start
Goal
Fear
Panic
Abort
Maybe
People revert to legacy behaviors
Declare “Done”
40. Finance: Too costly… outsource more!
Change Management: More discipline… tighter process and more approvals
Executive Committee: Need Results… Re-Org!
Engineers: Need better tools… cool automation and a new network!!
42. “Little J’s” instead of “Big J”
“Big Bang”: Start, Fear, Panic, Abort, Maybe, Finish
Continuous Improvement: Start, Finish
51. You are going to have to…
Turn Continuous Improvement into an enterprise program
• Keep improvement efforts aligned
• Scale quickly
• Span multiple organizational boundaries
• Work with substantial numbers of legacy technologies
• Develop your existing staff en masse
• Be self-funding after initial seed investment
52. But how do you do that when…
1. The work isn’t visible
2. People are working out of context
3. Inertia is pulling your org out of alignment
53. You need a systemic way to teach an organization to
find and fix what is getting in its own way.
57. “Kaizen”
• Kaizen: Japanese word for improvement
• Modern business context:
  • Continuous improvement
  • Systematic, scientific-method approach
  • Total engagement of the workforce
  • Valuing small changes as much as large changes (outcome is what matters)
• Kaizen in a DevOps context:
  • Continuously improve the flow of work through the full value stream in order to improve customer outcomes
58. Proven Lean Techniques + DevOps Context = “DevOps Kaizen”
“If I have seen further, it is by standing on the shoulders of giants.”
- Sir Isaac Newton
61. Organization-wide focus on service delivery metrics
• Lead Time (Duration and Predictability)
• MTTD (Mean Time To Detect)
• MTTR (Mean Time to Repair, Mean Time to Fix)
• Quality at the Source (Scrap/Rework)
Ops / Dev
Customer Outcome
Business Goal
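The service delivery metrics named on this slide can be computed from plain timestamps. The sketch below is only an illustration: the records, the field order, and the choice to measure MTTR from detection to restoration and predictability as a standard deviation are all assumptions, not definitions taken from the talk.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def hours(delta: timedelta) -> float:
    """Convert a timedelta to hours."""
    return delta.total_seconds() / 3600


# Hypothetical change records: (work requested, running in production)
changes = [
    (datetime(2017, 1, 2, 9), datetime(2017, 1, 9, 9)),
    (datetime(2017, 1, 3, 9), datetime(2017, 1, 17, 9)),
    (datetime(2017, 1, 4, 9), datetime(2017, 1, 25, 9)),
]

# Hypothetical incidents: (fault began, fault detected, service restored)
incidents = [
    (datetime(2017, 2, 1, 5), datetime(2017, 2, 1, 9), datetime(2017, 2, 1, 13)),
    (datetime(2017, 2, 8, 5), datetime(2017, 2, 8, 7), datetime(2017, 2, 8, 9)),
]

# Lead Time: duration (mean) and predictability (spread; lower is more predictable)
lead_times_days = [hours(done - asked) / 24 for asked, done in changes]
lead_time_mean = mean(lead_times_days)
lead_time_spread = pstdev(lead_times_days)

# MTTD: mean time from the fault starting to someone detecting it
mttd_hours = mean(hours(detected - began) for began, detected, _ in incidents)

# MTTR: mean time from detection to restoration (one common convention)
mttr_hours = mean(hours(restored - detected) for _, detected, restored in incidents)

# Quality at the Source: share of delivered work sent back as scrap or rework
delivered, sent_back = 20, 5  # hypothetical counts
rework_rate = sent_back / delivered

print(f"Lead time: {lead_time_mean:.1f} days (spread {lead_time_spread:.1f})")
print(f"MTTD: {mttd_hours:.1f} h, MTTR: {mttr_hours:.1f} h, rework: {rework_rate:.0%}")
```

Tracking the spread alongside the mean matters because the slide asks for both duration and predictability of lead time.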
64. Retrospectives are a per value stream tool
Value Stream A
Value Stream B
Value Stream C
Key: “horizontal thinking”
68. DevOps Kaizen: Retrospective Technique
1. Map end-to-end process
Include key process metrics:
• Lead Time
• Processing Time
• Scrap Rate
• Head Count
Key: graphical facilitation above all else!
Note: “going to the gemba” requires making it visible together
Inspiration: value stream mapping
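Once a map carries the four process metrics listed above, a few simple roll-ups become possible. The sketch below is a minimal illustration: the `Step` type, the step names, and every number are hypothetical, and the processing-to-lead-time ratio (often called flow efficiency in Lean writing) is an added assumption rather than something shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    lead_time_days: float        # elapsed time from arrival to handoff
    processing_time_days: float  # time someone actually worked on the item
    scrap_rate: float            # fraction of output sent back for rework
    head_count: int              # people involved in the step


# Hypothetical steps and numbers, for illustration only
steps = [
    Step("Define requirements", 20, 5, 0.50, 4),
    Step("Develop and test", 30, 15, 0.25, 8),
    Step("Release", 10, 2, 0.10, 6),
]

total_lead = sum(s.lead_time_days for s in steps)
total_processing = sum(s.processing_time_days for s in steps)

# Share of elapsed time spent on actual work; the rest is mostly waiting
flow_efficiency = total_processing / total_lead

# Candidate places to look first during the retrospective
slowest = max(steps, key=lambda s: s.lead_time_days)
worst_quality = max(steps, key=lambda s: s.scrap_rate)

print(f"Total lead time: {total_lead} days, flow efficiency: {flow_efficiency:.0%}")
print(f"Slowest step: {slowest.name}; highest scrap rate: {worst_quality.name}")
```

The numbers only start the conversation; per the slide, the map still has to be drawn and read together by the people who do the work.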
71. DevOps Kaizen: Retrospective Technique
2. Identify wastes, inefficiencies, bottlenecks
Structured approach building on DevOps adaptation of “7 deadly wastes” from Lean / Agile:
PD - Partially Done
TS - Task Switching
W - Waiting
M - Motion / Manual
D - Defects
EP - Extra Process
EF - Extra Features
HB - Heroics
Key: focus on flow of value… not gripes
Inspiration: 7 Wastes of Software Development
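Tagging each step with these codes makes it easy to see which waste dominates a value stream. The sketch below is a hypothetical illustration: the waste codes come from the slide, while the `tally` helper, the step names, and the observations are assumptions made up for the example.

```python
from collections import Counter

# Waste codes as listed on the slide
WASTES = {
    "PD": "Partially Done",
    "TS": "Task Switching",
    "W": "Waiting",
    "M": "Motion / Manual",
    "D": "Defects",
    "EP": "Extra Process",
    "EF": "Extra Features",
    "HB": "Heroics",
}

# Hypothetical retrospective output: (step on the map, waste code)
observations = [
    ("Data setup", "W"),
    ("Data setup", "M"),
    ("Integration testing", "D"),
    ("Release", "EP"),
    ("Release", "M"),
    ("Release", "TS"),
]


def tally(observations):
    """Count waste codes, rejecting any code that is not in the taxonomy."""
    unknown = {code for _, code in observations if code not in WASTES}
    if unknown:
        raise ValueError(f"Unknown waste codes: {sorted(unknown)}")
    return Counter(code for _, code in observations)


counts = tally(observations)
most_common_code, most_common_count = counts.most_common(1)[0]
print(f"Most frequent waste: {WASTES[most_common_code]} ({most_common_count} tags)")
```

Restricting tags to a fixed vocabulary is one way to keep the discussion on flow of value rather than gripes.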
73. DevOps Kaizen: Retrospective Technique
3. Identify countermeasures
Countermeasures must be actionable, backlog ready. Focus on short-term “baby steps”. Note broader, strategic recommendations.
Key: “small j’s, not big j’s”
77. Value stream map (diagram text)
Business Trigger: Major Feature for Partner Markets
40 weeks total
Requirements / Planning: 17 weeks
Dev / Test: 16 weeks
• Development Sprint: 2 week c/t
• Data Setup: 6 weeks, H/C: 6
• Integration Testing: 3 weeks, H/C: 8
Staging Release: 4 weeks, H/C: 8
Prod Release: 3 weeks, H/C: 14
Steps, in order of appearance: Strategize Product Roadmap; Product Requirements Defined; Track Ticket Dependencies; Assign Requirements (more detail for their team); Architecture Review Board (ARB Queue); Development Sprint; Acquire / Prepare needed data; Data Setup; Deploy to Integration; Integration & Regression Testing (focused on service); Sprint Review; Release to Prod; Create CAB ticket; Push Deployment to Stage; Build VMs; End to end testing in Prod; Go-No Go decision meeting; "Remove Feature Flag" (By Cluster)
People and groups: Customers and Clients (Upcoming Market Event); Chief Strategy Officer; Weekly Exec Meeting (Ad-Hoc); P.M "John"; Program "Susan"; Team Leads and PMs; Chief Architect "Linda"; Working Group (Devs, PM, Engr, QA; Ops ? (sometimes)); DBA in TechOps; "Jennifer"; Test Data Configuration Manager; Dev, QA; Scrum Dev/QA; Product Owners (using own criteria); Scrum Team or Ops Team (if legacy); QA Lead, PMs, QAs; Team Leads
Tools and environments: Jira "Stories"; Confluence Pages, design docs; Existing Dev Environments; Development; Integ03; Test Link; Stage; Email Notification; Jira NewArch; Jira Ops; Service Now Legacy; Prod Env; Prd DB
Other labels: (if new arch)
Waste tags on the map: EP, D, TS, EP, M, D, PD, M, W, TS, TS, D, M, TS, PD, M, M, D
S/R values on the map: 90%, 55%, 32%, 43%, 71%
Countermeasures (markers 1. to 8. appear on the map; seven are labeled):
1. Test data setup automation
2. Unify change management tools
3. Dev to Prod for legacy
4. Write end-to-end customer func. tests
5. Resolve interface to legacy
6. Service Aligned Team Concept?
7. Tool
Themes:
• Service Verification test writing: shift left to Dev (test early)
• Remove Bottleneck and Environment Contention (test more)
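The phase durations on this map add up to the stated total, which is a quick sanity check worth doing on any value stream map. The sketch below assumes the pairing of durations to phases as read from the flattened slide text (17, 16, 4, and 3 weeks); treat that pairing as an interpretation, not a confirmed reading of the original graphic.

```python
# Phase durations in weeks, as read from the slide text (pairing is an assumption)
phases = {
    "Requirements / Planning": 17,
    "Dev / Test": 16,
    "Staging Release": 4,
    "Prod Release": 3,
}

total_weeks = sum(phases.values())

# Fraction of the end-to-end lead time spent in each phase
share = {name: weeks / total_weeks for name, weeks in phases.items()}

for name, weeks in phases.items():
    print(f"{name}: {weeks} weeks ({share[name]:.0%} of {total_weeks})")
```

Under this reading, a large part of the 40 weeks passes before any development sprint starts, which is the kind of observation the map is meant to surface.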
87. Value stream map from slide 77, annotated with next steps
Key: “What can we do next?” NOT “what is nirvana?”
+ Work in small batches
+ Early Ops involvement
+ Standardized catalog (with design standards built in)
+ Dev provide service verification tests
+ Ops provide environment verification tests (used by Dev and QA)
+ Self-service test data setup (including mainframe)
+ All deploys use Dev-provided service verification tests
+ Deployment rehearsals using service verification and environment verification tests in all lower environments
88. "MyAccount Perf Incident"
Customer
Call Contact
Center
!!!
Contact
Center
Call Service
Desk
Service
Desk
History:
12/20/15 - Started seeing problems
12/29/15 - Swarm of problems then everyone on vacation
9am Monday 1/4/16 - everyone back to work and problems spike
Try to
recreate
errors
Contact
Service Desk
Try to
recreate
errors
Check
Monitoring
BMC
"All green"
(up/down
monitoring)
TixTixTix
"stuff isn't
working!"
Many tickets
Many calls
Service
Now
Tickets
Start
Bridge
Callcall
someone
call
someone
else
Call
Vendor
Support
Start Biz
Management
Bridge
Assemble
War Room
"Terri"
Dir.
Account
Reps
Incident
Commander
NOC
2 hours
1/5/16
9 AM 11 AM
Web Dev
Fix
Test
Services
Fix
Test
Portal Ops
Fix
Test
DBA Ops
Fix
Test
Network Fix Test
Vendor
Partners
Fix
Test
Investigate
DDOS
(Security)
Test
Incident
Commander
ProblemSubsidesforDay
ProblemSpikes!!
~2pm ~5am
1/6/16
Repeat
War
Room
Cycles
ProblemSubsidesforDay
ProblemSpikes!!
~2pm ~5am
1/7/16
Send Biz
Communication
Emails
Update BRM
Bridge Call
Update Update Update Update
"Scribes"
Notes
.doc Chat
Repeat
War
Room
Cycles
"Focused on
Stored
Procedures
but not
confident
why"
Repeat
War
Room
Cycles
"Vendor
consultant
provides
config
changes"
"Stored
Procedure s
used in way
never seen
before!"
• Vendor Consultant onsite
for 3 days
• QA tried to simulate load
• Tried adding capacity (3
app servers and web)
• Tried disabling
M-F
Every Day!
1/13/16
UpdateUpdate Update
QA Run Load
Tests to
Confirm Fix
(After Hours)
Prod
1/13/16-1/15/16
Additional
Fix from
Vendor
Consultant
Permanent Fix Options
(assumptions):
• Upgrade to latest version
(~Q3 2016 go live?)
• Rewrite DB layer of App
to remove stored
procedures
PD
• No user experience monitoring in place
• No standard application performance tools available early and often
• Limited database diagnostic tools
• Lots of teams spending many cycles to "prove it wasn't them"
• Too much noise on bridge. Engineers can't focus.
• Open up side bridges because of difficulty of noise / task switching on main bridge
• Vendor reps joining calls aren't correct people
• Inability to roll back recent changes with any confidence ("probably impossible?"). "Fight forward" only.
• No service owner to manage biz communication and determine if resolved
• Experimenting directly in production
• Documentation not up to date
• Lots of people coming in and out.
• Can't trace user transactions through the stack
• No controlled "prod-like" stage env for troubleshooting
• Vendor provided "fix" is really just a workaround
• Resolution loop was never really closed. "Bandaid fix" is still in place!
Waste tags: PD, TS, H, M, HBM
1. Service performance monitoring, error detection, and automated health checks starting in Dev
2. Improved vendor SLA management
3. Improved code review
5. "Prod-Like" Pre-Prod environments (with load testing)
6. Platform App Dev leading new deployment strategy to improve traceability and ease rollback
7. Improved / Streamlined Communication in War Room / Bridge
8. Follow through on resolution
Also In-flight:
• Peer review for every check-in
• Automation tooling and SDLC for DB changes (no more ad hoc scripts)
War Room and Bridge Cost (direct labor):
M-F: 35 h/c per day for 7 days = $195,000
S,S: 6 h/c per day for 2 days = $11,400
Total: $206,400
Impact on Call Centers:
1500 Agents 30% degraded 8 hrs for 7 days = $806,000
Cost of impact/delay on other inflight projects:
Unknown ("feels large", estimated at 20% - 100%)
Cost of impact/delay on compliance issues:
Unknown
Cost of brand damage:
Unknown
TOTAL COST OF INCIDENT: $1,012,000 ++
every 4 hours
Request
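The incident cost roll-up on this slide can be reproduced with a few lines of arithmetic. The sketch below only adds up the dollar figures stated on the slide; the function names are hypothetical, and reading "1500 Agents 30% degraded 8 hrs for 7 days" as agents × 30% × 8 hours × 7 days is an assumption, not something the slide spells out.

```python
# Line items copied from the slide, in US dollars.
WAR_ROOM_WEEKDAYS = 195_000   # M-F: 35 h/c per day for 7 days
WAR_ROOM_WEEKENDS = 11_400    # S,S: 6 h/c per day for 2 days
CALL_CENTER_IMPACT = 806_000  # 1500 agents, 30% degraded, 8 hrs, 7 days


def war_room_total() -> int:
    """Direct labor cost of the war room and bridge calls."""
    return WAR_ROOM_WEEKDAYS + WAR_ROOM_WEEKENDS


def known_incident_cost() -> int:
    """Sum of the quantified line items; the 'Unknown' items are excluded."""
    return war_room_total() + CALL_CENTER_IMPACT


def degraded_agent_hours(agents: int, degraded_percent: int,
                         hours_per_day: int, days: int) -> float:
    """Assumed reading of the call center line: lost agent-hours."""
    return agents * degraded_percent * hours_per_day * days / 100


if __name__ == "__main__":
    print(f"War room and bridge: ${war_room_total():,}")
    print(f"Known incident cost: ${known_incident_cost():,}")
    print(f"Degraded agent-hours: {degraded_agent_hours(1500, 30, 8, 7):,.0f}")
```

The quantified items sum to $1,012,400, which the slide rounds to "$1,012,000 ++"; the "++" stands for the costs listed as Unknown.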
90. "MyAccount Perf Incident"
History:
12/20/15 - Started seeing problems
12/29/15 - Swarm of problems, then everyone on vacation
9am Monday 1/4/16 - Everyone back to work and problems spike
Detection and escalation:
• Customer calls Contact Center (!!!)
• Contact Center tries to recreate errors, then calls Service Desk
• Service Desk tries to recreate errors
• Check monitoring (BMC): "All green" (up/down monitoring)
• Many tickets, many calls: "stuff isn't working!" (Service Now tickets)
• Start bridge call: call someone, call someone else
• Call vendor support
• Start Biz Management Bridge
• Assemble War Room ("Terri", Dir., Account Reps, Incident Commander, NOC)
• 1/5/16, 9 AM to 11 AM (2 hours)
War Room:
• Fix / Test cycles by team: Web Dev, Services, Portal Ops, DBA Ops, Network, Vendor Partners
• Security: Investigate DDOS, Test
• Incident Commander
• 1/6/16: Problem subsides for day, problem spikes!! (~2pm, ~5am). Repeat War Room cycles.
• 1/7/16: Problem subsides for day, problem spikes!! (~2pm, ~5am). Repeat War Room cycles.
• "Focused on Stored Procedures but not confident why"
• "Vendor consultant provides config changes"
• "Stored Procedures used in way never seen before!"
Communication (M-F, every day!):
• Send Biz Communication Emails
• Update BRM Bridge Call (repeated updates)
• "Scribes" keep notes (.doc, chat)
Tried:
• Vendor Consultant onsite for 3 days
• QA tried to simulate load
• Tried adding capacity (3 app servers and web)
• Tried disabling
Resolution:
• 1/13/16: QA runs load tests to confirm fix (after hours)
• 1/13/16-1/15/16: Prod, additional fix from Vendor Consultant
Permanent Fix Options (assumptions):
• Upgrade to latest version (~Q3 2016 go live?)
• Rewrite DB layer of App to remove stored procedures
97. "MyAccount Perf Incident"
+ Automated service verification and performance health checks starting in Dev
+ Ops added to code peer reviews
+ Streamlined war room and bridge call communication
+ "Prod-like" testing environments (with load testing)
+ Automation and SDLC for database changes
+ Follow-through on resolution to achieve permanent fixes
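The first countermeasure contrasts with the "All green" up/down monitoring seen during the incident: a service can be up and still be too slow to use. As a minimal sketch of that idea, assuming hypothetical names (`classify`, `run_check`) and a latency threshold chosen by the caller, a performance health check might look like this:

```python
import time
from typing import Callable, Tuple


def classify(status_code: int, latency_s: float, max_latency_s: float) -> str:
    """Return 'down', 'degraded', or 'healthy' for one probe result."""
    if status_code != 200:
        return "down"
    if latency_s > max_latency_s:
        # Up/down monitoring would report this as green.
        return "degraded"
    return "healthy"


def run_check(request: Callable[[], int], max_latency_s: float) -> Tuple[str, float]:
    """Time one request (any callable returning an HTTP status) and classify it."""
    start = time.perf_counter()
    status_code = request()
    elapsed = time.perf_counter() - start
    return classify(status_code, elapsed, max_latency_s), elapsed


if __name__ == "__main__":
    # Stand-in for a real HTTP call; replace with a request to the service.
    state, elapsed = run_check(lambda: 200, max_latency_s=0.5)
    print(state, f"{elapsed:.4f}s")
```

Passing the request in as a callable keeps the same check usable in Dev, in a "prod-like" stage environment, and in production.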
98. Improvement Storyboards
Template:
Process Name
Challenge/Key Pain
Current Condition
Target Condition
Improvement Metrics
Work ToDo (Baby Steps)
Blockers
Example:
Process Name: GTM/LTM (traffic manager configuration process)
Challenge/Key Pain: Changes are being introduced / tested in production for the first time, causing delays, rework, and outages
Current Condition:
• Apps are not developed in production-like environments (not testing F5 behavior)
• Ops teams cannot practice or learn
• App teams have no visibility into constraints
• No remediation capabilities for app support teams
• No repeatable pattern for GM health activity
• 80% S/R with 2-3 rework cycles
• 50% cause outages
Target Condition:
• GTM/LTM functionality across all SDLC environments (capex request needed)
• Change window reduction for non-prod environments (turn those around instantaneously less than 13 days)
• Provide read-only access to all F5 consoles
• Standardize GM pattern
Improvement Metrics:
• Lead Time (post-dev to prod)
• Scrap Rate
Work ToDo (Baby Steps):
• Acquire the F5 hardware or software to support envs throughout SDLC
• Make these changes L3 or 5 change requests
• Write automation scripts
• Provide read-only access to all environments… can include API access to facilitate automation script writing
• Create design template with customer pattern
Blockers:
◦ Financial approval (Jennifer)
◦ Segregation between environments (Mark)
◦ Non-standard request types (Susan)
◦ Two network teams with different rules (Mark)
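The two improvement metrics on the example storyboard can be computed from simple change records. The following is a rough sketch, not the speaker's method: the `ChangeRecord` fields are hypothetical, and scrap rate is assumed here to mean the share of changes that needed at least one rework cycle.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import List


@dataclass
class ChangeRecord:
    dev_complete: date    # when the change left development
    in_production: date   # when it was live in production
    rework_cycles: int    # times the change was sent back


def lead_time_days(changes: List[ChangeRecord]) -> float:
    """Average lead time (post-dev to prod) in days."""
    return mean((c.in_production - c.dev_complete).days for c in changes)


def scrap_rate(changes: List[ChangeRecord]) -> float:
    """Fraction of changes with at least one rework cycle (assumed definition)."""
    return sum(1 for c in changes if c.rework_cycles > 0) / len(changes)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        ChangeRecord(date(2016, 1, 4), date(2016, 1, 14), 2),
        ChangeRecord(date(2016, 1, 4), date(2016, 1, 24), 3),
        ChangeRecord(date(2016, 2, 1), date(2016, 3, 2), 0),
        ChangeRecord(date(2016, 3, 1), date(2016, 3, 21), 1),
    ]
    print(lead_time_days(sample), scrap_rate(sample))
```

Tracking both numbers before and after each baby step shows whether a countermeasure actually moved the process toward the target condition.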
99. Improvement Storyboards
Inspiration: A3 management process
100. Using Storyboards: Part “Sales”, Part Coaching
Coach / Leader: asks questions; offers coaching (not solutions)
Learner: answers / explains; maintains the storyboard; supplies facts and next steps
Coaching questions:
• What is the target condition?
• What is the actual condition now?
• What obstacles do you think are stopping you from reaching target condition?
• What is your next step?
• When will we know what was learned from the next step?
Storyboard: Process Name, Challenge/Key Pain, Target Condition, Current Condition, Improvement Metrics, Work ToDo (Baby Steps), Blockers
Inspiration: Toyota Kata
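One way a coach could prepare for a session is to check which of the five coaching questions the storyboard can already answer. The sketch below is illustrative only: the mapping from questions to storyboard sections and the `next_review` field are assumptions introduced here, not part of the slide's template.

```python
# Hypothetical mapping: each coaching question paired with the storyboard
# section a learner would point to when answering it.
COACHING_QUESTIONS = [
    ("What is the target condition?", "target_condition"),
    ("What is the actual condition now?", "current_condition"),
    ("What obstacles do you think are stopping you from reaching target condition?",
     "blockers"),
    ("What is your next step?", "work_todo"),
    # "next_review" is an assumed extra field, not a section of the template.
    ("When will we know what was learned from the next step?", "next_review"),
]


def unanswered_questions(storyboard: dict) -> list[str]:
    """Return the questions whose storyboard section is missing or empty."""
    return [
        question
        for question, section in COACHING_QUESTIONS
        if not storyboard.get(section)
    ]


# Example storyboard with only two sections filled in.
draft = {
    "target_condition": ["Standardize GM pattern"],
    "current_condition": ["50% cause outages"],
    "blockers": [],
}
gaps = unanswered_questions(draft)
```

Here `gaps` holds the three questions the learner cannot yet answer from the board, which gives the coach a starting point without handing over solutions.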
103. Kaizen Program Oversight
1. The will to make change happen
2. The resources to make change happen
3. Drive follow-through / clear obstacles
This (and only this) is what the Kaizen
Program Oversight Group does!
Inspire Executives with:
108. DevOps Kaizen Program is an overlay for any delivery methodology
[Diagram: Improvement Program Oversight (Strategy) shown above Projects 1-3 and Sprints 1-8; Team A and Team B rows; markers for "Full Retrospective & Planning" and "Refresh Retrospective"]
109. DevOps Kaizen: Let’s Recap!
• Make the work visible
• Focus on Continuous Improvement
• Establish program elements
• Build into your operating model
[Diagram: Service Delivery Metrics, Kaizen Program Oversight, and Planning & Retrospectives, connected by arrows labeled "Informs" and "Countermeasures & Blockers"]
[Diagram: the overlay from slide 108, with Improvement Program Oversight (Strategy), Projects 1-3, Sprints 1-8, Teams A and B, Full Retrospective & Planning, Refresh Retrospective]
110. Join me tomorrow! 10:15 in Victoria Suite
Helping Ops Help You: Development’s Role in Enabling Self-Service Operations