Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2EzfvM9.
Greg Burrell presents Netflix’s journey from siloed teams to the Full Cycle Developer model for building and operating its services. He discusses the approaches they tried, the motivations that pushed them to keep evolving, and the lessons learned along the way. Filmed at qconsf.com.
Greg Burrell works as a Senior SRE at Netflix and is a member of the Playback Reliability Team.
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/netflix-devops
Presented at QCon San Francisco
www.qconsf.com
5. GREG BURRELL
QCON SF 2018
Who is Greg Burrell?
● 13 years at Netflix
● 7 years streaming on-call
● Senior Reliability Engineer (SRE) in Edge Developer Productivity Team
@gburrell_greg
Lack of Context
● Developers & Testers didn’t know the production systems.
● DevOps & NOC/CORE didn’t know the apps.
● High communications overhead.
“Let’s find somebody who knows…”
Lengthy Troubleshooting and Fixing
● People moved cautiously due to lack of familiarity with applications, systems, and current state.
● Fixing production was a lot of back-and-forth over the phone.
“Let’s get everybody on the conference call and all talk at once.”
Lossy Feedback Cycle
● Developers stayed away from production unless something was on fire.
● Operations teams would band-aid over problems.
“This graph changed after the deployment. Can somebody take a look?”
Silos
● Coordination across multiple teams.
● Understaffed team = bottleneck.
“I’m not sure what’s going on with the release, I think we’re waiting on somebody...”
Staffing
● Isn’t this just squeezing more work out of developers?
● Teams must be staffed to manage deployments, production issues, and support requests.
Training
● Developers have to expand skill sets.
● Training needs dedicated focus and resources.
Commitment and Prioritization
● Managers must be willing to invest in staffing, training, and tools.
● Prioritize testing, operations automation, and support alongside feature development.
Not For Everyone
● Not for every team.
● Some developers just want to develop.
● Change is scary.
Increase in Breadth
● Additional cognitive load increases risk of burnout.
● More interruptions.
● Need to balance more priorities.
Improving on this Model
● Tooling! Tooling! Tooling!
● Metrics to measure each aspect of the software life cycle.
● Metrics to measure ourselves.