SRE in Apiary
CZJUG 21.5.2018
Ladislav Prskavec
@abtris
1
What is SRE?
2
"What happens when a software engineer is tasked with what used to
be called operations."
» Ben Treynor Sloss, Vice President, Google Engineering,
founder of Google SRE
3
class SRE implements DevOps
— Google Cloud
4
Apiary in numbers
» Apiary users: 336,786
» Apiary API projects: 440,178
» Apiary engineers: 19
» Apiary platform engineers: 10 + 4
» Apiary SREs: 4
» App deploys: 15 (per week)
» Parsing service invocations [1 day]: ~200k
» CI build: ~19 min (8 parallel workers)
5
How we started with the SRE team
6
2014 - 2 people
software developer and ops guy
7
2015-2017 - 3 people
software developers
8
2018 - 4 people
2 seniors, 2 juniors
9
Culture
Process > Tools
10
No team separation
» bounded context, but ...
» Shared ownership of platform - shared responsibility
» Shared tooling (debug, deploy, monitor)
» Shared codebase
» Brainstorm
» Motivation for good design (monitoring, future debugging)
11
Things break
» They do - better be ready
» Knowing when there's a problem (logs, metrics, alerting)
» Having someone there - being oncall
» Responding (mitigation, resolution)
» Learning from it (postmortems)
12
Measure everything
» No gut feeling when we have the data (app metrics, runtime metrics)
» Both production and non-production systems (e.g. our CI test time)
» Thresholds, automated alerting
» Visualize the data (oncall dashboard, happiness dashboard)
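A minimal sketch of the "thresholds, automated alerting" idea, assuming a metric sampled as plain numbers. The metric name `ci.build_minutes` and the warn/page levels are hypothetical and not taken from Apiary's setup; treating missing data as a paging condition is likewise an assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Threshold:
    metric: str
    warn: float  # hypothetical warning level
    page: float  # hypothetical paging level


def evaluate(threshold: Threshold, samples: list[float]) -> str:
    """Return 'ok', 'warn', or 'page' for the mean of recent samples."""
    if not samples:
        return "page"  # assumption: missing data is itself alert-worthy
    value = mean(samples)
    if value >= threshold.page:
        return "page"
    if value >= threshold.warn:
        return "warn"
    return "ok"


# Non-production systems can be measured too, e.g. CI build time in minutes.
ci_build = Threshold(metric="ci.build_minutes", warn=25.0, page=40.0)
print(evaluate(ci_build, [18.5, 19.2, 19.8]))  # ok
print(evaluate(ci_build, [24.0, 27.0, 30.0]))  # warn
```

Averaging over a short window instead of alerting on a single sample is one simple way to keep a lone slow build from waking anyone up.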
13
14
15
Gradual changes
» Delivery vs deploy
» Continuous Integration / Continuous Delivery (CI/CD)
» Automated testing within CI
» Testing environments (similar to production)
» Short iterations, fast rollbacks
» No-downtime deploy & immutable
» Rolling delivery
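Rolling delivery with fast rollback can be sketched roughly as follows. This is an illustration only: the `deploy` and `healthy` callbacks, the instance names, and the version labels are hypothetical stand-ins, and a real immutable deploy would typically replace instances rather than update them in place.

```python
from typing import Callable


def rolling_deploy(
    instances: list[str],
    new_version: str,
    old_version: str,
    deploy: Callable[[str, str], None],
    healthy: Callable[[str], bool],
) -> bool:
    """Update instances one at a time; roll back everything touched on failure."""
    updated: list[str] = []
    for instance in instances:
        deploy(instance, new_version)
        updated.append(instance)
        if not healthy(instance):
            # Fast rollback: restore the previous version in reverse order.
            for touched in reversed(updated):
                deploy(touched, old_version)
            return False
    return True


# Usage with an in-memory stand-in for a real fleet.
fleet = {"web-1": "v1", "web-2": "v1", "web-3": "v1"}


def fake_deploy(instance: str, version: str) -> None:
    fleet[instance] = version


ok = rolling_deploy(list(fleet), "v2", "v1", fake_deploy, lambda name: True)
print(ok, fleet)
```

Because only one instance changes at a time, a bad release is caught after touching a small part of the fleet, which is what keeps the rollback fast.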
16
Tooling & automation
» oncall logistics
» schedules
» escalations
» alerting
» conflicts
» documentation
» runbooks
» internal processes
» domain dictionary
17
Reason 1: Decreasing chances of errors
» Source and great post: http://www.devops.ch/2017/05/10/devops-explained/
18
Reason 2: Eliminating toil, work that is:
» Repetitive
» Automatable
» Doesn't provide enduring value
» Scales linearly with service
» Compounds significantly and surprisingly
19
Reason 3: Focusing on creative engineering work that:
» Improves reliability
» Improves performance & stability of systems
» Ensures scalability
» Reduces toil
» Is fun: improves morale, speeds up progress, allows skill development
20
Incidents
Types:
» Low-priority incident
» High-priority incident
» Security incident
Both production and non-production systems
21
Being oncall
» Shared among developers (roles, not individuals; increases bus factor)
» Responsible for the platform
» Safety net - you know who to call
» Runbooks - you know what to do
» Early alerting - proactively investigate
22
Incident response
If critical: Incident commander role
Separate roles, if necessary:
» outbound and inbound communication
» root cause analysis
» issue mitigation
Tracking time (incident ack expiration) and keeping track
Tooling (alerts, paging, postmortem reminders)
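One way to read "incident ack expiration" is that an acknowledgement only silences paging for a limited time. The sketch below assumes that reading; the `Incident` fields and the 5-minute and 30-minute windows are hypothetical, not values from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    opened_at: datetime
    acked_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def needs_repage(
    incident: Incident,
    now: datetime,
    ack_timeout: timedelta = timedelta(minutes=5),
    ack_expiry: timedelta = timedelta(minutes=30),
) -> bool:
    """True if the incident should page again or escalate."""
    if incident.resolved_at is not None:
        return False
    if incident.acked_at is None:
        # Nobody has acknowledged yet: escalate after the timeout.
        return now - incident.opened_at >= ack_timeout
    # Acknowledged but still open: the acknowledgement expires after a while.
    return now - incident.acked_at >= ack_expiry


start = datetime(2018, 5, 21, 9, 0)
incident = Incident(opened_at=start, acked_at=start + timedelta(minutes=2))
print(needs_repage(incident, start + timedelta(minutes=10)))  # False
print(needs_repage(incident, start + timedelta(minutes=40)))  # True
```

Expiring the acknowledgement keeps an incident from being silently forgotten once the first responder has clicked "ack".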
23
Postmortems
» Root cause
» Lessons learned
» Actionable items
» Prevent future issues
» Create runbooks
» Blameless
» Generated reminders
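The "generated reminders" item could look something like the sketch below, which lists open postmortem action items that are due or overdue. The `ActionItem` shape, the owners, and the item titles are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    title: str
    owner: str
    due: date
    done: bool = False


def reminders(items: list[ActionItem], today: date) -> list[str]:
    """Return one reminder line per open action item that is due or overdue."""
    lines = []
    for item in sorted(items, key=lambda i: i.due):
        if item.done or item.due > today:
            continue
        overdue_days = (today - item.due).days
        lines.append(f"{item.owner}: '{item.title}' is {overdue_days} day(s) overdue")
    return lines


items = [
    ActionItem("Write runbook for queue backlog", "alice", date(2018, 5, 14)),
    ActionItem("Add alert on parser latency", "bob", date(2018, 5, 28)),
    ActionItem("Fix deploy script", "carol", date(2018, 5, 10), done=True),
]
for line in reminders(items, date(2018, 5, 21)):
    print(line)
```

The reminder names the item and its owner, not whoever was involved in the incident, which fits the blameless framing.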
24
Incident reviews
» Weekly, team lead sync
» Reviewing past incidents - types, occurrence, actionability
» Discuss improvements
» Incident fatigue prevention
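A weekly review of "types, occurrence, actionability" can start from a simple tally. The incident records, alert names, and the cut-off of three non-actionable firings below are all hypothetical.

```python
from collections import Counter

# Hypothetical incident log: (incident type, alert name, was it actionable?)
incidents = [
    ("low", "disk-usage", False),
    ("high", "api-5xx", True),
    ("low", "disk-usage", False),
    ("low", "disk-usage", False),
    ("security", "suspicious-login", True),
]

by_type = Counter(kind for kind, _, _ in incidents)
noisy = Counter(name for _, name, actionable in incidents if not actionable)

print(by_type.most_common())
# Alerts that fired repeatedly without being actionable are fatigue candidates.
fatigue_candidates = [name for name, count in noisy.items() if count >= 3]
print(fatigue_candidates)
```

Alerts that show up in `fatigue_candidates` are the ones worth retuning or removing before they train people to ignore pages.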
25
Summary
26
Summary
» Culture is more important than process!
» Start early and work on improvements!
» A product owner for SRE work is a useful role!
27
References
» SRE vs. DevOps: competing standards or close friends?
» SRE Weekly
» Awesome Site Reliability Engineering
Books
» SRE book
» Seeking SRE
28
