SlideShare ist ein Scribd-Unternehmen logo
1 von 36
Downloaden Sie, um offline zu lesen
Making Software. Better.
Simple solutions to big business problems.
Equal Experts is a network of talented, experienced, software
consultants, specialising in agile delivery.
Embracing
collaborative chaos
Running chaos days on large platforms
Lyndsay Prewer | @equalexperts
Photo by Darius Bashar on Unsplash
What is chaos engineering
and why should we care?
Look at what I built today!
Google Cloud Dataflow In the Smart Home Data Pipeline
Operating on the edge of chaos
http://bit.ly/2ZavoyP
http://bit.ly/2QVeWzA
“Two
normally-benign
misconfigurations,
and a specific
software bug,
combined to initiate
the outage”
Predicting failure
Google Cloud Dataflow In the Smart Home Data Pipeline
● How many component parts does
your system have?
● How are they connected?
● How reliable is each part?
● How reliable are the connections?
● What happens when X fails?
Addressing the risk of unexpected failure
A
B
A
B D
C
Z
E
G H
F
I
● Address risk by deliberate
inducing failure
● Observe, reflect and improve
● Build resilience in (like quality)
● Think about production (and
failure) all the time
Simples Hard
Chaos engineering approaches
Manual
In process
Automated
Unplanned
Manual chaos
● Chaos Days
● AWS Game Days
● Change specific chaos
● Chaos monkey / Simian Army
● AWS spot instances / GCP Preemptible VMs
● Randomised pod killer
Automated chaos
In process chaos
● Part of normal engineering process
● Focus for all roles in the team
● Production thinking / building resilience in
Product
Owner
Dev QA Dev Ops
Focus on: Quality AND Production AND Resilience
Define Build Explore Deploy
Unplanned chaos
● Every day is a school day
● Handle incidents well
● Learn from incidents - post incident
reviews
● AWS podcast: http://bit.ly/31oQfAf
A
B D
C
Z
E
G H
F
I
How does it help?
People
ProcessProduct
Knowledge
Behaviour
Expertise
Managing incidents
Learning from incidents
Engineering approach
Observability
Simplification
Alerting
Runbooks
Resilience
Photo by Darius Bashar on Unsplash
Running a Chaos Day
- when and how?
Our context
Legacy systems
X00 million
internal
requests
(busiest day)
X00 million
log messages
(busiest day)
x850
microservices
XXm Customers
60 Delivery teams
~1000 Microservices
Loren ipsum caveat empor
Loren ipsum caveat empor. Loren ipsum
caveat empor. Loren ipsum caveat empor
Loren ipsum caveat empor.
Loren ipsum caveat empor
Loren ipsum caveat empor. Loren ipsum
caveat empor. Loren ipsum caveat empor
Loren ipsum caveat empor.
Loren ipsum caveat empor
Loren ipsum caveat empor. Loren ipsum
caveat empor. Loren ipsum caveat empor
Loren ipsum caveat empor.
6 Platform teams
(AWS PaaS)
When were we ready for chaos?
2013 2014
Cloud
Docker
Scala
Mongo
ELK
Fast
growth
(teams,
services,
traffic)
When were we ready for chaos?
2013 2014 2015 2016
Cloud
Docker
Scala
Mongo
ELK
Fast
growth
(teams,
services,
traffic)
Multi
active WIP
Multi
active
When were we ready for chaos?
2013 2014 2015 2016 2017 2018
Cloud
Docker
Scala
Mongo
ELK
Fast
growth
(teams,
services,
traffic)
Multi
active WIP
Multi
active
More multi
active
(to AWS)
Self serve
deploys
AWS
Ready
for
Chaos
When are you ready for chaos?
Manual
In process
Automated
Unplanned
Photo by Darius Bashar on Unsplash
Who, where and exactly how?
Agents of chaos
● Virtual, closed team
● Draw from component
teams
● Experts / veterans
● Highest bus factor
Chaos scope - know thyself
● Know your architecture
● Know your steady state
● Know your constraints
○ What’s in your control?
○ What’s not?
○ What needs protecting?
Loren ipsum caveat empor
Loren ipsum caveat empor. Loren ipsum
caveat empor. Loren ipsum caveat empor
Loren ipsum caveat empor.
X00 million
internal
requests
(busiest day)
X00 million
log messages
(busiest day)
Chaos scope - trust the brains-storm
http://bit.ly/2XzR7Q9
Chaos scope - brainstorm, then plan the
detail
Team X Team Y Team Z
Chaos scope - hack in amongst the chaos
Team X Team Y Team Z
Deciding where
● Production or closest to it
● Production (like) load
● Production (like) telemetry
● Decide the blast radius
● Decide comm’s channel(s)
Production
Staging
QA
Development
Photo by Darius Bashar on Unsplash
Execution
Deciding when
● To warn or not
● What else is going on?
● It was just an ordinary day …
● Chaos cut-off
Keep calm and chaos on (agents)
● Co-locate the agents
● Collaborate and coordinate well
● Time-box, cover ground
● (Self) document well
Keep calm and chaos on (everyone else)
● Also (self) document well
● Pretend it’s Production on
● It was just an ordinary day ...
Photo by Darius Bashar on Unsplash
Retrospection
Divide and conquer, then regroup
● Major on engineering
improvements (people, process,
product)
● Minor on chaos day improvements
● Component teams retro’s /
incident reviews first
● Then team-of-teams retro
People
ProcessProduct
Team X
Team Y
Team Z
Team of
teams
What did we learn?
● Manage/limit the pain
● Start small
● Production is a tough step
● Production-like is also hard!
● Have fun!
Photo by Darius Bashar on Unsplash
What next?
What’s your next chaos step?
Manual
In process
Automated
Unplanned
● Where are you at in the journey?
● What’s the next (baby) step?
● Need any help?
Thank You
United Kingdom
+44 203 603 7830
helloUK@equalexperts.com
Equal Experts UK Ltd
30 Brock Street
London NW1 3FG
India
+91 20 6607 7763
helloIndia@equalexperts.com
Equal Experts India Private Ltd
Office No. 4-C
Cerebrum IT Park No. B3
Kumar City, Kalyani Nagar
Pune, 411006
Canada
+1 403 775 4861
helloCanada@equalexperts.com
Equal Experts Devices Inc
205 - 279 Midpark way S.E.
T2X 1M2
Calgary, Alberta
Portugal
+351 211 378 414
helloPortugal@equalexperts.com
Equal Experts Portugal
Avenida Dom João II, Nº35
Edificio Infante 11ºA
1990-083 Parque das Nações
Lisboa – Portugal
Thank You
USA
+1 866-943-9737
helloUSA@equalexperts.com
Equal Experts Inc
1460 Broadway
New York
NY 10036
 
LinkedIn
linkedin.com/company/equal-experts
Twitter
@EqualExperts
Web
www.equalexperts.com

Weitere ähnliche Inhalte

Was ist angesagt?

The Journey of devops and continuous delivery in a Large Financial Institution
The Journey of devops and continuous delivery in a Large Financial InstitutionThe Journey of devops and continuous delivery in a Large Financial Institution
The Journey of devops and continuous delivery in a Large Financial InstitutionKris Buytaert
 
Run stuff, Deploy Stuff, Jax London 2017 Edition
Run stuff, Deploy Stuff, Jax London 2017 EditionRun stuff, Deploy Stuff, Jax London 2017 Edition
Run stuff, Deploy Stuff, Jax London 2017 EditionKris Buytaert
 
Docker In Production Now: Seattle Docker Meetup March 2015
Docker In Production Now: Seattle Docker Meetup March 2015Docker In Production Now: Seattle Docker Meetup March 2015
Docker In Production Now: Seattle Docker Meetup March 2015Justin Clayton
 
Monitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code AgeMonitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code AgeKris Buytaert
 
Devops, the future is here, it's just not evenly distributed yet.
Devops, the future is here, it's just not evenly distributed yet.Devops, the future is here, it's just not evenly distributed yet.
Devops, the future is here, it's just not evenly distributed yet.Kris Buytaert
 
Introduction to DevOps
Introduction to DevOpsIntroduction to DevOps
Introduction to DevOpsJulien Pivotto
 
Devops, the future is here it's not evenly distributed yet
Devops, the future is here it's not evenly distributed yetDevops, the future is here it's not evenly distributed yet
Devops, the future is here it's not evenly distributed yetKris Buytaert
 
DevOps Picc12 Management Talk
DevOps Picc12 Management TalkDevOps Picc12 Management Talk
DevOps Picc12 Management TalkMichael Rembetsy
 
Sustainable development of an organization -- LKFR14
Sustainable development of an organization -- LKFR14Sustainable development of an organization -- LKFR14
Sustainable development of an organization -- LKFR14Lean Kanban France
 
From Waterfall to Agile: A ScrumMaster’s View
From Waterfall to Agile: A ScrumMaster’s ViewFrom Waterfall to Agile: A ScrumMaster’s View
From Waterfall to Agile: A ScrumMaster’s ViewTechWell
 
Continuous Delivery: The Dirty Details
Continuous Delivery: The Dirty DetailsContinuous Delivery: The Dirty Details
Continuous Delivery: The Dirty DetailsMike Brittain
 
Helping Ops Help You: Development’s Role in Enabling Self-Service Operations
Helping Ops Help You:  Development’s Role in Enabling Self-Service OperationsHelping Ops Help You:  Development’s Role in Enabling Self-Service Operations
Helping Ops Help You: Development’s Role in Enabling Self-Service OperationsRundeck
 
Learn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLife
Learn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLifeLearn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLife
Learn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLifeDocker, Inc.
 
DevOps: 6 Steps to Go Faster, Build Better and Avoid Disaster
DevOps: 6 Steps to Go Faster, Build Better and Avoid DisasterDevOps: 6 Steps to Go Faster, Build Better and Avoid Disaster
DevOps: 6 Steps to Go Faster, Build Better and Avoid DisasterSmartBear
 
Continuous Deployment
Continuous DeploymentContinuous Deployment
Continuous DeploymentTimothy Fitz
 

Was ist angesagt? (18)

DevOps Requires Agility
DevOps Requires AgilityDevOps Requires Agility
DevOps Requires Agility
 
The Journey of devops and continuous delivery in a Large Financial Institution
The Journey of devops and continuous delivery in a Large Financial InstitutionThe Journey of devops and continuous delivery in a Large Financial Institution
The Journey of devops and continuous delivery in a Large Financial Institution
 
Run stuff, Deploy Stuff, Jax London 2017 Edition
Run stuff, Deploy Stuff, Jax London 2017 EditionRun stuff, Deploy Stuff, Jax London 2017 Edition
Run stuff, Deploy Stuff, Jax London 2017 Edition
 
Docker In Production Now: Seattle Docker Meetup March 2015
Docker In Production Now: Seattle Docker Meetup March 2015Docker In Production Now: Seattle Docker Meetup March 2015
Docker In Production Now: Seattle Docker Meetup March 2015
 
SDLC & DevSecOps
SDLC & DevSecOpsSDLC & DevSecOps
SDLC & DevSecOps
 
Monitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code AgeMonitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code Age
 
Devops, the future is here, it's just not evenly distributed yet.
Devops, the future is here, it's just not evenly distributed yet.Devops, the future is here, it's just not evenly distributed yet.
Devops, the future is here, it's just not evenly distributed yet.
 
Introduction to DevOps
Introduction to DevOpsIntroduction to DevOps
Introduction to DevOps
 
Devops, the future is here it's not evenly distributed yet
Devops, the future is here it's not evenly distributed yetDevops, the future is here it's not evenly distributed yet
Devops, the future is here it's not evenly distributed yet
 
DevOps Picc12 Management Talk
DevOps Picc12 Management TalkDevOps Picc12 Management Talk
DevOps Picc12 Management Talk
 
Sustainable development of an organization -- LKFR14
Sustainable development of an organization -- LKFR14Sustainable development of an organization -- LKFR14
Sustainable development of an organization -- LKFR14
 
From Waterfall to Agile: A ScrumMaster’s View
From Waterfall to Agile: A ScrumMaster’s ViewFrom Waterfall to Agile: A ScrumMaster’s View
From Waterfall to Agile: A ScrumMaster’s View
 
Continuous Delivery: The Dirty Details
Continuous Delivery: The Dirty DetailsContinuous Delivery: The Dirty Details
Continuous Delivery: The Dirty Details
 
The devops laboratory - 1 year later
The devops laboratory - 1 year laterThe devops laboratory - 1 year later
The devops laboratory - 1 year later
 
Helping Ops Help You: Development’s Role in Enabling Self-Service Operations
Helping Ops Help You:  Development’s Role in Enabling Self-Service OperationsHelping Ops Help You:  Development’s Role in Enabling Self-Service Operations
Helping Ops Help You: Development’s Role in Enabling Self-Service Operations
 
Learn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLife
Learn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLifeLearn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLife
Learn Fast, Fail Fast, Deliver Fast: The MOD Squad Way at MetLife
 
DevOps: 6 Steps to Go Faster, Build Better and Avoid Disaster
DevOps: 6 Steps to Go Faster, Build Better and Avoid DisasterDevOps: 6 Steps to Go Faster, Build Better and Avoid Disaster
DevOps: 6 Steps to Go Faster, Build Better and Avoid Disaster
 
Continuous Deployment
Continuous DeploymentContinuous Deployment
Continuous Deployment
 

Ähnlich wie Embracing collaborative chaos

From devoops to devops
From devoops to devopsFrom devoops to devops
From devoops to devopsKris Buytaert
 
RSA Conference APJ 2019 DevSecOps Days Security Chaos Engineering
RSA Conference APJ 2019 DevSecOps Days Security Chaos EngineeringRSA Conference APJ 2019 DevSecOps Days Security Chaos Engineering
RSA Conference APJ 2019 DevSecOps Days Security Chaos EngineeringAaron Rinehart
 
OWASP AppSec Global 2019 Security & Chaos Engineering
OWASP AppSec Global 2019 Security & Chaos EngineeringOWASP AppSec Global 2019 Security & Chaos Engineering
OWASP AppSec Global 2019 Security & Chaos EngineeringAaron Rinehart
 
Dev secops opsec, devsec, devops ?
Dev secops opsec, devsec, devops ?Dev secops opsec, devsec, devops ?
Dev secops opsec, devsec, devops ?Kris Buytaert
 
How Product Managers Thrive in a DevOps World
How Product Managers Thrive in a DevOps WorldHow Product Managers Thrive in a DevOps World
How Product Managers Thrive in a DevOps WorldAtlassian
 
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...Burr Sutter
 
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Rundeck
 
AllTheTalks Security Chaos Engineering
AllTheTalks Security Chaos Engineering AllTheTalks Security Chaos Engineering
AllTheTalks Security Chaos Engineering Aaron Rinehart
 
Moby is killing your devops efforts
Moby is killing your devops effortsMoby is killing your devops efforts
Moby is killing your devops effortsKris Buytaert
 
Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE Rundeck
 
VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"
VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"
VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"Aaron Rinehart
 
Devops is dead, Long Live Devops
Devops is dead, Long Live DevopsDevops is dead, Long Live Devops
Devops is dead, Long Live DevopsKris Buytaert
 
Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE Rundeck
 
Chaos Engineering 101 by Russ Miles
Chaos Engineering 101 by Russ MilesChaos Engineering 101 by Russ Miles
Chaos Engineering 101 by Russ MilesRussell Miles
 
Making Software for the Software Makers: How Atlassian Teams use Jira Software
Making Software for the Software Makers: How Atlassian Teams use Jira SoftwareMaking Software for the Software Makers: How Atlassian Teams use Jira Software
Making Software for the Software Makers: How Atlassian Teams use Jira SoftwareAtlassian
 
Continuous Infrastructure First
Continuous Infrastructure FirstContinuous Infrastructure First
Continuous Infrastructure FirstKris Buytaert
 

Ähnlich wie Embracing collaborative chaos (20)

From devoops to devops
From devoops to devopsFrom devoops to devops
From devoops to devops
 
RSA Conference APJ 2019 DevSecOps Days Security Chaos Engineering
RSA Conference APJ 2019 DevSecOps Days Security Chaos EngineeringRSA Conference APJ 2019 DevSecOps Days Security Chaos Engineering
RSA Conference APJ 2019 DevSecOps Days Security Chaos Engineering
 
OWASP AppSec Global 2019 Security & Chaos Engineering
OWASP AppSec Global 2019 Security & Chaos EngineeringOWASP AppSec Global 2019 Security & Chaos Engineering
OWASP AppSec Global 2019 Security & Chaos Engineering
 
Dev secops opsec, devsec, devops ?
Dev secops opsec, devsec, devops ?Dev secops opsec, devsec, devops ?
Dev secops opsec, devsec, devops ?
 
How Product Managers Thrive in a DevOps World
How Product Managers Thrive in a DevOps WorldHow Product Managers Thrive in a DevOps World
How Product Managers Thrive in a DevOps World
 
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran...
 
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
 
Chaos is a ladder !
Chaos is a ladder !Chaos is a ladder !
Chaos is a ladder !
 
AllTheTalks Security Chaos Engineering
AllTheTalks Security Chaos Engineering AllTheTalks Security Chaos Engineering
AllTheTalks Security Chaos Engineering
 
What DevOps Isn't
What DevOps Isn'tWhat DevOps Isn't
What DevOps Isn't
 
Moby is killing your devops efforts
Moby is killing your devops effortsMoby is killing your devops efforts
Moby is killing your devops efforts
 
Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE
 
VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"
VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"
VMWare Tech Talk: "The Road from Rugged DevOps to Security Chaos Engineering"
 
Devops is dead, Long Live Devops
Devops is dead, Long Live DevopsDevops is dead, Long Live Devops
Devops is dead, Long Live Devops
 
Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE Incident Management in the Age of DevOps and SRE
Incident Management in the Age of DevOps and SRE
 
Chaos Engineering 101 by Russ Miles
Chaos Engineering 101 by Russ MilesChaos Engineering 101 by Russ Miles
Chaos Engineering 101 by Russ Miles
 
Continuous Delivery
Continuous DeliveryContinuous Delivery
Continuous Delivery
 
ROOTS2011 Continuous Delivery
ROOTS2011 Continuous DeliveryROOTS2011 Continuous Delivery
ROOTS2011 Continuous Delivery
 
Making Software for the Software Makers: How Atlassian Teams use Jira Software
Making Software for the Software Makers: How Atlassian Teams use Jira SoftwareMaking Software for the Software Makers: How Atlassian Teams use Jira Software
Making Software for the Software Makers: How Atlassian Teams use Jira Software
 
Continuous Infrastructure First
Continuous Infrastructure FirstContinuous Infrastructure First
Continuous Infrastructure First
 

Mehr von Equal Experts

TRUST Framework Talk 2023-03-10.pptx
TRUST Framework Talk 2023-03-10.pptxTRUST Framework Talk 2023-03-10.pptx
TRUST Framework Talk 2023-03-10.pptxEqual Experts
 
Will it matter if your child cannot code?
Will it matter if your child cannot code?Will it matter if your child cannot code?
Will it matter if your child cannot code?Equal Experts
 
Platform Security IRL: Busting Buzzwords & Building Better
Platform Security IRL:  Busting Buzzwords & Building BetterPlatform Security IRL:  Busting Buzzwords & Building Better
Platform Security IRL: Busting Buzzwords & Building BetterEqual Experts
 
Software development practices & Infrastructure as Code - how well do they wo...
Software development practices & Infrastructure as Code - how well do they wo...Software development practices & Infrastructure as Code - how well do they wo...
Software development practices & Infrastructure as Code - how well do they wo...Equal Experts
 
A Whole Team Approach to Quality in Continuous Delivery - Lisa Crispin
A Whole Team Approach to Quality in Continuous Delivery - Lisa CrispinA Whole Team Approach to Quality in Continuous Delivery - Lisa Crispin
A Whole Team Approach to Quality in Continuous Delivery - Lisa CrispinEqual Experts
 
Secure Continuous Delivery
Secure Continuous DeliverySecure Continuous Delivery
Secure Continuous DeliveryEqual Experts
 
Smoothing the continuous delivery path a tale of two architectures - expert...
Smoothing the continuous delivery path   a tale of two architectures - expert...Smoothing the continuous delivery path   a tale of two architectures - expert...
Smoothing the continuous delivery path a tale of two architectures - expert...Equal Experts
 
Design Systems: Designing out Waste, Designing in Consistency
Design Systems: Designing out Waste, Designing in ConsistencyDesign Systems: Designing out Waste, Designing in Consistency
Design Systems: Designing out Waste, Designing in ConsistencyEqual Experts
 
Growing Together - software development in the Developing world
Growing Together - software development in the Developing worldGrowing Together - software development in the Developing world
Growing Together - software development in the Developing worldEqual Experts
 
Infrastructure - a journey from datacentres to cloud
Infrastructure - a journey from datacentres to cloudInfrastructure - a journey from datacentres to cloud
Infrastructure - a journey from datacentres to cloudEqual Experts
 
Data Science In Action: Prenatal Screening for Down Syndrome
Data Science In Action: Prenatal Screening for Down SyndromeData Science In Action: Prenatal Screening for Down Syndrome
Data Science In Action: Prenatal Screening for Down SyndromeEqual Experts
 
The essentials of the IT industry or What I wish I was taught about at Univer...
The essentials of the IT industry or What I wish I was taught about at Univer...The essentials of the IT industry or What I wish I was taught about at Univer...
The essentials of the IT industry or What I wish I was taught about at Univer...Equal Experts
 
Secrets of an agile transformation
Secrets of an agile transformationSecrets of an agile transformation
Secrets of an agile transformationEqual Experts
 
Obstacles of Digital Transformation Evolution
Obstacles of Digital Transformation EvolutionObstacles of Digital Transformation Evolution
Obstacles of Digital Transformation EvolutionEqual Experts
 
Avoiding the security brick
Avoiding the security brickAvoiding the security brick
Avoiding the security brickEqual Experts
 
Organising for Continuous Delivery
Organising for Continuous DeliveryOrganising for Continuous Delivery
Organising for Continuous DeliveryEqual Experts
 
Cracking passwords via common topologies
Cracking passwords via common topologiesCracking passwords via common topologies
Cracking passwords via common topologiesEqual Experts
 
Inception Phases - Handling Complexity
Inception Phases - Handling ComplexityInception Phases - Handling Complexity
Inception Phases - Handling ComplexityEqual Experts
 
Smoothing the Continuous Delivery Path - A Tale of Two Teams
Smoothing the Continuous Delivery Path - A Tale of Two TeamsSmoothing the Continuous Delivery Path - A Tale of Two Teams
Smoothing the Continuous Delivery Path - A Tale of Two TeamsEqual Experts
 

Mehr von Equal Experts (20)

TRUST Framework Talk 2023-03-10.pptx
TRUST Framework Talk 2023-03-10.pptxTRUST Framework Talk 2023-03-10.pptx
TRUST Framework Talk 2023-03-10.pptx
 
Will it matter if your child cannot code?
Will it matter if your child cannot code?Will it matter if your child cannot code?
Will it matter if your child cannot code?
 
Platform Security IRL: Busting Buzzwords & Building Better
Platform Security IRL:  Busting Buzzwords & Building BetterPlatform Security IRL:  Busting Buzzwords & Building Better
Platform Security IRL: Busting Buzzwords & Building Better
 
Software development practices & Infrastructure as Code - how well do they wo...
Software development practices & Infrastructure as Code - how well do they wo...Software development practices & Infrastructure as Code - how well do they wo...
Software development practices & Infrastructure as Code - how well do they wo...
 
A Whole Team Approach to Quality in Continuous Delivery - Lisa Crispin
A Whole Team Approach to Quality in Continuous Delivery - Lisa CrispinA Whole Team Approach to Quality in Continuous Delivery - Lisa Crispin
A Whole Team Approach to Quality in Continuous Delivery - Lisa Crispin
 
Secure Continuous Delivery
Secure Continuous DeliverySecure Continuous Delivery
Secure Continuous Delivery
 
Smoothing the continuous delivery path a tale of two architectures - expert...
Smoothing the continuous delivery path   a tale of two architectures - expert...Smoothing the continuous delivery path   a tale of two architectures - expert...
Smoothing the continuous delivery path a tale of two architectures - expert...
 
Design Systems: Designing out Waste, Designing in Consistency
Design Systems: Designing out Waste, Designing in ConsistencyDesign Systems: Designing out Waste, Designing in Consistency
Design Systems: Designing out Waste, Designing in Consistency
 
Growing Together - software development in the Developing world
Growing Together - software development in the Developing worldGrowing Together - software development in the Developing world
Growing Together - software development in the Developing world
 
Infrastructure - a journey from datacentres to cloud
Infrastructure - a journey from datacentres to cloudInfrastructure - a journey from datacentres to cloud
Infrastructure - a journey from datacentres to cloud
 
Data Science In Action: Prenatal Screening for Down Syndrome
Data Science In Action: Prenatal Screening for Down SyndromeData Science In Action: Prenatal Screening for Down Syndrome
Data Science In Action: Prenatal Screening for Down Syndrome
 
The essentials of the IT industry or What I wish I was taught about at Univer...
The essentials of the IT industry or What I wish I was taught about at Univer...The essentials of the IT industry or What I wish I was taught about at Univer...
The essentials of the IT industry or What I wish I was taught about at Univer...
 
Secrets of an agile transformation
Secrets of an agile transformationSecrets of an agile transformation
Secrets of an agile transformation
 
Obstacles of Digital Transformation Evolution
Obstacles of Digital Transformation EvolutionObstacles of Digital Transformation Evolution
Obstacles of Digital Transformation Evolution
 
Avoiding the security brick
Avoiding the security brickAvoiding the security brick
Avoiding the security brick
 
Continuous Security
Continuous SecurityContinuous Security
Continuous Security
 
Organising for Continuous Delivery
Organising for Continuous DeliveryOrganising for Continuous Delivery
Organising for Continuous Delivery
 
Cracking passwords via common topologies
Cracking passwords via common topologiesCracking passwords via common topologies
Cracking passwords via common topologies
 
Inception Phases - Handling Complexity
Inception Phases - Handling ComplexityInception Phases - Handling Complexity
Inception Phases - Handling Complexity
 
Smoothing the Continuous Delivery Path - A Tale of Two Teams
Smoothing the Continuous Delivery Path - A Tale of Two TeamsSmoothing the Continuous Delivery Path - A Tale of Two Teams
Smoothing the Continuous Delivery Path - A Tale of Two Teams
 

Kürzlich hochgeladen

WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...masabamasaba
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxalwaysnagaraju26
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...masabamasaba
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastPapp Krisztián
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareJim McKeeth
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 

Kürzlich hochgeladen (20)

WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - Kanchana
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 

Embracing collaborative chaos

  • 1. Making Software. Better. Simple solutions to big business problems. Equal Experts is a network of talented, experienced, software consultants, specialising in agile delivery.
  • 2. Embracing collaborative chaos Running chaos days on large platforms Lyndsay Prewer | @equalexperts
  • 3. Photo by Darius Bashar on Unsplash What is chaos engineering and why should we care?
  • 4. Look at what I built today! Google Cloud Dataflow In the Smart Home Data Pipeline
  • 5. Operating on the edge of chaos http://bit.ly/2ZavoyP http://bit.ly/2QVeWzA “Two normally-benign misconfigurations, and a specific software bug, combined to initiate the outage”
  • 6. Predicting failure Google Cloud Dataflow In the Smart Home Data Pipeline ● How many component parts does your system have? ● How are they connected? ● How reliable is each part? ● How reliable are the connections? ● What happens when X fails?
  • 7. Addressing the risk of unexpected failure A B A B D C Z E G H F I ● Address risk by deliberate inducing failure ● Observe, reflect and improve ● Build resilience in (like quality) ● Think about production (and failure) all the time Simples Hard
  • 8. Chaos engineering approaches Manual In process Automated Unplanned
  • 9. Manual chaos ● Chaos Days ● AWS Game Days ● Change specific chaos
  • 10. ● Chaos monkey / Simian Army ● AWS spot instances / GCP Preemptible VMs ● Randomised pod killer Automated chaos
  • 11. In process chaos ● Part of normal engineering process ● Focus for all roles in the team ● Production thinking / building resilience in Product Owner Dev QA Dev Ops Focus on: Quality AND Production AND Resilience Define Build Explore Deploy
  • 12. Unplanned chaos ● Every day is a school day ● Handle incidents well ● Learn from incidents - post incident reviews ● AWS podcast: http://bit.ly/31oQfAf A B D C Z E G H F I
  • 13. How does it help? People ProcessProduct Knowledge Behaviour Expertise Managing incidents Learning from incidents Engineering approach Observability Simplification Alerting Runbooks Resilience
  • 14. Photo by Darius Bashar on Unsplash Running a Chaos Day - when and how?
  • 15. Our context Legacy systems X00 million internal requests (busiest day) X00 million log messages (busiest day) x850 microservices XXm Customers 60 Delivery teams ~1000 Microservices Loren ipsum caveat empor Loren ipsum caveat empor. Loren ipsum caveat empor. Loren ipsum caveat empor Loren ipsum caveat empor. Loren ipsum caveat empor Loren ipsum caveat empor. Loren ipsum caveat empor. Loren ipsum caveat empor Loren ipsum caveat empor. Loren ipsum caveat empor Loren ipsum caveat empor. Loren ipsum caveat empor. Loren ipsum caveat empor Loren ipsum caveat empor. 6 Platform teams (AWS PaaS)
  • 16. When were we ready for chaos? 2013 2014 Cloud Docker Scala Mongo ELK Fast growth (teams, services, traffic)
  • 17. When were we ready for chaos? 2013 2014 2015 2016 Cloud Docker Scala Mongo ELK Fast growth (teams, services, traffic) Multi active WIP Multi active
  • 18. When were we ready for chaos? 2013 2014 2015 2016 2017 2018 Cloud Docker Scala Mongo ELK Fast growth (teams, services, traffic) Multi active WIP Multi active More multi active (to AWS) Self serve deploys AWS Ready for Chaos
  • 19. When are you ready for chaos? Manual In process Automated Unplanned
  • 20. Photo by Darius Bashar on Unsplash Who, where and exactly how?
  • 21. Agents of chaos ● Virtual, closed team ● Draw from component teams ● Experts / veterans ● Highest bus factor
  • 22. Chaos scope - know thyself ● Know your architecture ● Know your steady state ● Know your constraints ○ What’s in your control? ○ What’s not? ○ What needs protecting? Loren ipsum caveat empor Loren ipsum caveat empor. Loren ipsum caveat empor. Loren ipsum caveat empor Loren ipsum caveat empor. X00 million internal requests (busiest day) X00 million log messages (busiest day)
  • 23. Chaos scope - trust the brains-storm http://bit.ly/2XzR7Q9
  • 24. Chaos scope - brainstorm, then plan the detail Team X Team Y Team Z
  • 25. Chaos scope - hack in amongst the chaos Team X Team Y Team Z
  • 26. Deciding where ● Production or closest to it ● Production (like) load ● Production (like) telemetry ● Decide the blast radius ● Decide comm’s channel(s) Production Staging QA Development
  • 27. Photo by Darius Bashar on Unsplash Execution
  • 28. Deciding when ● To warn or not ● What else is going on? ● It was just an ordinary day … ● Chaos cut-off
  • 29. Keep calm and chaos on (agents) ● Co-locate the agents ● Collaborate and coordinate well ● Time-box, cover ground ● (Self) document well
  • 30. Keep calm and chaos on (everyone else) ● Also (self) document well ● Pretend it’s Production on ● It was just an ordinary day ...
  • 31. Photo by Darius Bashar on Unsplash Retrospection
  • 32. Divide and conquer, then regroup ● Major on engineering improvements (people, process, product) ● Minor on chaos day improvements ● Component teams retro’s / incident reviews first ● Then team-of-teams retro People ProcessProduct Team X Team Y Team Z Team of teams
  • 33. What did we learn? ● Manage/limit the pain ● Start small ● Production is a tough step ● Production-like is also hard! ● Have fun!
  • 34. Photo by Darius Bashar on Unsplash What next?
  • 35. What’s your next chaos step? Manual In process Automated Unplanned ● Where are you at in the journey? ● What’s the next (baby) step? ● Need any help?
  • 36. Thank You United Kingdom +44 203 603 7830 helloUK@equalexperts.com Equal Experts UK Ltd 30 Brock Street London NW1 3FG India +91 20 6607 7763 helloIndia@equalexperts.com Equal Experts India Private Ltd Office No. 4-C Cerebrum IT Park No. B3 Kumar City, Kalyani Nagar Pune, 411006 Canada +1 403 775 4861 helloCanada@equalexperts.com Equal Experts Devices Inc 205 - 279 Midpark way S.E. T2X 1M2 Calgary, Alberta Portugal +351 211 378 414 helloPortugal@equalexperts.com Equal Experts Portugal Avenida Dom João II, Nº35 Edificio Infante 11ºA 1990-083 Parque das Nações Lisboa – Portugal Thank You USA +1 866-943-9737 helloUSA@equalexperts.com Equal Experts Inc 1460 Broadway New York NY 10036   LinkedIn linkedin.com/company/equal-experts Twitter @EqualExperts Web www.equalexperts.com