18. Successful companies say:
“Failure Happens”
“Embrace Failure”
“Design For Failure”
“Healthy attitude about Failure”
“Resilient (to Failure)”
THE OUTAGES WILL CONTINUE
UNTIL THE APPROACH
IMPROVES ;-)
19. GameDay
Slide Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
http://www.flickr.com/photos/dnorman/2678090600
20. define:
GameDay
An exercise designed to increase
Resilience through large-scale fault
injection across critical systems.
Part of a larger discipline called
Resilience Engineering.
Not new, just new to us ;-)
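A GameDay injects faults at large scale across real systems, but the core idea can be sketched in a few lines. The example below is a minimal, in-process illustration only: `FaultInjector`, `fetch_profile`, and `fetch_with_fallback` are hypothetical names, and the 50% failure rate is an arbitrary assumption. It wraps a dependency so that a fraction of calls fail, then checks that the caller degrades gracefully instead of crashing.

```python
import random


class FaultInjector:
    """Wraps a callable and makes a configurable fraction of calls fail."""

    def __init__(self, func, failure_rate, seed=0):
        self.func = func
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so a run can be repeated
        self.injected = 0

    def __call__(self, *args, **kwargs):
        if self.rng.random() < self.failure_rate:
            self.injected += 1
            raise ConnectionError("injected fault")
        return self.func(*args, **kwargs)


def fetch_profile(user_id):
    """Stand-in for a call to a critical dependency (hypothetical)."""
    return {"id": user_id, "source": "primary"}


def fetch_with_fallback(fetch, user_id):
    """Resilient caller: serves a degraded answer when the dependency fails."""
    try:
        return fetch(user_id)
    except ConnectionError:
        return {"id": user_id, "source": "cache"}


flaky_fetch = FaultInjector(fetch_profile, failure_rate=0.5, seed=42)
results = [fetch_with_fallback(flaky_fetch, i) for i in range(100)]
degraded = sum(1 for r in results if r["source"] == "cache")
print(f"{flaky_fetch.injected} faults injected, {degraded} served from fallback")
```

If the caller had no fallback, the same exercise would surface the gap on a day of your choosing.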
22. GameDay increases Resilience in 3 ways
Preparation
‣ Identification and mitigation of risks and impact from
failure
‣ Reduces frequency of failure (longer MTBF)
‣ Reduces duration of recovery (shorter MTTR)
Participation
‣ Builds confidence & competence responding to failure
and under stress.
‣ Strengthens individual and cultural ability to anticipate,
mitigate, respond to, and recover from failures of all
types.
Exercises
‣ Trigger and expose “latent defects”
‣ Choose when to discover them, instead of letting that be
determined by the next real disaster.
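To see why MTBF and MTTR both matter, a common rule of thumb is steady-state availability = MTBF / (MTBF + MTTR). The short sketch below works through it; the hour figures are hypothetical, chosen only to illustrate the arithmetic, and are not measurements from any real system.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, for illustration only.
before = availability(mtbf_hours=720, mttr_hours=4)   # fails monthly, 4 h to recover
after = availability(mtbf_hours=1440, mttr_hours=1)   # fails half as often, 1 h to recover

print(f"before: {before:.4%}")
print(f"after:  {after:.4%}")
```

Failing less often and recovering faster both raise the result, which is why preparation targets both numbers.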
33. Infrastructure as Code:
Enable the reconstruction of the
business from nothing but a
source code repository, an
application data backup, and
bare resources.
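One way to picture this is as a desired state kept in the repository plus an idempotent step that converges bare resources toward it. The sketch below is a toy model under stated assumptions, not how any particular tool works: `DESIRED`, `converge`, the service names, and the `restore_from` value are all hypothetical.

```python
# Hypothetical desired state, as it might be checked into the repository.
DESIRED = {
    "cache": {"package": "varnish"},
    "app": {"package": "jboss"},
    "db": {"package": "postgresql", "restore_from": "app-data-backup"},
}


def converge(current, desired):
    """Return (new_state, actions) that bring current in line with desired."""
    actions = []
    new_state = {name: dict(cfg) for name, cfg in current.items()}
    # Create or update anything that differs from the desired state.
    for name, cfg in desired.items():
        if new_state.get(name) != cfg:
            verb = "create" if name not in new_state else "update"
            actions.append(f"{verb} {name}")
            new_state[name] = dict(cfg)
    # Remove anything the repository no longer describes.
    for name in list(new_state):
        if name not in desired:
            actions.append(f"remove {name}")
            del new_state[name]
    return new_state, actions


bare = {}  # bare resources: nothing configured yet
state, first_run = converge(bare, DESIRED)
print(first_run)   # ['create cache', 'create app', 'create db']

state, second_run = converge(state, DESIRED)
print(second_run)  # [] (already converged, so nothing to do)
```

Because the second run does nothing, the same step can be used both for rebuilding from scratch and for routine drift correction.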
35. Golden Images are not the answer
• Gold is heavy & expensive
• Hard to transport
• Hard to mold
• Easy to lose configuration detail
http://www.flickr.com/photos/garysoup/2977173063/
36. When this
Varnish
JBoss App
Memcache
Postgres Slaves
Postgres Master
53. GameDay