Twilio Signal 2016 Chaos Patterns

Lessons about failing well and failing often
Seek progress over perfection

Published in: Software

  1. CHAOS PATTERNS: LESSONS ABOUT FAILING WELL AND FAILING OFTEN. BRUCE M. WONG | @BRUCE_M_WONG
  2. FAILURE HAPPENS
  3. “EVERYTHING FAILS ALL THE TIME” -WERNER VOGELS, CTO, AMAZON WEB SERVICES HTTP://THENEXTWEB.COM/2008/04/04/WERNER-VOGELS-EVERYTHING-FAILS-ALL-THE-TIME/
  4. THE ORIGINAL CHAOS MONKEY CREATED BY NETFLIX CLOUD ARCHITECT, GREG ORZELL - @CHAOSSIMIA, 2010 HTTPS://WWW.LINKEDIN.COM/IN/GORZELL
  5. A STATE OF XEN: AWS EC2 REBOOT, 2014
  6. 22 COMPLETE NODE FAILURE, 2700+ C* NODES, 218 REBOOTS, 0 DOWNTIME HTTP://XENBITS.XEN.ORG/XSA/ADVISORY-108.HTML HTTP://TECHBLOG.NETFLIX.COM/2014/10/A-STATE-OF-XEN-CHAOS-MONKEY-CASSANDRA.HTML HTTP://AWS.AMAZON.COM/BLOGS/AWS/EC2-MAINTENANCE-UPDATE/
  7. LESSON #1: TRUST YOUR RESILIENCE
  8. SLOW IS HARD
  9. SLOW IS HARD
  10. UNBOUND QUEUES - ELASTIC ISN’T INFINITE
  11. UNBOUND QUEUES - ELASTIC ISN’T INFINITE
  12. SLOW IS HARD
  13. LATENCY MONKEY
  14. SLOW IS HARD
  15. LATENCY TESTING 2.0 - FIT HTTP://TECHBLOG.NETFLIX.COM/2014/10/FIT-FAILURE-INJECTION-TESTING.HTML
  16. SLOW IS HARD
  17. SLOW IS HARD
      START SLOW
      • ACCOUNT LEVEL
      • +10MS BEFORE +100MS
      • +1% ERRORS BEFORE +80% ERRORS
      DIAL IT UP
      • A -> D NOT * -> D
  18. LESSON #2: FIXING ONE FAILURE MODE EXPOSES NEW ONES
  19. WHAT'S SO SPECIAL ABOUT CHAOS: CHAOS IS A CHOICE
  20. WHAT'S SO SPECIAL ABOUT CHAOS: OUTAGES VS CHAOS
  21. OUTAGES VS CHAOS
      Outages:
      • Uncontrolled
      • Unpredictable
      • Time to Detect: Minutes
      • Time to Resolve: ????
      • Analysis Time: ????
      Chaos:
      • Controlled
      • Scheduled
      • 0 Time to Detect
      • Time to Resolve: seconds*
      • Root Cause Analysis: Intentional
  22. MYTH OF RESILIENCE. NATION’S BUSINESS, 1977
  23. LATENCY MONKEY
  24. LESSON #3: THE CULTURE ASPECTS OF CHAOS ARE HARD
  25. MOST ENTERPRISES HIRE PEOPLE TO FIX THINGS. NETFLIX HIRES PEOPLE TO BREAK THINGS…. …WE SHOULD EMBRACE NETFLIX'S CULTURE OF "CHAOS ENGINEERING" THROUGHOUT ORGANIZATIONS OF ALL SHAPES AND SIZES.
  26. (no slide text)
  27. SEEK PROGRESS OVER PERFECTION. TWILIO LEADERSHIP PRINCIPLE
  28. GAME DAYS - BENEFITS
      • Training New Engineers
      • Discover Instrumentation gaps
      • New Product Launches
      • Incident Management Practices
  29. GAME DAYS - THE SETUP
      • Two “on-call” teams
      • Separate rooms, separate slack channels
      • Master of Disaster
      • Incident Commander
  30. LEVERAGE EXISTING TESTBOTS
      • Functionally test fallback code
      • Early warning!
      • Existing Integrations with Telemetry, PagerDuty, Slack
      • Incorporate into Canary process
      FUTURE
  31. RECAP
      • Lesson #1: Trust your resilience
      • Lesson #2: Fixing one failure mode exposes new ones
      • Lesson #3: The culture aspects of Chaos are HARD
      • Get started today!
      • Game Days are your friend - do them early and often
      • Testbots + focus on developer productivity
  32. WHEN YOU WISH UPON A BLUE MOON
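
The "start slow, then dial it up" advice on slide 17 can be sketched in Python. This is a minimal illustration under stated assumptions, not the tooling used at Netflix or Twilio: the names `ChaosStage`, `ChaosInjector`, `InjectedFailure`, and `RAMP` are hypothetical, and the ramp values are simply the figures quoted on the slide (+10 ms before +100 ms, +1% errors before +80% errors). Faults are scoped to an explicit set of accounts, and the slide's "A -> D not * -> D" path scoping is not modeled here.

```python
import random
import time
from dataclasses import dataclass


class InjectedFailure(Exception):
    """Raised instead of calling the dependency when a fault is injected."""


@dataclass(frozen=True)
class ChaosStage:
    added_latency_ms: int  # extra delay added before the real call
    error_rate: float      # fraction of calls that fail, 0.0 to 1.0


# Hypothetical ramp built from the figures on slide 17.
RAMP = [
    ChaosStage(added_latency_ms=10, error_rate=0.0),
    ChaosStage(added_latency_ms=100, error_rate=0.0),
    ChaosStage(added_latency_ms=100, error_rate=0.01),
    ChaosStage(added_latency_ms=100, error_rate=0.80),
]


class ChaosInjector:
    """Wraps calls and injects latency and errors for targeted accounts only."""

    def __init__(self, target_accounts, ramp=RAMP, rng=None, sleep=time.sleep):
        self.target_accounts = set(target_accounts)
        self.ramp = list(ramp)
        self.stage_index = 0            # always start at the mildest stage
        self.rng = rng or random.Random()
        self.sleep = sleep              # injectable so tests need not wait

    @property
    def stage(self):
        return self.ramp[self.stage_index]

    def dial_up(self):
        """Move to the next, harsher stage; stays put at the last stage."""
        if self.stage_index < len(self.ramp) - 1:
            self.stage_index += 1
        return self.stage

    def call(self, account_id, func, *args, **kwargs):
        """Run func, injecting faults only for accounts in the target set."""
        if account_id in self.target_accounts:
            stage = self.stage
            self.sleep(stage.added_latency_ms / 1000.0)
            if self.rng.random() < stage.error_rate:
                raise InjectedFailure(f"injected failure for {account_id}")
        return func(*args, **kwargs)
```

A run might construct `ChaosInjector({"test-account"})`, route that account's dependency calls through `call`, watch dashboards and alerts, and invoke `dial_up()` only after the current stage has been absorbed cleanly. Keeping the blast radius to one account matches the slide's "account level" starting point.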
