
TechEvent 2019: Chaos Engineering - here we go; Lothar Wieske - Trivadis


  1. Chaos Engineering: Here We Go. Lothar Wieske (@lwieske), news.trivadis.com/blog
  2. Lothar: I am a solutions architect and digital disruptor. Since 2009, I have worked at the intersection of cloud and analytics. Digital disruption is coming to ever more sectors, and I want to understand its technological, societal, and economic impacts. Before 2009, I managed large project budgets, later became an architect, built a digital radiology system, and migrated Miles & More. @lwieske, news.trivadis.com/blog
  3. Cloud Computing and Cloud Native
  4. “The cloud isn’t a place, it’s a way of doing IT.” (Michael Dell)
  5. Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.
  6. Chaos Engineering
  7. Werner Vogels, Adrian Cockcroft
  8. Netflix and the cloud
     • 2012: Netflix open-sourced Chaos Monkey.
     • 2016: Netflix completed its transition to a 100% AWS infrastructure.
     • Cloud changed the way Netflix runs the company.
  9. Netflix Handled Amazon Maintenance Update
     • Amazon performed a major maintenance update at the end of September 2014 to patch a security vulnerability in the Xen hypervisor, affecting about 10% of its global fleet of cloud servers.
     • Netflix has a long history of using its Simian Army (Chaos Monkey, Gorilla and Kong) to force reboots of its servers in order to see how the overall system reacts and what can be done to improve resilience. The problem this time was that the operation would affect some of the database servers, more precisely 218 Cassandra nodes. It is one thing to perform a live restart of a server streaming a video; it is much more difficult to do the same to a stateful database.
     • Of the 2700+ production Cassandra nodes, 218 were rebooted.
     • 22 Cassandra nodes were on hardware that did not reboot successfully.
     • They were detected and replaced with minimal human intervention.
     • Netflix experienced 0 downtime that weekend.
  10. Diagram labels: Infrastructure, Switching, Application, People; Tools (three times); Chaos Engineering Team; Security Red Team
  11. Apache License 2.0, https://commons.wikimedia.org/w/index.php?curid=63503083
  12. PRINCIPLES OF CHAOS ENGINEERING
     • The following principles describe an ideal application of Chaos Engineering, applied to the processes of experimentation described above. The degree to which these principles are pursued strongly correlates to the confidence we can have in a distributed system at scale.
     • Build a Hypothesis around Steady State Behavior
     • Vary Real-world Events
     • Run Experiments in Production
     • Automate Experiments to Run Continuously
     • Minimize Blast Radius: Experimenting in production has the potential to cause unnecessary customer pain. While there must be an allowance for some short-term negative impact, it is the responsibility and obligation of the Chaos Engineer to ensure the fallout from experiments is minimized and contained.
  13. Takeaways
     • Chaos Engineering Is Not Just Tools.
     • Culture Is Part Of Your System.
     • Complexity Is Part Of Your System.
     • Testing In Production? Yes You Can!
     • You Should Chaos Engineer Everything: Cloud and Microservices, Among Others
  14. Workshops: Integration, Orientation, Elaboration, Conception. Topics: Cloud Native Leadership, Cloud Native Apps, Cloud Native Architectures, Teams & Skills, DevOps, Cloud Native Data, Cloud Native Journey, Cloud Native Landscape Walkthrough, Cloud Native Security, Cloud Native Lighthouse
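
The principles on slide 12 (steady-state hypothesis, real-world events, bounded blast radius) can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: `Node`, `Fleet`, `run_experiment`, and all parameter names are hypothetical and not taken from the slides, and the code does not reflect how Chaos Monkey or any Netflix tool is implemented. The "fleet" is an in-memory toy model; a real experiment would observe a production system through its monitoring.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one server in a fleet."""
    name: str
    healthy: bool = True


@dataclass
class Fleet:
    """Toy fleet model (assumption: health is a simple boolean per node)."""
    nodes: List[Node]

    def availability(self) -> float:
        # Steady-state metric: fraction of nodes currently healthy.
        return sum(n.healthy for n in self.nodes) / len(self.nodes)

    def replace_unhealthy(self) -> None:
        # Stand-in for automated remediation that replaces failed nodes.
        for n in self.nodes:
            n.healthy = True


def run_experiment(fleet, blast_radius, min_availability, rng):
    """Take a bounded sample of nodes down and check the hypothesis."""
    # Minimize blast radius: touch at most this share of the fleet.
    sample_size = max(1, int(len(fleet.nodes) * blast_radius))
    victims = rng.sample(fleet.nodes, sample_size)

    # Vary real-world events: simulate forced reboots of the sampled nodes.
    for node in victims:
        node.healthy = False

    # Hypothesis: availability stays at or above the agreed threshold.
    observed = fleet.availability()
    holds = observed >= min_availability

    # Restore the fleet so the experiment leaves no lasting damage.
    fleet.replace_unhealthy()
    return {
        "victims": [n.name for n in victims],
        "availability_during": observed,
        "hypothesis_holds": holds,
    }


fleet = Fleet([Node(f"node-{i}") for i in range(20)])
result = run_experiment(fleet, blast_radius=0.1, min_availability=0.85,
                        rng=random.Random(42))
print(result)
```

With 20 nodes and a blast radius of 0.1, two nodes are taken down, so availability during the experiment is 0.9 and the hypothesis (at least 0.85) holds. Passing the random generator in explicitly keeps runs reproducible, which matters once experiments are automated to run continuously.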
