NETFLIX’S CHAOS MONKEY
Michael Whitehead
“EVERYTHING FAILS ALL THE TIME”
- WERNER VOGELS
CHAOS MONKEY
A service that causes failure and wreaks havoc on instances in Auto Scaling Groups
A member of the Simian Army developed by Netflix
WHY WOULD WE INTENTIONALLY CAUSE FAILURE?!?
 It is inevitable
 Infrastructure is Complex
 Forcing failure puts you in control
 Identify faults in your architecture
• Do your load balancers reroute traffic correctly?
• Do your instances function correctly when they come back up?
• Are your monitoring tools alerting you on important events?
GETTING STARTED WITH CHAOS MONKEY
 Amazon Web Services
 Must be using Auto Scaling Groups
 Uses Amazon SimpleDB for event storage
 Simple Email Service setup (optional for notifications)
 Can be used with Netflix’s Asgard (optional)
 Java 7 JDK or newer
EXAMPLE WITH CLOUDFORMATION
BUILDING & CONFIGURATION
 Clone the SimianArmy repo from GitHub
 Builds using Gradle
 By default, runs 6 times a day during business hours (9am to 3pm)
 Does not run on holidays or weekends
 Timeframes and frequency of runs can be configured
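The scheduling rules above can be sketched as a small check. This is an illustrative Python sketch, not SimianArmy code: the function name `monkey_may_run`, its parameters, and the choice to treat 3pm as an exclusive closing hour are all assumptions made for the example.

```python
from datetime import date, datetime

# Defaults taken from the slide: business hours 9am to 3pm, weekdays only.
OPEN_HOUR = 9
CLOSE_HOUR = 15


def monkey_may_run(now: datetime,
                   holidays: frozenset = frozenset(),
                   open_hour: int = OPEN_HOUR,
                   close_hour: int = CLOSE_HOUR) -> bool:
    """Return True if a run is allowed at `now` (hypothetical helper)."""
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    if now.weekday() >= 5:
        return False
    # Skip any date the operator has listed as a holiday.
    if now.date() in holidays:
        return False
    # Assumption: the closing hour itself is excluded.
    return open_hour <= now.hour < close_hour
```

Passing different `open_hour`, `close_hour`, or `holidays` values mirrors the point that the timeframe is configurable.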
IMPORTANT PROPERTIES
 Enabling Chaos Monkey
 Set simianarmy.chaos.enabled = true
 Set simianarmy.chaos.leashed = false
 Probability of 1 instance being terminated per day per ASG
 simianarmy.chaos.ASG.probability = 1.0
 Opt-in or opt-out model
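To see how these three properties interact, here is a rough Python sketch. It is not the SimianArmy implementation. The helper names `parse_properties` and `will_terminate` are hypothetical, and two behaviours are assumptions: that a leashed monkey only reports what it would do without terminating anything, and that the daily probability is spread evenly across the runs in a day.

```python
import random


def parse_properties(text: str) -> dict:
    """Parse simple 'key = value' lines, ignoring blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def will_terminate(props: dict, runs_per_day: int = 6, rng=random.random) -> bool:
    """Decide whether a single run terminates an instance (illustrative only)."""
    enabled = props.get("simianarmy.chaos.enabled", "false").lower() == "true"
    leashed = props.get("simianarmy.chaos.leashed", "true").lower() == "true"
    # Assumption: a disabled or leashed monkey never terminates anything.
    if not enabled or leashed:
        return False
    daily = float(props.get("simianarmy.chaos.ASG.probability", "1.0"))
    # Assumption: the per-day probability is divided evenly over the runs.
    return rng() < daily / runs_per_day
```

Under these assumptions, a probability of 1.0 with 6 runs a day gives each run roughly a 1 in 6 chance of terminating an instance.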
OPT-IN / OPT-OUT MODEL
 Set to false = opt-in; set to true = opt-out
 simianarmy.chaos.ASG.enabled = false
 When opt-in (false), you must enable each Auto Scaling Group you want Chaos Monkey to run in
 simianarmy.chaos.ASG.<<auto scaling group name>>.enabled = true
 When opt-out (true), you must disable each Auto Scaling Group you do not want it to run in
 simianarmy.chaos.ASG.<<auto scaling group name>>.enabled = false
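The opt-in / opt-out rule boils down to "a per-group setting wins, otherwise the default applies". A minimal Python sketch of that lookup follows. The function `group_enabled` and its `key_template` parameter are hypothetical, and the per-group key layout `simianarmy.chaos.ASG.<name>.enabled` is an assumption; check it against the Chaos Settings wiki page for your SimianArmy version.

```python
def group_enabled(props: dict,
                  asg_name: str,
                  default_key: str = "simianarmy.chaos.ASG.enabled",
                  key_template: str = "simianarmy.chaos.ASG.{name}.enabled") -> bool:
    """Return True if Chaos Monkey should act on the given Auto Scaling Group."""
    # The global default decides the model: "false" is opt-in, "true" is opt-out.
    default = props.get(default_key, "false")
    # A per-group entry, if present, overrides the default.
    value = props.get(key_template.format(name=asg_name), default)
    return value.strip().lower() == "true"
```

For example, with the default set to false and only a hypothetical group `web-asg` enabled, `group_enabled(props, "web-asg")` is true and every other group is left alone.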
EMAIL NOTIFICATIONS
ARE TERMINATIONS ALL IT CAN DO?
 Block all network traffic
 Burn CPU
 Burn IO
 Fill Disk
 Kill Processes
 Network Loss
 Null-Route
• All EC2 <-> EC2 traffic
SSH REQUIRED
 Detach all EBS volumes
 Fail DNS
 Fail EC2 API
 Fail S3 API
 Fail DynamoDB API
 Network Corruption
 Network Latency
LINKS
 CloudFormation Template:
https://github.com/joehack3r/aws/blob/master/cloudformation/templates/chaosMonkey.json
 Chaos Monkey Announcement:
http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html
 Simian Army Quick Start Guide:
https://github.com/Netflix/SimianArmy/wiki/Quick-Start-Guide
 Chaos Monkey Configuration:
https://github.com/Netflix/SimianArmy/wiki/Chaos-Settings
 Chaos Monkey Army:
https://github.com/Netflix/SimianArmy/wiki/The-Chaos-Monkey-Army
