CONTINUOUS
DELIVERY WHILE
MINIMISING
PERFORMANCE
RISKS
INTRODUCTION
OBJECTIVE

Put working software into production as quickly as possible,
whilst minimising risk of load-related problems:

›   Bad response times
›   Insufficient capacity
›   Availability too low
›   Excessive system resource use
PREVENTING RISK IS A BIG SUBJECT,
WHAT FOLLOWS IS TAKEN FROM OUR EXPERIENCE




     Photo by chillihead: www.flickr.com/photos/chillihead/1778980935
CONTINUOUS DELIVERY LITERATURE PROVIDES METHODS
THAT HELP REDUCE RISK

›   Blue-green deployments
›   Dark launching
›   Feature toggles
›   Canary releasing
›   Production immune systems




                       Jez Humble, http://continuousdelivery.com
BLUE-GREEN DEPLOYMENTS

[Diagram: Amazon Route 53 in front of two Elastic Load Balancers. One fronts the instances running version n, the other fronts the instances running version n+1.]
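A blue-green switch can be sketched as a change of routing weights. This is a minimal illustration only: the `Router` class and the environment names `blue` and `green` are hypothetical stand-ins, not the Route 53 API, and the health flag is assumed to come from whatever checks the team already runs.

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Hypothetical stand-in for weighted DNS routing (not the Route 53 API)."""
    weights: dict  # environment name -> share of traffic, 0..100

    def switch(self, live: str, idle: str, healthy: bool) -> str:
        """Send all traffic to `idle` if it is healthy; otherwise stay on `live`."""
        if not healthy:
            return live
        self.weights[live], self.weights[idle] = 0, 100
        return idle

    def rollback(self, previous: str, current: str) -> str:
        """Send all traffic back to the previous environment."""
        self.weights[current], self.weights[previous] = 0, 100
        return previous


router = Router({"blue": 100, "green": 0})   # blue runs version n
live = router.switch("blue", "green", healthy=True)  # green runs version n+1
```

Because the old environment stays up after the switch, rolling back is the same operation in the other direction.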
DARK LAUNCHING

[Diagram, built up over three slides: first the web page and the DB, then the web page, the DB and a Weather SP.]
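One way to picture a dark launch: the page already calls the new weather provider in production, but the result is not shown to users yet. The sketch below is an assumption about how that could look; `render_page`, `load_from_db` and `fetch_weather` are hypothetical names, and a real implementation would probably make the dark call asynchronously so it cannot slow the page down.

```python
import time


def render_page(load_from_db, fetch_weather, dark=True, stats=None):
    """Render the page; call the weather provider but hide its result while dark."""
    page = load_from_db()
    started = time.perf_counter()
    try:
        weather = fetch_weather()
        outcome = "ok"
    except Exception:
        # A failing dark feature must never break the page itself.
        weather = None
        outcome = "error"
    if stats is not None:
        # Record outcome and time spent so the new call can be judged under real load.
        stats.append((outcome, time.perf_counter() - started))
    if dark or weather is None:
        return page
    return f"{page} | weather: {weather}"
```

Switching `dark` to `False` later exposes a feature whose load behaviour has already been observed.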
FEATURE TOGGLES
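A feature toggle can be as small as a named boolean that is consulted at run time. The sketch below is illustrative only; the `FeatureToggles` class and the `new_checkout` flag are hypothetical, and real setups usually load the flags from configuration rather than hard-coding them.

```python
class FeatureToggles:
    """Minimal in-memory toggle store; unknown features are off by default."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled


def checkout(toggles: FeatureToggles) -> str:
    """Choose the code path at run time, so code can ship before it is switched on."""
    if toggles.is_enabled("new_checkout"):
        return "new checkout"
    return "old checkout"
```

Turning a toggle off again is a quick way to back out of a change that misbehaves under load, without a redeploy.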
CANARY RELEASING

[Diagram: a scale running from 0% to 100%.]
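A canary release sends a growing percentage of users to the new version. A minimal sketch, assuming users are assigned by hashing a user id into one of 100 buckets; the function name `in_canary` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def in_canary(user_id: str, percentage: int) -> bool:
    """Return True if this user falls inside the canary share of traffic.

    Hashing keeps the assignment stable: a user who sees the new version
    at 10% still sees it when the share is raised to 50%.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < percentage
```

Raising `percentage` step by step from 0 to 100 while watching the monitoring gives the gradual ramp the diagram suggests.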
PRODUCTION IMMUNE SYSTEMS
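A production immune system, as commonly described, watches production metrics after a release and reverts automatically when they degrade. The check below is one possible sketch of the decision step; the function name, the 1 percentage point tolerance and the minimum sample size are assumptions, not values from the talk.

```python
def should_roll_back(baseline_errors, baseline_requests,
                     new_errors, new_requests,
                     tolerance=0.01, min_requests=100):
    """Decide whether the new version's error rate is clearly worse than the baseline."""
    if new_requests < min_requests:
        # Too little traffic to judge; keep observing.
        return False
    baseline_rate = baseline_errors / baseline_requests if baseline_requests else 0.0
    new_rate = new_errors / new_requests
    return new_rate > baseline_rate + tolerance
```

The same comparison could be applied to response times or resource use instead of error rates.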
USE CONTROLLED LOAD TESTING TO HELP CAPACITY
PLANNING

[Diagram: Amazon Route 53 and an Elastic Load Balancer in front of three instances, backed by an RDS DB instance and an RDS DB instance read replica.]
WORK WITH FAILURE

›   Optimise for MTTD and MTTR, not MTBF
›   Game day exercises
›   Chaos monkey
›   Go / NoGo meetings
›   Retrospectives
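To optimise for MTTD (mean time to detect) and MTTR (mean time to recover), both have to be measured. A minimal sketch, assuming each incident records when it started, when it was detected and when it was resolved; here MTTR is measured from the start of the incident, which is one of several definitions in use.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Incident:
    started: datetime
    detected: datetime
    resolved: datetime


def mttd_minutes(incidents):
    """Mean time from the start of an incident until it was detected."""
    return mean([(i.detected - i.started).total_seconds() / 60 for i in incidents])


def mttr_minutes(incidents):
    """Mean time from the start of an incident until it was resolved."""
    return mean([(i.resolved - i.started).total_seconds() / 60 for i in incidents])
```

Tracking these two numbers over time shows whether game days, better monitoring and retrospectives are paying off.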
BUT LEGACY SYSTEMS OFTEN LACK THE REQUIRED
RESILIENCE
WHILE WE WORK ON OUR RESILIENCE, WE USE LOAD TESTS
TO HELP IDENTIFY THE BIGGEST RISKS
PRE-PROD LOAD TESTING IS NOT FREE


›   Extra code to maintain

›   Test runs usually last several hours

›   A production-like environment is expensive

›   Realistic testing is hard

›   Not all developers like writing (performance) tests
USE IT WISELY, WHERE PRODUCTION TESTING IS STILL
INAPPROPRIATE

›   It provides no guarantee

›   Use it to find any showstoppers you can

›   Essentially, an optional service that teams can use
USE IT AS A PLAYGROUND TO TRY RISKY CHANGES




        Photo by vastateparksstaff: www.flickr.com/photos/vastateparksstaff/5330257235
[Diagram: pipeline stages and how often each runs. Build, unit test, etc.: very often. Functional integration tests: less often. Load tests: at least once a day (at night).]

[Same diagram, with a "load test script check" stage added to the pipeline.]
THE AIM IS NOT PERFECTION, GO FOR “AS REALISTIC AS
NEEDED”
SET UP TEST DATA AT THE WEEKEND TO MINIMISE
DISRUPTION
WHEN IS A PROBLEM REALLY A PROBLEM?
FIND AN OBJECTIVE WAY TO JUDGE YOUR FINDINGS
ESTABLISH REQUIREMENTS TO MAKE CLEAR WHAT IS
ACCEPTABLE

›   Seen from the main stakeholders’ perspective
    – Response time: users
    – System resources: ops
    – Capacity: business
›   Specific
›   Measurable
›   Achievable
›   Relevant
Concurrent users

Fail: < 100k     Now: 150k     Target: 200k

Stakeholder: Business
Intention: The website should at least be able to manage our typical daily load, but we would like some margin for growth and marketing campaigns.
Scale: Maximum load in a day, while response times are still according to spec.
Meter: Session table row count.
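A requirement in this form can be captured as data, so that a measurement is judged the same way every time. The sketch below uses the figures from the slide; the `Requirement` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    stakeholder: str
    scale: str
    meter: str
    fail_below: int   # anything lower than this is a failure
    target: int       # the level we would like to reach

    def status(self, measured: int) -> str:
        """Judge a measurement for a higher-is-better requirement."""
        if measured < self.fail_below:
            return "fail"
        if measured >= self.target:
            return "target met"
        return "acceptable"


concurrent_users = Requirement(
    name="Concurrent users",
    stakeholder="Business",
    scale="Maximum load in a day, while response times are still according to spec",
    meter="Session table row count",
    fail_below=100_000,
    target=200_000,
)
```

With the current value of 150k from the slide, `concurrent_users.status(150_000)` lands between the fail level and the target.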
SO USE A REAL BROWSER TO TEST
A REAL USER’S EXPERIENCE
Response time   Fail      [Today]    Target
Homepage.FV     > 6 sec    3.9 sec    2 sec
Homepage.RV     > 5 sec    2.8 sec    1 sec
Checkout.FV     > 8 sec    6.5 sec    2 sec
Details.FV      > 6 sec    1.9 sec    2 sec
Details.RV      > 5 sec    1.7 sec    1 sec
Search.FV       > 6 sec    4.8 sec    2 sec
Search.RV       > 5 sec    3.7 sec    1 sec
Cart.FV         > 6 sec    4.4 sec    2 sec
Cart.RV         > 5 sec    3.4 sec    1 sec
LoginForm.FV    > 6 sec    3.5 sec    2 sec
LoginForm.RV    > 5 sec    2.5 sec    1 sec
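The same judgement can be automated for response times, where lower is better. The sketch below takes three rows from the table above; FV and RV presumably stand for first view and repeat view, which is an assumption, and the function names are hypothetical.

```python
# (page, fail above [sec], measured today [sec], target [sec]); rows from the table above
ROWS = [
    ("Homepage.FV", 6.0, 3.9, 2.0),
    ("Checkout.FV", 8.0, 6.5, 2.0),
    ("Details.FV", 6.0, 1.9, 2.0),
]


def judge(measured, fail_above, target):
    """Judge one response time against its fail level and target."""
    if measured > fail_above:
        return "fail"
    if measured <= target:
        return "target met"
    return "acceptable"


def report(rows):
    """Map each page to its verdict."""
    return {page: judge(measured, fail_above, target)
            for page, fail_above, measured, target in rows}
```

A nightly run could feed fresh measurements into `report` and flag only the pages whose verdict changed.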
TO MAKE COMPARISONS MEANINGFUL, MAKE YOUR TESTS
DETERMINISTIC

Stub systems that you have no control over
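A stub for an external system can return a canned response after a fixed delay, so that two test runs differ only because the code under test changed. This is a minimal sketch; the `WeatherStub` class, its method and the 50 ms default latency are hypothetical.

```python
import time


class WeatherStub:
    """Deterministic stand-in for an external system we do not control."""

    def __init__(self, response, latency_seconds=0.05, sleep=time.sleep):
        self.response = response
        self.latency_seconds = latency_seconds
        self._sleep = sleep  # injectable, so the stub itself can be tested quickly
        self.calls = 0

    def get_forecast(self, city: str) -> dict:
        """Wait for the fixed latency, then return the same canned answer every time."""
        self._sleep(self.latency_seconds)
        self.calls += 1
        return dict(self.response, city=city)
```

Setting the latency to a value observed in production keeps the stub realistic enough without bringing back the variance of the real system.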
LOAD TESTING SHOULD BE OPTIONAL: THE ONLY THING
THAT COUNTS IS PRODUCTION!

›   Your definition of done should reflect that

›   The aim is to get early feedback from a safe environment
ANYTHING YOU FIND IS AN OPPORTUNITY TO FIX MORE
THAN ONE PROBLEM
SO WHAT MONITORING IS TYPICALLY NEEDED?

›   Be able to localise where latency is coming from!
    – For every system, all incoming and outgoing calls (count and
       time spent stats)
›   Finite resources (pools, CPU, I/O, etc.)
›   Number of active users
›   Response size, where possible
›   Add whatever you need


Monitoring should be identical in all environments!
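Count and time-spent statistics per call can be collected with a small wrapper around every incoming and outgoing call. The sketch below keeps the numbers in memory; the `monitored` decorator and the `CALL_STATS` store are hypothetical, and a real setup would ship the figures to the monitoring system instead.

```python
import time
from collections import defaultdict
from functools import wraps

# call name -> number of calls and total time spent
CALL_STATS = defaultdict(lambda: {"count": 0, "total_seconds": 0.0})


def monitored(name):
    """Record count and time spent for every call, including calls that fail."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                stats = CALL_STATS[name]
                stats["count"] += 1
                stats["total_seconds"] += time.perf_counter() - started
        return wrapper
    return decorator


@monitored("db.query")
def query_db():
    return ["row"]
```

Using the same wrapper in every environment makes load test results directly comparable with production figures.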
CONCLUSION

In order to put code live without pre-prod load testing, at least
the following need to be in place:
› Culture
› State-of-the-art monitoring
› Resilience
Without these, support your continuous delivery process with
optional load tests and strong specs.

Use the load tests to identify some pain points, so you can
modify the code and add monitoring, making it safer to do
(incremental) dark releases and canary testing in production.
QUESTIONS?



             athomas@xebia.com
             @a32an
             www.xebia.com
             blog.xebia.com

             (we’re hiring)

Continuous delivery while minimizing performance risks (Dutch Web Ops meetup)
