10 deploys per day
Dev & ops cooperation at Flickr


John Allspaw & Paul Hammond
         Velocity 2009
3 billion photos

               40,000 photos per second




                     http://flickr.com/photos/jimmyroq/415506...
Dev versus Ops
“It’s not my machines,
     it’s your code!”
“It’s not my code,
it’s your machines!”
Spock Scotty
        Little bit weird   Pulls levers & turns knobs
Sits closer to the boss    Easily excited
       Thinks...
Says “No” all the time
Afraid that new fangled things will break the site
                  Fingerpointy
Ops stereotype

            Because the site breaks
                unexpectedly


                                      B...
http://www.flickr.com/photos/stewart/461099066/




Traditional thinking

     Dev’s job is to add new features
Ops’ job is...
Ops’ job is NOT to keep the site stable and fast
Ops’ job is to enable the business
           (this is dev’s job too)
The business requires change
But change is the root cause of most outages!
Discourage change in the interests of stability
                    or
Allow change to happen as often as it needs to
Lowering risk of change
through tools and culture
Dev and Ops
Ops who think like devs
Devs who think like ops
“But that’s me!”
You can always think more like them
Tools
1. Automated infrastructure
       If there is only one thing you do…
CFengine
Chef
       BCfg2                                  FAI
1. Automated infrastructure
         If there is only one ...
Role &
configuration
management

OS imaging
2. Shared version control
Everyone knows where to look
           http://www.flickr.com/photos/thunderchild5/1330744559/
3. One step build
3. One step build
   and deploy
[2009-06-22 16:03:57] [harmes] site deployed (changes...)




        Who? When? What?
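
The log line on this slide answers three questions: who deployed, when, and what changed. A minimal sketch of producing such a line, assuming a hypothetical helper named deploy_log_line (the slides do not show how Flickr's deploy tool writes its log):

```python
from datetime import datetime


def deploy_log_line(who: str, when: datetime, what: str) -> str:
    """Format one deploy record as '[when] [who] site deployed (what)'."""
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return f"[{stamp}] [{who}] site deployed ({what})"


# Rebuild the line from the slide; "changes..." is elided on the slide too.
line = deploy_log_line("harmes", datetime(2009, 6, 22, 16, 3, 57), "changes...")
print(line)
```

Keeping the format to one line per deploy makes it easy to paste into a chat channel or grep after an incident.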
Small frequent changes
              http://www.flickr.com/photos/mauren/2429240906/
4. Feature flags
(aka branching in code)
1.0.1         1.0.2



1.0   1.1           1.2


                          1.1.1


Desktop software
r2301   r2302   r2306




   Web software
http://www.flickr.com/photos/8720628@N04/2188922076/




Always ship trunk
Everyone knows exactly where to look
              http://www.flickr.com/photos/thunderchild5/1330744559/
Feature flags

# php
if ($cfg['enable_feature_video']) {
    …
}

{* smarty *}
{if $cfg.enable_feature_beehive}
    …
{/if}
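
The same idea, sketched in Python for readers who want something runnable: a flag table is consulted at the point where the feature would appear, so unfinished code can ship in trunk while staying switched off. The flag names mirror the slide; the helper is_enabled, the function render_photo_page, and the page parts are hypothetical and not taken from Flickr's code.

```python
# Hypothetical flag table, mirroring the $cfg array on the slide.
cfg = {
    "enable_feature_video": True,
    "enable_feature_beehive": False,
}


def is_enabled(flag: str) -> bool:
    """Unknown flags count as off, so new code stays dark by default."""
    return cfg.get(flag, False)


def render_photo_page() -> list:
    """Assemble the page, including a feature only when its flag is on."""
    parts = ["photo"]
    if is_enabled("enable_feature_video"):
        parts.append("video player")
    if is_enabled("enable_feature_beehive"):
        parts.append("beehive")
    return parts


print(render_photo_page())
```

Turning a feature on or off is then a configuration change rather than a merge, which is what lets every deploy come from trunk.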
http://www.flickr.com/photos/healthserviceglasses/3522809727/




                     Private betas
Bucket testing




http://www.flickr.com/photos/davidw/2063575447/
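
The slides do not say how users are assigned to buckets. One common approach, shown here purely as an assumption, is to hash a stable user identifier so the same user always lands in the same bucket. The names bucket and in_test_group and the experiment label are hypothetical.

```python
import hashlib


def bucket(user_id: str, experiment: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket number in the range [0, buckets)."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % buckets


def in_test_group(user_id: str, experiment: str, percent: int) -> bool:
    """True when the user falls in the first `percent` of 100 buckets."""
    return bucket(user_id, experiment) < percent


# The assignment is deterministic, so repeated calls agree.
print(bucket("alice", "new_uploader") == bucket("alice", "new_uploader"))
```

Including the experiment name in the hash keeps bucket membership independent across experiments.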
http://www.flickr.com/photos/jking89/3031204314/




         Dark launches
Free
contingency
switches
              http://www.flickr.com/photos/flattop341/260207875/
5. Shared metrics
Application level metrics
Application level metrics
Adaptive feedback loops




         RU ok?
App                System Metrics
          maybe?
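
One reading of this slide is that the application asks the system metrics whether the machines are healthy and adjusts its own behaviour to the answer. A minimal sketch under that assumption; the function choose_batch_size, the load threshold, and the batch sizes are all hypothetical values chosen for illustration.

```python
def choose_batch_size(load_average: float,
                      normal: int = 100,
                      reduced: int = 10,
                      threshold: float = 4.0) -> int:
    """Back off to a smaller batch when the reported load is above threshold."""
    return reduced if load_average > threshold else normal


# Simulated metric readings; a real app would read these from monitoring.
for load in (0.8, 3.9, 6.5):
    print(load, choose_batch_size(load))
```

The point is only that the metric feeds back into the application's decision instead of being watched by a human.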
6. IRC and IM robots
Dev, Ops, and Robots
              Having a conversation

      build
                       deploy
      logs
           ...
Culture
1. Respect
If there is only one thing you do…
Don’t
 stereotype
 (not all developers are lazy)



http://www.flickr.com/photos/aaronjacobs/64368770/
http://www.flickr.com/photos/chrisdag/2286198568/




         Respect other people’s expertise,
             opinions and ...
http://www.flickr.com/photos/jwheare/2580631103/




 Don’t just say “No”
http://www.flickr.com/photos/alancleaver/2661424637/



                                                Don’t hide things
Developers: Talk to ops about the impact of your code:

• what metrics will change, and how?
• what are the risks?
• what ...
2. Trust
Ops needs to trust dev to involve
them on feature discussions

Dev needs to trust ops to discuss
infrastructure changes

E...
http://www.flickr.com/photos/flattop341/224176602/




   Shared runbooks & escalation plans
http://www.flickr.com/photos/telstar/2861103147/




  Provide knobs and levers
http://www.flickr.com/photos/williamhook/3468484351/




  Ops: Be transparent,
  give devs access to systems
3. Healthy attitude
   about failure
http://www.flickr.com/photos/pinksherbet/447190603/




Failure will happen
If you think you can prevent failure then
         you aren’t developing your ability to respond




http://www.flickr.com/...
http://www.flickr.com/photos/changereality/2349538868/
Fire drills



http://www.flickr.com/photos/dnorman/2678090600
4. Avoiding Blame
No fingerpointing




http://www.flickr.com/photos/rocketjim54/2955889085/
Fingerpointyness

problem!!!
 argggh!                                                    fixed.



     freaking out, blami...
Being productive

problem!!!
 argggh!                  fixed.



      figuring it   fixing things   feeling move
         ou...
Developers: Remember that someone else will
  probably get woken up when your code breaks




http://www.flickr.com/photos/...
http://www.flickr.com/photos/allspaw/2819774755/




  Ops: provide
  constructive
  feedback on
  current aches
  and pains
1. Automated infrastructure
2. Shared version control
3. One step build and deploy
4. Feature flags
5. Shared metrics
6. IR...
This is not easy
You could just carry on shouting at each other…
(Thank you)
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr

Communications and cooperation between development and operations isn't optional, it's mandatory. Flickr takes the idea of "release early, release often" to an extreme - on a normal day there are 10 full deployments of the site to our servers. This session discusses why this rate of change works so well, and the culture and technology needed to make it possible.

Published in: Technology
13 comments
  • Great presentation ... I was wondering if Flickr uses staging or code review? And what types of testing, such as integration testing, functional testing, unit testing, or A/B testing, are used?
  • This is hilarious, you say no finger-pointing, but there is a clear message in this presentation that 'Ops' are the problem, with only a small nod to how developers can help. The general theme is that Ops are the ones causing problems with several slides focussing on this with no equivalent developers slide.
  • http://blip.tv/oreilly-velocity-conference/velocity-09-john-allspaw-10-deploys-per-day-dev-and-ops-cooperation-at-flickr-2297883
  • infra
Transcript of "10+ Deploys Per Day: Dev and Ops Cooperation at Flickr"

  1. 10 deploys per day. Dev & ops cooperation at Flickr. John Allspaw & Paul Hammond, Velocity 2009
  2. 3 billion photos. 40,000 photos per second. http://flickr.com/photos/jimmyroq/415506736/
  3. Dev versus Ops
  4. “It’s not my machines, it’s your code!”
  5. “It’s not my code, it’s your machines!”
  6. Spock: Little bit weird; Sits closer to the boss; Thinks too hard. Scotty: Pulls levers & turns knobs; Easily excited; Yells a lot in emergencies
  7. Says “No” all the time; Afraid that new fangled things will break the site; Fingerpointy
  8. Ops stereotype: Because the site breaks unexpectedly; Because no one tells them anything; Because They say “NO” all the time
  9. http://www.flickr.com/photos/stewart/461099066/ Traditional thinking: Dev’s job is to add new features; Ops’ job is to keep the site stable and fast
  10. Ops’ job is NOT to keep the site stable and fast
  11. Ops’ job is to enable the business (this is dev’s job too)
  12. The business requires change
  13. But change is the root cause of most outages!
  14. Discourage change in the interests of stability, or Allow change to happen as often as it needs to
  15. Lowering risk of change through tools and culture
  16. Dev and Ops
  17. Ops who think like devs; Devs who think like ops
  18. “But that’s me!”
  19. You can always think more like them
  20. Tools
  21. 1. Automated infrastructure. If there is only one thing you do…
  22. 1. Automated infrastructure. If there is only one thing you do… (CFengine, Chef, BCfg2, FAI, System Imager, Puppet, Cobbler)
  23. Role & configuration management; OS imaging
  24. 2. Shared version control
  25. Everyone knows where to look http://www.flickr.com/photos/thunderchild5/1330744559/
  26. 3. One step build
  27. 3. One step build and deploy
  28. [2009-06-22 16:03:57] [harmes] site deployed (changes...) Who? When? What?
  29. Small frequent changes http://www.flickr.com/photos/mauren/2429240906/
  30. 4. Feature flags (aka branching in code)
  31. 1.0.1 1.0.2 1.0 1.1 1.2 1.1.1 Desktop software
  32. r2301 r2302 r2306 Web software
  33. http://www.flickr.com/photos/8720628@N04/2188922076/ Always ship trunk
  34. Everyone knows exactly where to look http://www.flickr.com/photos/thunderchild5/1330744559/
  35. Feature flags #php if ($cfg['enable_feature_video']){ … } {* smarty *} {if $cfg.enable_feature_beehive} … {/if}
  36. http://www.flickr.com/photos/healthserviceglasses/3522809727/ Private betas
  37. Bucket testing http://www.flickr.com/photos/davidw/2063575447/
  38. http://www.flickr.com/photos/jking89/3031204314/ Dark launches
  39. Free contingency switches http://www.flickr.com/photos/flattop341/260207875/
  40. 5. Shared metrics
  41. Application level metrics
  42. Application level metrics
  43. Adaptive feedback loops (diagram labels: App, System Metrics, “RU ok?”, “maybe?”)
  44. 6. IRC and IM robots
  45. Dev, Ops, and Robots: Having a conversation. build deploy logs logs alerts monitors IRC search engine
  46. Culture
  47. 1. Respect. If there is only one thing you do…
  48. Don’t stereotype (not all developers are lazy) http://www.flickr.com/photos/aaronjacobs/64368770/
  49. http://www.flickr.com/photos/chrisdag/2286198568/ Respect other people’s expertise, opinions and responsibilities
  50. http://www.flickr.com/photos/jwheare/2580631103/ Don’t just say “No”
  51. http://www.flickr.com/photos/alancleaver/2661424637/ Don’t hide things
  52. Developers: Talk to ops about the impact of your code: • what metrics will change, and how? • what are the risks? • what are the signs that something is going wrong? • what are the contingencies? This means you need to work this out before talking to ops
  53. 2. Trust
  54. Ops needs to trust dev to involve them on feature discussions. Dev needs to trust ops to discuss infrastructure changes. Everyone needs to trust that everyone else is doing their best for the business http://www.flickr.com/photos/85128884@N00/2650981813/
  55. http://www.flickr.com/photos/flattop341/224176602/ Shared runbooks & escalation plans
  56. http://www.flickr.com/photos/telstar/2861103147/ Provide knobs and levers
  57. http://www.flickr.com/photos/williamhook/3468484351/ Ops: Be transparent, give devs access to systems
  58. 3. Healthy attitude about failure
  59. http://www.flickr.com/photos/pinksherbet/447190603/ Failure will happen
  60. If you think you can prevent failure then you aren’t developing your ability to respond http://www.flickr.com/photos/toms/2323779363/
  61. http://www.flickr.com/photos/changereality/2349538868/
  62. Fire drills http://www.flickr.com/photos/dnorman/2678090600
  63. 4. Avoiding Blame
  64. No fingerpointing http://www.flickr.com/photos/rocketjim54/2955889085/
  65. Fingerpointyness (timeline): problem!!! argggh! → freaking out, hiding → blaming, not talking, finding fault → whining, covering ass, hurt egos → figuring it out → fixing things → fixed.
  66. Being productive (timeline): problem!!! argggh! → figuring it out → fixing things → fixed. → feeling guilty → move on with life
  67. Developers: Remember that someone else will probably get woken up when your code breaks http://www.flickr.com/photos/alex-s/353218851/
  68. http://www.flickr.com/photos/allspaw/2819774755/ Ops: provide constructive feedback on current aches and pains
  69. Tools: 1. Automated infrastructure 2. Shared version control 3. One step build and deploy 4. Feature flags 5. Shared metrics 6. IRC and IM robots. Culture: 1. Respect 2. Trust 3. Healthy attitude about failure 4. Avoiding Blame
  70. This is not easy. You could just carry on shouting at each other…
  71. (Thank you)