SlideShare ist ein Scribd-Unternehmen logo
1 von 89
How not to release software
          Laura Thomson
        laura@mozilla.com
               @lxt
Scaling Mozilla’s Websites




     Laura Thomson
   laura@mozilla.com
Writing Maintainable
        PHP
        Laura Thomson
    PHP Quebec Conference
       16th March 2007
PHP and MySQL
 Best Practices
          Luke Welling
     http://lukewelling.com

         Laura Thomson
 http://www.laurathomson.com



  OSCON 2007 © Luke Welling and Laura Thomson   8
Building the
addons.mozilla.org API
         Laura Thomson
       laura@mozilla.com
Write Beautiful Code
         Laura Thomson
    laura@laurathomson.com
PHP
Best Practices
         Laura Thomson
    laura@laurathomson.com
 http://www.laurathomson.com



    © 2007 Luke Welling and Laura Thomson   11
Rewrite or Refactor
When to declare technical bankruptcy
Laura Thomson (laura@mozilla.com)
OSCON - July 22, 2010
                                       12
Firefox
Crash
Reporting

laura@ mozilla.com
@lxt
Conference speakers and book authors are
people too. Despite our smug demeanor,
we also fail.

Frequently.
Conference speakers and book authors are
people too. Despite our smug demeanor,
we also fail.

Frequently.
This talk sparked by the release-that-shall-
not-be named.

It went so badly that the version number
was retired.

After 24 hours, people started bringing in
casseroles.
This talk sparked by the release-that-shall-
not-be named.

It went so badly that the version number
was retired.

After 24 hours, people started bringing in
casseroles.
This talk sparked by the release-that-shall-
not-be named.

It went so badly that the version number
was retired.

After 24 hours, people started bringing in
casseroles.
Thus I present this catalogue of failure, a
compendium of things not to do when you
want to release any kind of software.
Methodology
Agile fail #1


Avoid estimation by working in short sprints
- surely we can estimate the next two
weeks, right?
Right?
Agile fail #1

Cynical basis of agile: estimation is intractable
Avoid estimation by working in short sprints
- surely we can estimate the next two
weeks, right?
Right?
Agile fail #1

Cynical basis of agile: estimation is intractable
Avoid estimation by working in short sprints
- surely we can estimate the next two
weeks, right?
Right?
Agile fail #1

Cynical basis of agile: estimation is intractable
Avoid estimation by working in short sprints
- surely we can estimate the next two
weeks, right?
Right?
Sprint fail #1: rollover
We will build these five thingies/bugs/stories
in the next two week sprint
Two are more complex than anticipated
One is blocked
Ship two, roll the other three over
Rinse, repeat
Sprint fail #1: rollover
We will build these five thingies/bugs/stories
in the next two week sprint
Two are more complex than anticipated
One is blocked
Ship two, roll the other three over
Rinse, repeat
Sprint fail #1: rollover
We will build these five thingies/bugs/stories
in the next two week sprint
Two are more complex than anticipated
One is blocked
Ship two, roll the other three over
Rinse, repeat
Sprint fail #1: rollover
We will build these five thingies/bugs/stories
in the next two week sprint
Two are more complex than anticipated
One is blocked
Ship two, roll the other three over
Rinse, repeat
Sprint fail #1: rollover
We will build these five thingies/bugs/stories
in the next two week sprint
Two are more complex than anticipated
One is blocked
Ship two, roll the other three over
Rinse, repeat
Sprint fail #2:
         Mara-sprint
We can’t get these five things done. Should
we ship with only two?
If we just push the deadline out a week,
everything will be fine.
And then another.
And then another.
Sprint fail #2:
         Mara-sprint
We can’t get these five things done. Should
we ship with only two?
If we just push the deadline out a week,
everything will be fine.
And then another.
And then another.
Sprint fail #2:
         Mara-sprint
We can’t get these five things done. Should
we ship with only two?
If we just push the deadline out a week,
everything will be fine.
And then another.
And then another.
Sprint fail #2:
         Mara-sprint
We can’t get these five things done. Should
we ship with only two?
If we just push the deadline out a week,
everything will be fine.
And then another.
And then another.
Sprint fail #3:
        Death sprints
We HAVE to get these five things done
The deadline is an immovable object
We’ll just work really really late or all night
every night for the next two weeks
And then the same thing again....and
again...and again
Sprint fail #3:
        Death sprints
We HAVE to get these five things done
The deadline is an immovable object
We’ll just work really really late or all night
every night for the next two weeks
And then the same thing again....and
again...and again
Sprint fail #3:
        Death sprints
We HAVE to get these five things done
The deadline is an immovable object
We’ll just work really really late or all night
every night for the next two weeks
And then the same thing again....and
again...and again
Sprint fail #3:
        Death sprints
We HAVE to get these five things done
The deadline is an immovable object
We’ll just work really really late or all night
every night for the next two weeks
And then the same thing again....and
again...and again
Things I’ve learned
Plan the dimension of flex and choose:
     Less functionality in the same time
     Flexible timelines or slower velocity
     Continuous deployment
     Train model
Coding
Coding fail #1:
Build a new bikeshed
Sitting down to code when you don’t know
what you’re doing
                    vs
Analysis paralysis: getting stuck talking
through a complex problem, over and over
Coding fail #1:
Build a new bikeshed
Sitting down to code when you don’t know
what you’re doing
                      vs
Analysis paralysis/bikeshedding: getting stuck
talking through a complex problem, over and
over
Coding fail #2:
      One True Ring
There’s one right way to solve this problem
Coding fail #2:
      One True Ring
There’s one right way to solve this problem
Complex
Coding fail #2:
      One True Ring
There’s one right way to solve this problem
Complex
Going to take a long time
Coding fail #2:
      One True Ring
There’s one right way to solve this problem
Complex
Going to take a long time
Scope keeps expanding
Coding fail #3:
    Over-engineering
Closely related to the One True Ring
Beautiful class hierarchy and architecture
Lovely diagrams
It doesn’t do anything
Coding fail #3:
    Over-engineering
Closely related to the One True Ring
Beautiful class hierarchy and architecture
Lovely diagrams
It doesn’t do anything
Coding fail #3:
    Over-engineering
Closely related to the One True Ring
Beautiful class hierarchy and architecture
Lovely diagrams
It doesn’t do anything
Coding fail #3:
    Over-engineering
Closely related to the One True Ring
Beautiful class hierarchy and architecture
Lovely diagrams
It doesn’t do anything
Things I’ve learned
Different people work in different ways.
Some people deal better with uncertainty.
Those people should build a prototype.
Do The Simplest Thing That Could Possibly
Work
Be happy with 60-80% solutions. Iterate.
Even if you’re not a startup, Minimum Viable
Product is valid.
Testing
Test fail #1:
The Null Hypothesis


Have no tests.
Test fail #2:
     Epic confidence

Have some tests, but no idea about
coverage.
Ensures you have a great level of over-
confidence going into a release.
Test fail #2:
     Epic confidence

Have some tests, but no idea about
coverage.
Ensures you have a great level of over-
confidence going into a release.
Test fail #3:
UR DOING IT RONG
Test the wrong things.
99% unit test coverage, no load tests.
Push something out and don’t realize that it’s
too slow, causes data or user loss, until too
late.
Test fail #3:
UR DOING IT RONG
Test the wrong things.
99% unit test coverage, no load tests.
Push something out and don’t realize that it’s
too slow, causes data or user loss, until too
late.
Test fail #3:
UR DOING IT RONG
Test the wrong things.
99% unit test coverage, no load tests.
Push something out and don’t realize that it’s
too slow, causes data or user loss, until too
late.
Things I’ve learned
Know your coverage and your limitations
Use CI
Use browser testing
Use load and smoke testing
Get QA’s opinion on whether you’re ready
to ship something
Release and
deployment
Deployment fail #1:
   Version chaos
Push master/trunk/head.
Have no other branches.
Have no tags.
Have things that aren’t under version
control.
Deployment fail #1:
   Version chaos
Push master/trunk/head.
Have no other branches.
Have no tags.
Have things that aren’t under version
control.
Deployment fail #1:
   Version chaos
Push master/trunk/head.
Have no other branches.
Have no tags.
Have things that aren’t under version
control.
Deployment fail #1:
   Version chaos
Push master/trunk/head/.
Have no other branches.
Have no tags.
Have things that aren’t under version
control.
Deployment fail #2:
Fail forward, fail back

Have no rollback plan.
Have no ability to fail forward, either.
If failing forward fails, have no backup plan.
Deployment fail #2:
Fail forward, fail back

Have no rollback plan.
Have no ability to fail forward, either.
If failing forward fails, have no backup plan.
Deployment fail #2:
Fail forward, fail back

Have no rollback plan.
Have no ability to fail forward, either.
If failing forward fails, have no backup plan.
Deployment fail #3:
   Manual pushes
Release by manually logging into each
machine and updating a checkout.
This is even better under high load.
Scaling is something best not talked about.
Deployment fail #3:
   Manual pushes
Release by manually logging into each
machine and updating a checkout.
This is even better under high load.
Scaling is something best not talked about.
Deployment fail #3:
   Manual pushes
Release by manually logging into each
machine and updating a checkout.
This is even better under high load.
Scaling is something best not talked about.
Deployment fail #4:
ALTER TABLE ADD FAIL
Require manual changes to databases. Have
no record of the changes in version control.
Don’t estimate how long these changes will
take, how much they will degrade
performance, or how much downtime will be
required before kicking them off.
Deployment fail #4:
ALTER TABLE ADD FAIL
Require manual changes to databases. Have
no record of the changes in version control.
Don’t estimate how long these changes will
take, how much they will degrade
performance, or how much downtime will be
required before kicking them off.
Deployment fail #5:
  Config of Doom

Require manual changes to configuration,
especially if there are more than two
machines involved - 30 or more is especially
good.
Deployment fail #6:
Single Person Of Fail

Have only one person that knows how to
deploy your stuff.
(Don’t document any of it.)
Deployment fail #6:
Single Person Of Fail

Have only one person that knows how to
deploy your stuff.
(Don’t document any of it.)
Deployment fail #7:
 It Worked In Staging
This is the upscale version of “it worked on
my laptop”
Deployment fail #7:
 It Worked In Staging
This is the upscale version of “it worked on
my laptop”
Staging not the same as prod:
Deployment fail #7:
 It Worked In Staging
This is the upscale version of “it worked on
my laptop”
Staging not the same as prod:
Different OS, packages, hardware
Deployment fail #7:
 It Worked In Staging
This is the upscale version of “it worked on
my laptop”
Staging not the same as prod:
Different OS, packages, hardware
Only one of something where there are >1
in prod (classic examples: db replication,
memcache)
Things I’ve learned
Have a release management system
Build the capability to deploy continuously
Automate everything: build, test, deploy
Use configuration management
Document everything
Load test when needed
Eliminate SPOFs, including people
Staging should reflect production
Operations
Ops fail #1:
         Devops Fail
Ops and Dev don’t speak to, work with, or
trust each other


Most communication takes place through
aggressive ticket filing and email
Ops fail #1:
         Devops Fail
Ops and Dev don’t speak to, work with, or
trust each other


Most communication takes place through
aggressive ticket filing and email
Ops fail #2:
      Dilettante devs
Stuff breaks at 2am
Devs have not documented anything and
aren’t available to fix it
Ops gets the blame for extended downtime
Ops fail #2:
      Dilettante devs
Stuff breaks at 2am
Devs have not documented anything and
aren’t available to fix it
Ops gets the blame for extended downtime
Ops fail #2:
      Dilettante devs
Stuff breaks at 2am
Devs have not documented anything and
aren’t available to fix it
Ops gets the blame for extended downtime
Ops fail #3:
 Telepathic troubleshooting

Stuff breaks
Devs have to troubleshoot via remote:
Did you look at this log, what did it say?
What is this config value set to?
Ops fail #3:
 Telepathic troubleshooting

Stuff breaks
Devs have to troubleshoot via remote:
Did you look at this log, what did it say?
What is this config value set to?
Things I’ve learned
Everybody who works on a system is
responsible for it
Cross train your teams
Open up systems to devs for read access to
config + logging
Troubleshoot together
Questions?
laura@mozilla.com

Weitere ähnliche Inhalte

Kürzlich hochgeladen

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Kürzlich hochgeladen (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Empfohlen

How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
ThinkNow
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 

Empfohlen (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

How Not To Release Software (Open Source Bridge 2012)

  • 1. How not to release software Laura Thomson laura@mozilla.com @lxt
  • 2.
  • 3.
  • 4.
  • 5.
  • 6. Scaling Mozilla’s Websites Laura Thomson laura@mozilla.com
  • 7. Writing Maintainable PHP Laura Thomson PHP Quebec Conference 16th March 2007
  • 8. PHP and MySQL Best Practices Luke Welling http://lukewelling.com Laura Thomson http://www.laurathomson.com OSCON 2007 © Luke Welling and Laura Thomson 8
  • 9. Building the addons.mozilla.org API Laura Thomson laura@mozilla.com
  • 10. Write Beautiful Code Laura Thomson laura@laurathomson.com
  • 11. PHP Best Practices Laura Thomson laura@laurathomson.com http://www.laurathomson.com © 2007 Luke Welling and Laura Thomson 11
  • 12. Rewrite or Refactor When to declare technical bankruptcy Laura Thomson (laura@mozilla.com) OSCON - July 22, 2010 12
  • 14. Conference speakers and book authors are people too. Despite our smug demeanor, we also fail. Frequently.
  • 15. Conference speakers and book authors are people too. Despite our smug demeanor, we also fail. Frequently.
  • 16. This talk sparked by the release-that-shall- not-be named. It went so badly that the version number was retired. After 24 hours, people started bringing in casseroles.
  • 17. This talk sparked by the release-that-shall- not-be named. It went so badly that the version number was retired. After 24 hours, people started bringing in casseroles.
  • 18. This talk sparked by the release-that-shall- not-be named. It went so badly that the version number was retired. After 24 hours, people started bringing in casseroles.
  • 19. Thus I present this catalogue of failure, a compendium of things not to do when you want to release any kind of software.
  • 21. Agile fail #1 Avoid estimation by working in short sprints - surely we can estimate the next two weeks, right? Right?
  • 22. Agile fail #1 Cynical basis of agile: estimation is intractable Avoid estimation by working in short sprints - surely we can estimate the next two weeks, right? Right?
  • 23. Agile fail #1 Cynical basis of agile: estimation is intractable Avoid estimation by working in short sprints - surely we can estimate the next two weeks, right? Right?
  • 24. Agile fail #1 Cynical basis of agile: estimation is intractable Avoid estimation by working in short sprints - surely we can estimate the next two weeks, right? Right?
  • 25. Sprint fail #1: rollover We will build these five thingies/bugs/stories in the next two week sprint Two are more complex than anticipated One is blocked Ship two, roll the other three over Rinse, repeat
  • 26. Sprint fail #1: rollover We will build these five thingies/bugs/stories in the next two week sprint Two are more complex than anticipated One is blocked Ship two, roll the other three over Rinse, repeat
  • 27. Sprint fail #1: rollover We will build these five thingies/bugs/stories in the next two week sprint Two are more complex than anticipated One is blocked Ship two, roll the other three over Rinse, repeat
  • 28. Sprint fail #1: rollover We will build these five thingies/bugs/stories in the next two week sprint Two are more complex than anticipated One is blocked Ship two, roll the other three over Rinse, repeat
  • 29. Sprint fail #1: rollover We will build these five thingies/bugs/stories in the next two week sprint Two are more complex than anticipated One is blocked Ship two, roll the other three over Rinse, repeat
  • 30. Sprint fail #2: Mara-sprint We can’t get these five things done. Should we ship with only two? If we just push the deadline out a week, everything will be fine. And then another. And then another.
  • 31. Sprint fail #2: Mara-sprint We can’t get these five things done. Should we ship with only two? If we just push the deadline out a week, everything will be fine. And then another. And then another.
  • 32. Sprint fail #2: Mara-sprint We can’t get these five things done. Should we ship with only two? If we just push the deadline out a week, everything will be fine. And then another. And then another.
  • 33. Sprint fail #2: Mara-sprint We can’t get these five things done. Should we ship with only two? If we just push the deadline out a week, everything will be fine. And then another. And then another.
  • 34. Sprint fail #3: Death sprints We HAVE to get these five things done The deadline is an immovable object We’ll just work really really late or all night every night for the next two weeks And then the same thing again....and again...and again
  • 35. Sprint fail #3: Death sprints We HAVE to get these five things done The deadline is an immovable object We’ll just work really really late or all night every night for the next two weeks And then the same thing again....and again...and again
  • 36. Sprint fail #3: Death sprints We HAVE to get these five things done The deadline is an immovable object We’ll just work really really late or all night every night for the next two weeks And then the same thing again....and again...and again
  • 37. Sprint fail #3: Death sprints We HAVE to get these five things done The deadline is an immovable object We’ll just work really really late or all night every night for the next two weeks And then the same thing again....and again...and again
  • 38. Things I’ve learned Plan the dimension of flex and choose: Less functionality in the same time Flexible timelines or slower velocity Continuous deployment Train model
  • 40. Coding fail #1: Build a new bikeshed Sitting down to code when you don’t know what you’re doing vs Analysis paralysis: getting stuck talking through a complex problem, over and over
  • 41. Coding fail #1: Build a new bikeshed Sitting down to code when you don’t know what you’re doing vs Analysis paralysis/bikeshedding: getting stuck talking through a complex problem, over and over
  • 42. Coding fail #2: One True Ring There’s one right way to solve this problem
  • 43. Coding fail #2: One True Ring There’s one right way to solve this problem Complex
  • 44. Coding fail #2: One True Ring There’s one right way to solve this problem Complex Going to take a long time
  • 45. Coding fail #2: One True Ring There’s one right way to solve this problem Complex Going to take a long time Scope keeps expanding
  • 46. Coding fail #3: Over-engineering Closely related to the One True Ring Beautiful class hierarchy and architecture Lovely diagrams It doesn’t do anything
  • 47. Coding fail #3: Over-engineering Closely related to the One True Ring Beautiful class hierarchy and architecture Lovely diagrams It doesn’t do anything
  • 48. Coding fail #3: Over-engineering Closely related to the One True Ring Beautiful class hierarchy and architecture Lovely diagrams It doesn’t do anything
  • 49. Coding fail #3: Over-engineering Closely related to the One True Ring Beautiful class hierarchy and architecture Lovely diagrams It doesn’t do anything
  • 50. Things I’ve learned Different people work in different ways. Some people deal better with uncertainty. Those people should build a prototype. Do The Simplest Thing That Could Possibly Work Be happy with 60-80% solutions. Iterate. Even if you’re not a startup, Minimum Viable Product is valid.
  • 52. Test fail #1: The Null Hypothesis Have no tests.
  • 53. Test fail #2: Epic confidence Have some tests, but no idea about coverage. Ensures you have a great level of over- confidence going into a release.
  • 54. Test fail #2: Epic confidence Have some tests, but no idea about coverage. Ensures you have a great level of over- confidence going into a release.
  • 55. Test fail #3: UR DOING IT RONG Test the wrong things. 99% unit test coverage, no load tests. Push something out and don’t realize that it’s too slow, causes data or user loss, until too late.
  • 56. Test fail #3: UR DOING IT RONG Test the wrong things. 99% unit test coverage, no load tests. Push something out and don’t realize that it’s too slow, causes data or user loss, until too late.
  • 57. Test fail #3: UR DOING IT RONG Test the wrong things. 99% unit test coverage, no load tests. Push something out and don’t realize that it’s too slow, causes data or user loss, until too late.
  • 58. Things I’ve learned Know your coverage and your limitations Use CI Use browser testing Use load and smoke testing Get QA’s opinion on whether you’re ready to ship something
  • 60. Deployment fail #1: Version chaos Push master/trunk/head. Have no other branches. Have no tags. Have things that aren’t under version control.
  • 61. Deployment fail #1: Version chaos Push master/trunk/head. Have no other branches. Have no tags. Have things that aren’t under version control.
  • 62. Deployment fail #1: Version chaos Push master/trunk/head. Have no other branches. Have no tags. Have things that aren’t under version control.
  • 63. Deployment fail #1: Version chaos Push master/trunk/head/. Have no other branches. Have no tags. Have things that aren’t under version control.
  • 64. Deployment fail #2: Fail forward, fail back Have no rollback plan. Have no ability to fail forward, either. If failing forward fails, have no backup plan.
  • 65. Deployment fail #2: Fail forward, fail back Have no rollback plan. Have no ability to fail forward, either. If failing forward fails, have no backup plan.
  • 66. Deployment fail #2: Fail forward, fail back Have no rollback plan. Have no ability to fail forward, either. If failing forward fails, have no backup plan.
  • 67. Deployment fail #3: Manual pushes Release by manually logging into each machine and updating a checkout. This is even better under high load. Scaling is something best not talked about.
  • 68. Deployment fail #3: Manual pushes Release by manually logging into each machine and updating a checkout. This is even better under high load. Scaling is something best not talked about.
  • 69. Deployment fail #3: Manual pushes Release by manually logging into each machine and updating a checkout. This is even better under high load. Scaling is something best not talked about.
  • 70. Deployment fail #4: ALTER TABLE ADD FAIL Require manual changes to databases. Have no record of the changes in version control. Don’t estimate how long these changes will take, how much they will degrade performance, or how much downtime will be required before kicking them off.
  • 71. Deployment fail #4: ALTER TABLE ADD FAIL Require manual changes to databases. Have no record of the changes in version control. Don’t estimate how long these changes will take, how much they will degrade performance, or how much downtime will be required before kicking them off.
  • 72. Deployment fail #5: Config of Doom Require manual changes to configuration, especially if there are more than two machines involved - 30 or more is especially good.
  • 73. Deployment fail #6: Single Person Of Fail Have only one person that knows how to deploy your stuff. (Don’t document any of it.)
  • 74. Deployment fail #6: Single Person Of Fail Have only one person that knows how to deploy your stuff. (Don’t document any of it.)
  • 75. Deployment fail #7: It Worked In Staging This is the upscale version of “it worked on my laptop”
  • 76. Deployment fail #7: It Worked In Staging This is the upscale version of “it worked on my laptop” Staging not the same as prod:
  • 77. Deployment fail #7: It Worked In Staging This is the upscale version of “it worked on my laptop” Staging not the same as prod: Different OS, packages, hardware
  • 78. Deployment fail #7: It Worked In Staging This is the upscale version of “it worked on my laptop” Staging not the same as prod: Different OS, packages, hardware Only one of something where there are >1 in prod (classic examples: db replication, memcache)
  • 79. Things I’ve learned Have a release management system Build the capability to deploy continuously Automate everything: build, test, deploy Use configuration management Document everything Load test when needed Eliminate SPOFs, including people Staging should reflect production
  • 81. Ops fail #1: Devops Fail Ops and Dev don’t speak to, work with, or trust each other Most communication takes place through aggressive ticket filing and email
  • 82. Ops fail #1: Devops Fail Ops and Dev don’t speak to, work with, or trust each other Most communication takes place through aggressive ticket filing and email
  • 83. Ops fail #2: Dilettante devs Stuff breaks at 2am Devs have not documented anything and aren’t available to fix it Ops gets the blame for extended downtime
  • 84. Ops fail #2: Dilettante devs Stuff breaks at 2am Devs have not documented anything and aren’t available to fix it Ops gets the blame for extended downtime
  • 85. Ops fail #2: Dilettante devs Stuff breaks at 2am Devs have not documented anything and aren’t available to fix it Ops gets the blame for extended downtime
  • 86. Ops fail #3: Telepathic troubleshooting Stuff breaks Devs have to troubleshoot via remote: Did you look at this log, what did it say? What is this config value set to?
  • 87. Ops fail #3: Telepathic troubleshooting Stuff breaks Devs have to troubleshoot via remote: Did you look at this log, what did it say? What is this config value set to?
  • 88. Things I’ve learned Everybody who works on a system is responsible for it Cross train your teams Open up systems to devs for read access to config + logging Troubleshoot together

Hinweis der Redaktion

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n
  29. \n
  30. \n
  31. \n
  32. \n
  33. \n
  34. \n
  35. \n
  36. \n
  37. \n
  38. \n
  39. \n
  40. \n
  41. \n
  42. \n
  43. \n
  44. \n
  45. \n
  46. \n
  47. \n
  48. \n
  49. \n
  50. \n
  51. \n
  52. \n
  53. \n
  54. \n
  55. \n
  56. \n
  57. \n
  58. \n
  59. \n
  60. \n
  61. \n
  62. \n
  63. \n
  64. \n
  65. \n
  66. \n
  67. \n
  68. \n
  69. \n
  70. \n
  71. \n
  72. \n
  73. \n
  74. \n
  75. \n
  76. \n
  77. \n
  78. \n
  79. \n
  80. \n
  81. \n
  82. \n
  83. \n
  84. \n
  85. \n
  86. \n
  87. \n
  88. \n
  89. \n