The document discusses strategies for improving the experience of engineers who are on-call to address technical issues. It suggests focusing on ensuring alerts are meaningful by only sending ones where action can be taken, and building support structures like runbooks to document procedures and enable calling for backup. It also advocates compensating engineers for extra work, conducting blameless post-mortems to learn from issues, and shifting culture away from a hero mentality to emphasize teamwork. The overall goal is to help prevent burnout among on-call staff.
19. Aaron Aldrich - @CrayZeigh
IT SOUNDS
PLAUSIBLE ENOUGH
TONIGHT, BUT WAIT
UNTIL TOMORROW.
WAIT FOR THE
COMMON SENSE OF
THE MORNING.
HG Wells, The Time Machine
38. Aaron Aldrich - @CrayZeigh
SOFTWARE
DEVELOPMENT, IT TURNS
OUT, IS A TEAM SPORT...
AND WHAT'S WORSE,
ENCOURAGING THE HERO
MENTALITY LEADS TO
CORROSIVE DYSFUNCTION
IN SOFTWARE TEAMS.
Rob Mee, Pivotal Labs
41. Aaron Aldrich - @CrayZeigh
@CageData
WHAT IS A BLAMELESS POSTMORTEM?
- Team members are accountable but not responsible
- Complete Transparency
- Deeper look at circumstances
- What happened and how to improve it (specific details)
- Real conditions of failure in complex systems
@jasonhand
http://www.slideshare.net/jhand2/its-not-your-fault-blameless-post-mortems