Leon Fayer discusses how monitoring is a key part of DevOps and of improving systems. He outlines which aspects of systems, applications, databases, and business processes should be monitored. Fayer then presents a case study of how one company traced a revenue issue to a problem with its email system by taking a top-down approach and correlating multiple metrics, from traffic down through load time, databases, and email. Fayer emphasizes understanding the business and correlating different types of data to monitor complex systems effectively.
2. Who am I?
• 20+ years of development and operations of large systems
• currently Vice President at OmniTI
• can be found online:
• @papa_fire
• http://fayerplay.com
• github:lfayer
13. What to monitor specifically?
• systems
• databases
• application
• integration points
• performance
• user behavior
• business processes
14. Perfect quote
“I don’t give a **** if the datacenter is on fire as long as I am still making money”
- CEO
15. Example: Twitter
serves over 20 million unique visitors a day
… legendary for downtime
• servers are up and running
• HTTP checks return 200
• tweets lost
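The gap in the example above (servers up, HTTP 200, data still lost) can be illustrated with a functional check that writes a probe message and reads it back. This is a minimal sketch, not the approach used by Twitter or described in the talk: `functional_check`, `CheckResult`, and the in-memory `LossyStore` are hypothetical names invented here, and the store merely simulates a backend that reports success while dropping writes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str = ""


def functional_check(post: Callable[[str], str],
                     fetch: Callable[[str], Optional[str]],
                     probe_text: str = "monitoring probe") -> CheckResult:
    """Write a probe message, then confirm the same text can be read back."""
    try:
        message_id = post(probe_text)
        stored = fetch(message_id)
    except Exception as exc:
        return CheckResult("write-read round trip", False, f"error: {exc}")
    if stored != probe_text:
        return CheckResult("write-read round trip", False, "message lost or altered")
    return CheckResult("write-read round trip", True)


class LossyStore:
    """Hypothetical in-memory backend that can accept writes but drop them."""

    def __init__(self, drop_writes: bool) -> None:
        self.drop_writes = drop_writes
        self._data: Dict[str, str] = {}
        self._next_id = 0

    def post(self, text: str) -> str:
        self._next_id += 1
        message_id = str(self._next_id)
        if not self.drop_writes:
            self._data[message_id] = text
        # The caller sees success either way, much like an HTTP 200.
        return message_id

    def fetch(self, message_id: str) -> Optional[str]:
        return self._data.get(message_id)


healthy = LossyStore(drop_writes=False)
broken = LossyStore(drop_writes=True)

healthy_result = functional_check(healthy.post, healthy.fetch)
broken_result = functional_check(broken.post, broken.fetch)

print(healthy_result)
print(broken_result)
```

In a real deployment, `post` and `fetch` would wrap calls to the application's own API. The point of the sketch is only that the check passes or fails on what the user cares about (the message survives), not on whether the server answered.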
16. Why monitor?
• software is never perfect
• systems are more and more complex
• proactive is better than reactive
• external dependencies are a worry
• …