With more than 3.2 million customers and a vastly complex tech landscape, Virgin Money's IT team faces huge pressure to provide the ultimate digital banking experience. In this candid Q&A session, Andy Lofthouse will dive into the company's journey from alert storms and countless hours of problem hunting to rapid release cycles and precise digital experience insights, a shift that has saved the company considerable time and money.

  3. The Team: Islam Noor, IT Operations Team Leader; Andrew Lofthouse, Senior IT Analyst
  6. Today’s challenges: time poor; complex environment; reactive to issues; blind spots; alert storms. 14/02/2018
  7. Dynatrace Journey – From Synthetic to Full Stack (confidential)
  8. Why Dynatrace – first impressions
     • Smartscape: vertical and horizontal topology
     • Understand which services, hosts or processes are talking to each other
     • Understand which services, processes and hosts are providing the application, directly or indirectly
     • No configuration and easy deployment!
     • Nodes are highlighted red if they are part of a current problem
     • Quick drill-down to the desired component
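The "directly or indirectly" point on this slide is essentially a graph-reachability question. The sketch below illustrates that idea only; it is not Dynatrace's implementation or API, and every component name in `CALLS` is hypothetical.

```python
from collections import deque

# Hypothetical topology (an assumption for illustration): each key calls,
# and therefore depends on, the components in its list.
CALLS = {
    "online-banking-app": ["login-service", "accounts-service"],
    "login-service": ["auth-db"],
    "accounts-service": ["core-banking-host"],
    "auth-db": [],
    "core-banking-host": [],
}


def dependencies(component, calls):
    """Return every component reached directly or indirectly from `component`."""
    seen = set()
    queue = deque(calls.get(component, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(calls.get(node, []))
    return seen


def affected_by(problem_node, calls):
    """Return every component that depends, directly or indirectly, on `problem_node`."""
    return {c for c in calls if problem_node in dependencies(c, calls)}


if __name__ == "__main__":
    print(sorted(dependencies("online-banking-app", CALLS)))
    print(sorted(affected_by("auth-db", CALLS)))
```

With a map like this, a node "highlighted red" can be traced upwards to the applications it affects, which is the kind of drill-down the slide describes.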
  9. Quick time to value
     Problem
     • Not repeatable in Test and cannot be troubleshot with current tooling
     • After months of investigation and customers being impacted, the root cause of the issue cannot be found
     Impact
     • Issue causes severe slowdowns and timeouts for users, eventually needing a manual failover to the DR site
     • Operations team misled by current alerting on their investigation path
     Consequences
     • Poor customer experience drives poor conversion rates
     Recurring issue for months
     479 hours lost in the war room to date
     6 Virgin Money teams and one third party were involved
     Happening more frequently
     Has cost £23,950 so far
     Brand reputation impacted by bad tweets
  10. First 2 weeks - Incidents & Alerting
      Foglight Alerts - 128
      • 61% of them were false alerts
      • 39% of them were genuine issues
      • Of that 39%, half were duplicate alerts
      • Only 26 were real once duplicates and false alerts were taken out
      Dynatrace Problem Resolution - 100
      • 42% said "problem resolved"
      • Leaving 58% which were genuine
      • 100% accurate!
      Noise caused by poor alerting + poor troubleshooting + no root-cause analysis = 479 hours of investigation.
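The alert figures can be checked with a few lines of arithmetic. This is a minimal sketch using only numbers quoted on the slides; the function and variable names are illustrative, and the hourly rate is simply what the quoted cost and hours imply, not a figure stated in the deck.

```python
def triage(total_alerts, false_pct, duplicate_fraction):
    """Return (genuine, unique_genuine) alert counts, unrounded."""
    genuine = total_alerts * (100 - false_pct) / 100
    unique_genuine = genuine * (1 - duplicate_fraction)
    return genuine, unique_genuine


# Figures quoted on the slides.
FOGLIGHT_TOTAL = 128
FOGLIGHT_FALSE_PCT = 61
FOGLIGHT_REAL = 26          # "only 26 were real"
DYNATRACE_TOTAL = 100
DYNATRACE_GENUINE_PCT = 58
WAR_ROOM_HOURS = 479
WAR_ROOM_COST_GBP = 23950

genuine, unique_genuine = triage(FOGLIGHT_TOTAL, FOGLIGHT_FALSE_PCT, 0.5)

# Share of raised alerts or problems that pointed at a real, distinct issue.
foglight_signal = FOGLIGHT_REAL / FOGLIGHT_TOTAL
dynatrace_signal = DYNATRACE_GENUINE_PCT / 100

# Hourly rate implied by the quoted cost and hours (derived, not stated).
implied_hourly_rate = WAR_ROOM_COST_GBP / WAR_ROOM_HOURS

if __name__ == "__main__":
    print(f"Genuine Foglight alerts: about {genuine:.0f}")
    print(f"After removing duplicates: about {unique_genuine:.0f}")
    print(f"Foglight signal: {foglight_signal:.0%}, Dynatrace signal: {dynatrace_signal:.0%}")
    print(f"Implied war-room rate: £{implied_hourly_rate:.0f}/hour")
```

The percentages give about 50 genuine alerts and about 25 after removing duplicates, close to the 26 quoted; the small gap presumably comes from the percentages being rounded. On these figures roughly one in five Foglight alerts was actionable, against 58 in 100 Dynatrace problems, and the quoted cost works out to £50 per war-room hour.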
  11. First value we saw
      • Database CPU every night between 8 and 9pm
      • Peak login times
      • Couldn’t see this issue prior
      Response time slowdown
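A pattern like this one, database CPU climbing at the same hour every night, can be surfaced by grouping samples by hour of day and checking whether the same hour is hot on several separate days. The sketch below assumes the slide refers to a recurring CPU spike; the sample values, the 80% threshold and the three-day minimum are all hypothetical, and this is not how Dynatrace detects problems.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical samples: (timestamp, database CPU %). Not real Virgin Money data.
SAMPLES = [
    (datetime(2018, 2, 12, 14, 0), 35.0),
    (datetime(2018, 2, 12, 20, 15), 92.0),
    (datetime(2018, 2, 13, 14, 0), 38.0),
    (datetime(2018, 2, 13, 20, 30), 95.0),
    (datetime(2018, 2, 14, 9, 0), 30.0),
    (datetime(2018, 2, 14, 20, 45), 90.0),
]


def recurring_hot_hours(samples, threshold=80.0, min_days=3):
    """Return hours of day (0-23) whose mean CPU met `threshold` on at least `min_days` days."""
    by_hour = defaultdict(dict)  # hour -> {date: [cpu values]}
    for ts, cpu in samples:
        by_hour[ts.hour].setdefault(ts.date(), []).append(cpu)

    hot = []
    for hour, days in by_hour.items():
        hot_days = [d for d, values in days.items() if mean(values) >= threshold]
        if len(hot_days) >= min_days:
            hot.append(hour)
    return sorted(hot)


if __name__ == "__main__":
    for hour in recurring_hot_hours(SAMPLES):
        print(f"Recurring high CPU between {hour:02d}:00 and {hour + 1:02d}:00")
```

Requiring the hot hour to repeat across days is what separates a nightly pattern tied to peak login times from a one-off spike.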
  13. What’s next for Virgin Money