Standard Bank's goal is to be the leading financial services organisation in, for, and across Africa. They’re investing heavily in their omni-channel capability delivered through self-service channels that sit on a modern software stack. Their strategy requires that they frequently deliver new functionality to customers across countries, channels, and business domains. To support this, Standard Bank has moved to modern engineering practices, including DevOps, test-driven development, and Agile. In this session, Andrew and Lenro from Standard Bank will outline their journey and discuss how APM has helped their teams, promoted a DevOps culture and driven transformation. They’ll cover:
-How we structure our development teams and what DevOps means to us
-How APM unblocks delivery and helps deliver quality software to our customers
-APM as a DevOps enabler, helping bridge the gap between operations and development
To learn more, visit: www.appdynamics.com
3. Our vision is to be the leading financial services organisation
in, for and across Africa, delivering exceptional client
experiences and superior value
Vision & Strategy
9. Build Pipelines – MVP2
FREQUENCY QUALITY RELIABILITY SECURITY
Code into trunk
Deploys into test
Days since last prod deploy
Lint
Complexity
Customer feedback
Failed deploys %
Up-time (% 500 errors)
Coverage
Health checks
TESTING
Checkmarx
SSL Scans
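A few of the frequency and reliability measures on this slide can be computed from plain deployment and response records. The following is only a minimal sketch under assumptions: the `Deploy` record, the environment labels, and all function names are hypothetical and are not taken from Standard Bank's actual pipeline.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deploy:
    """One deployment event (hypothetical record shape)."""
    day: date
    environment: str  # assumed labels: "test" or "prod"
    succeeded: bool


def failed_deploy_pct(deploys):
    """Percentage of deploys that failed; 0.0 when there are no deploys."""
    if not deploys:
        return 0.0
    failed = sum(1 for d in deploys if not d.succeeded)
    return 100.0 * failed / len(deploys)


def days_since_last_prod_deploy(deploys, today):
    """Days since the most recent successful prod deploy, or None if there was none."""
    prod_days = [d.day for d in deploys if d.environment == "prod" and d.succeeded]
    if not prod_days:
        return None
    return (today - max(prod_days)).days


def server_error_pct(status_codes):
    """Percentage of HTTP responses with a 5xx status code."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return 100.0 * errors / len(status_codes)
```

Keeping each measure as a small pure function makes it easy to feed them from whichever tool exports the raw events.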
15. The Status Quo
If you had an outage then…
1. Invited to the crisis room
2. No access to production
3. No access to the tools being used
4. Tools turned off because they impact production
5. Yet you need to be able to tell people what is wrong
6. Changes being backed out
7. No clear root cause
!!
16. The Fallout
The Challenges
• 27 February 2015: meeting with the Group CIO to explain what happened
• No RCA
• Devs treated as Second Class Citizens when it comes to prod!
• We needed to do something; doing nothing and leaving things as they were was not an option…
17. The Journey to a Solution
Discovery
• How did we monitor applications in the past?
• There is monitoring, but it is not available to the community…
• We tried to use existing tools…
• Why APM?
• Looked at alternatives
• Convince the right people
18. The Journey to a Solution…
7 Key Requirements
1. Always running in production
2. Easy to deploy and use
3. “I do not want to call the vendor!”
4. DevOps enabler
5. For everybody to use
6. Code drill-down capability
7. Auto discovery
AppDynamics - Standard Bank’s APM Solution
19. How can we remove
impediments and deliver quality
software to our customers?
20. Rest of Africa
Nigeria – Internet Banking
• Performance issues
• Login takes 30 Seconds
• Slow responses in services layer
• Developers spend 3 – 4 hours per day looking for/in logs
• It wasn't me…
• Login issue traced to core banking system
• A week later a patch was received; login time reduced to less than 3 seconds
• Developers build features instead of hunting logs and bugs
21. South Africa
USSD - 2016
• Project ran for more than 2 years
• Complex stack – so many layers…
• Performance issues everywhere
• Pinpoint where the performance issues and errors are
• Metrics to help drive decisions
• Finally in production 26 October 2016
22. You can prevent outages and reduce
the time to fix production issues…
23. MTBF – We can prevent bad customer experience
South Africa – Mobile Banking
• Alert was triggered that the error rate was higher than usual
• Investigation found one node in the cluster not working
• 25% of our customers were experiencing timeouts
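The alert described here fires when the error rate is higher than usual, and the follow-up is to find which node is responsible. One simple way to express both steps is sketched below. This is an illustration only, not how AppDynamics computes its baselines: the mean-plus-deviations threshold, the function names, and the parameters are all assumptions.

```python
from statistics import mean, pstdev


def error_rate_alert(baseline_rates, current_rate, num_deviations=3.0):
    """Return True when current_rate exceeds mean + num_deviations * stddev of the baseline."""
    if not baseline_rates:
        raise ValueError("baseline_rates must not be empty")
    threshold = mean(baseline_rates) + num_deviations * pstdev(baseline_rates)
    return current_rate > threshold


def unhealthy_nodes(calls_by_node, errors_by_node, max_error_pct):
    """Return the sorted names of nodes whose error percentage exceeds max_error_pct."""
    return sorted(
        node
        for node, calls in calls_by_node.items()
        if calls and 100.0 * errors_by_node.get(node, 0) / calls > max_error_pct
    )
```

Breaking the error rate down per node is what turns "errors are up" into "this node is not working".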
24. MTTR – We can solve problems…quicker
South Africa – Internet Banking
• April 2016 outage is reported on Internet Banking
• 5 minutes into the outage we could pinpoint the problem
• Issue in our Adaptive Risk system
• Issue isolated and service was restored
• No need for a crisis meeting (How boring…)
• Responsible team dealt with the issue
• Service was restored quickly
25. MTTR – We can solve problems… quicker
South Africa – Mobile Banking
• Alert triggered due to slow responses for transaction logging
• Not impacting customer experience yet…
• Connection pools started to fill up
• Customers experiencing slow performance
• Pinpointed that the issue was on a message queue
• We could proactively fix the issues and restart the broker
• Response times improved and service returned to normal
27. • Visibility on errors and exceptions
• Metrics on how code is performing
• I did not know the code was doing that…in production
• Pinpoint where the issues are
• Better quality code going into production (Engineering practices)
• Alerts when things go wrong
• Have visibility without having to log on to the server
• You can even monitor certificate expiry
Dev and Ops in a feature team is possible
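The certificate expiry point above can also be checked outside an APM tool with the Python standard library. The sketch below is an assumption-laden illustration, not the way the bank or AppDynamics does it: `days_until_expiry` and `fetch_not_after` are hypothetical helper names, and the host passed in is up to the caller.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now):
    """Whole days from `now` (timezone-aware UTC) until the certificate's notAfter time."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host over TLS and return the server certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `fetch_not_after` for each endpoint and raise an alert when `days_until_expiry` drops below a chosen number of days.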
30. What does it mean to Standard Bank?
DevOps
• Resilience in our teams, applications and infrastructure
• Visibility into our software and infrastructure
• You build it, you own it
• Automate everything
• Culture
• Enabler to get solutions to our customers more quickly and more frequently
DevOps = Development + Operations…
32. Automate
Automate as much as you can
• Recipes to install Controller & Agents
• But why?
• Repeatable & consistent
• Machines are better at repetitive things
• People can focus on value add work
33. The right people
AppDynamics in the right hands…
• Allow dev and ops teams to deploy the agents
• Do not centralise the deployment of the agents
• Dev and ops teams know their applications best
• Find people that care about their applications
34. Not everything comes for free…
Out of the box good enough?
• You need to manage your licences
• Put some thought into how you structure your apps
• Business transaction limits
35. Define minimum criteria
Checklist
• Name business transactions you want to monitor
• Name remote services to identify back ends
• Define health rules
• Setup alerts
• Dashboard per feature team
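A checklist like this lends itself to an automated check in the spirit of "automate everything". The sketch below assumes each feature team describes its monitored application in a simple dictionary. The key names and the `missing_criteria` function are hypothetical and do not correspond to any AppDynamics configuration format.

```python
# Hypothetical keys, one per checklist item above.
REQUIRED_CRITERIA = (
    "business_transactions",
    "remote_services",
    "health_rules",
    "alerts",
    "dashboard",
)


def missing_criteria(app_config):
    """Return the checklist items that are absent or empty in app_config, in checklist order."""
    return [key for key in REQUIRED_CRITERIA if not app_config.get(key)]
```

An empty result means the application meets the minimum criteria; anything else is a to-do list for the owning team.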
36. DevOps & AppDynamics in a Complex Banking Environment.
Questions?
“They call it Africa,
we call it home”
Editor's notes
-Head up digital channels software engineering
-Recently asked to drive software engineering across IT (not the admin side)
-Geographically distributed.
-Core Banking transformation – with many different deployment scenarios.
-Solutions need to be flexible enough to deal with variety and a state of transition
Digitisation strategy across the board
Staff SS, Staff Assist, Self Service
Deep stack not just channels
Omni channel strategy favoring Self Service
So how do you meet the challenge – One key objective was to become obsessed with Software Engineering
Background on Tablet app – clueless we were
Building 20 or the Magical Incubator
Erected during WW2, it lasted 55 years.
9 Nobel Prize Winners worked in this building
Many significant innovations came out of this building – radar WW2 and Bose
- Blue Green Deployments came out of one of our feature teams
- Allows for safe intraday deployments (Beta programs and roll back)
- Adopted Agile and later moved into DevOps
- 31k per FP to 8.5k per FP and significantly faster. (Governance aside)
- Multi Geography
Multi Domain
Federate ownership of the Tile to the respective Feature Team
We also have Companion Apps if justified (Kids banking, OST) where some services are shared
Leverage new features:
Cross Border Payments – easy on SS not SA
Home loans calculators
Beneficiary Management
Ties into the concept of feature teams who know the domain
Structuring Feature Teams is hard – historically around system not domain (makes ownership of technical debt and migration off monolith harder)
Make use of Elastic Search for many things:
Example where we pulled our Stash repo into Elastic Search
Easily see which teams are the most active
Automate anything that gave us grief
Applying Engineering practices to all build pipelines
Things we want to measure for now (Remedy, App Stores)
Metrics will be gathered from our tools (Atlassian API, AppDynamics)
The concept of the build pipeline is that a commit kicks off an automated process
No fingers in the pie
All governance baked into the pipeline
Test data created automatically
Web Services off Mainframe – challenge the status quo