Companies are rapidly migrating to the cloud to take advantage of the simplified maintenance, ease of use, and scalability that providers like AWS and GCP offer. What many neglect to consider is that a cloud migration often requires an overhaul of company culture, technical capabilities, and team structure to be successful.
This presentation details the cultural shift that the Bankrate team at Red Ventures underwent during their journey to the cloud and shares some considerations that teams should make when going through similar journeys.
3. Today’s Talk
● Why companies move to the cloud
● Implications for culture, org structure, and capabilities
● Bankrate.com’s migration to the cloud
● What we’ve learned - Mistakes, wins, and our evolution
7. Our Journey to Modernization
2015: Siloed & Heroic
Separate Engineering and Ops teams - All on-call responsibilities fall on Ops
2016: First AWS Workload
Small applications built from scratch - Architected for the cloud
2018: Organizational Iteration
Continuous learning and iteration toward a decentralized organizational model
2019: Self-Service
Engineering teams fully managing their own infrastructure
8. 2015: Not much better than 1996
Hand-Rolled Infrastructure
All changes were made manually by humans. Zero scaling.
“I need a server built.”
Manual & Heroic Releases
Deployments done weekly
Releases would literally break the site for 30 minutes.
Unbalanced Responsibilities
Ops got all of the dirty work:
● Long deployments
● Late nights
● On-call
● Frequent data center trips
9. Team Responsibilities
Engineering
● Write all new code for product development needs
● Work with business stakeholders to define deliverables and timeframes
● Determine what should be deployed and when
Technical Operations
● Deploy code from Engineering
● Provision servers for new systems
● Manage on-call duties for all production issues
● Hands-on data center work
10. A Typical Deployment
1. Open deployment ticket in TFS two days in advance
2. Engineer emails compiled DLL file to Ops Team
3. Ops manually RDPs into each production server, removes it from the load balancer, and drags/drops the DLL
○ Certain workloads required this for 35+ servers*
4. After manual verification, the server is added back to the load balancer
* If a deployment was unsuccessful, the same process was required to roll back the code, often for 35+ servers again.
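The manual steps above map naturally onto a rolling deployment loop. What follows is only a minimal sketch of how that loop could be automated, not the tooling the team actually built. The `LoadBalancer` class is an in-memory stand-in, and `install` and `verify` are hypothetical callbacks for whatever copies the build and health-checks a server.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class LoadBalancer:
    """In-memory stand-in for a real load balancer (hypothetical)."""
    in_rotation: Set[str] = field(default_factory=set)

    def remove(self, server: str) -> None:
        self.in_rotation.discard(server)

    def add(self, server: str) -> None:
        self.in_rotation.add(server)


def rolling_deploy(
    servers: List[str],
    lb: LoadBalancer,
    install: Callable[[str, str], None],
    verify: Callable[[str], bool],
    new_version: str,
    old_version: str,
) -> List[str]:
    """Deploy one server at a time; roll back every touched server on failure.

    Returns the servers running new_version (an empty list after a rollback).
    """
    updated: List[str] = []
    for server in servers:
        lb.remove(server)             # step 3: take the server out of rotation
        install(server, new_version)  # step 3: copy the new build onto it
        if not verify(server):        # step 4: verification before re-adding
            # Footnote case: repeat the same process with the old build
            # on every server touched so far, including the failed one.
            for touched in updated + [server]:
                lb.remove(touched)
                install(touched, old_version)
                lb.add(touched)
            return []
        lb.add(server)                # step 4: put the server back in rotation
        updated.append(server)
    return updated
```

Because only one server is out of rotation at a time, the rest of the pool keeps serving traffic during the release, which is the property the manual process achieved by hand.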
12. 2016: New Workloads in the Cloud
Team Formalization
Limited organizational cloud competency
Created a small team (3) to establish foundation
Porting of New Services
Rebuild of Bankrate.com, CMS, portal, etc.
Transitioned to more than AWS - Fastly, GitHub, Datadog, etc.
Cultural Evolution
Partnership between Engineering and Operations
Instrumentation as a first-class citizen
13. Key Early Decisions
● All new systems in the cloud
● Start with (and require) automation
● Start small and build expertise
● Limit service sprawl
● Use SaaS until you outgrow it
14. Team Responsibilities
Engineering
● Develop new products & features
● Partner with business to define deliverables
● Create CI pipelines w/ Cloud support
Data Center
● Manages data center environment
● Limited number of new workloads / apps
● Deploys all data center application changes
● Manages on-call for data center systems
Cloud Engineering
● Creates core cloud infrastructure
● Sets cloud standards (policies, automation, etc.)
● Deploys all application & infrastructure changes
● Manages on-call for cloud systems
15. Hybrid Team Structure
Separate Data Center & Cloud Engineering Teams
● High organizational risk
● Not everyone will like this evolution
● Rip off the band-aid - Live in this world for as short a time as possible
● Invest in training and real-world tasks
16. Org Structure - Pros & Cons
Infrastructure managed by Cloud Engineering
Pros
● Learned pain points and set standards early
● Deep expertise in a small group
● Needed focus on our foundation (VPC, Networking, Security Groups, etc.)
Cons
● Limited engineering exposure to infrastructure
● Cloud was still a bottleneck
● Monitoring and deploying applications was still centralized
● Moved many of our problems elsewhere
18. Organizational Iteration
Embedding Cloud Eng.
● SPOF
● Temporary solution
● Train teams, then go elsewhere
Joining Sprint Planning
● Improved visibility
● Prioritization was still a struggle
● Neglected platform priorities
Scrum in Cloud Eng.
● Improved planning
● Didn’t help ad-hoc work
● Still fighting for priority across teams
21. How We Shifted Responsibilities
Fast Lanes
Incentive for engineers to contribute to their application infrastructure.
Pair Programming
Pair on all new tasks to broaden knowledge.
Containerization
Put configuration in the hands of developers.
Engineering Champions
Identified a point person on each team to evangelize the change.
Shifting On-Call
Bring on-call into the hands of those who can fix the issues.
Shifting Deployments
Empower engineers to deploy their own changes.
22. The Migration
6 Months - Team of 5
Mostly Lift & Shift
Re-architecting applications would have (and has) taken years
Costly to absorb upfront
Visibility Improvements
Scaled down over time as the metrics stabilized
Costs still remained high, but risks were significantly decreased
Training Opportunity
Real-world, hands-on examples
24. Team Responsibilities
Product Engineering
● Full-stack (FE, BE, Infra, DB, Edge)
● Develop new tools / experiences alongside Product
● On-Call for their applications
● Stakeholders: Business Leaders, Marketing, Content
Platform Engineering
● Cloud Infrastructure / Foundation
● Platforms for Engineering Teams to Self-Service Infrastructure
● OS Configuration / Automation
● Stakeholders: Engineering, Security, Data