***Project Summary***
A well-established SaaS company in North America recently migrated 50,000 virtual servers and five (5) petabytes of data, with a MySQL database backend, from on-premises data center infrastructure to Google Cloud Platform (GCP) using a 'lift and shift' cloud migration methodology.
The company now wants to expand its SaaS offering and customer base outside of North America and, at the same time, optimize its cloud platform for High Availability, Scalability, and Resilience.
SaaS Company in North America
1. 1
Optimize Cloud Platform
Design a Globally Available Cloud Architecture
(High Availability, Scalability, and Resilience)
October 2018
2. 2
About Me
Gabe Akisanmi
• Digital Transformation Consultant
• 14+ Yrs. In Technology Industry, @CenturyLink
(Ex. Hosting.com, KPMG, Sungard, NetApp)
• Skilled in Project Management, Strategic Planning & Cloud
• Outside Interest: Soccer, Storytelling, YouTube
3. 3
Agenda
• Executive Overview
• Business & Technical Requirements
o Architecture for High Availability, Scalability and Resilience
o Incident Management
o Training Plan
4. 4
Executive Overview
• Completed an 18-month cloud migration (lift & shift)
• No Autoscaling
• Monolithic SaaS Architecture
• Expand SaaS offering and customer base outside of North America
• 5 PB Storage
• ~50K Virtual Servers
• 1 GCP Zone
5. 5
Business & Technical Requirements
• Design a Highly Available, Scalable and Resilient cloud infrastructure
• Design a Globally Available Cloud Architecture
• Plan for Incident Management
• Propose a Training Plan for the infrastructure and development teams
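To make the high-availability requirement more concrete, here is a minimal sketch of how redundancy changes composite availability. It assumes zone failures are independent (an idealization) and uses an illustrative per-zone availability of 99.9%, which is neither a figure from this project nor a GCP SLA. The function names are hypothetical.

```python
def parallel_availability(per_unit: float, units: int) -> float:
    """Availability of `units` redundant copies where any one copy suffices.

    Assumes failures are independent, which real zones only approximate.
    """
    if not 0.0 <= per_unit <= 1.0:
        raise ValueError("per_unit must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The system is down only when every copy is down at the same time.
    return 1.0 - (1.0 - per_unit) ** units


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours over a 365-day year."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    ZONE_AVAILABILITY = 0.999  # illustrative assumption, not a measured value
    for zones in (1, 2, 4):
        a = parallel_availability(ZONE_AVAILABILITY, zones)
        print(f"{zones} zone(s): {a:.9f} -> {downtime_hours_per_year(a):.4f} h/year")
```

Under these assumptions, moving from one zone to two independent zones takes availability from 99.9% to roughly 99.9999%, which is the basic argument for leaving a single-zone deployment.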
6. 6
High Availability, Scalability and Resilience
• US Customers and UK Customers
o Cloud DNS
o Global Load Balancing (Anycast Global IP)
• Zone 1: us-west1-a
o Application Server: Compute Engine Instance Group with Autoscaling
o Database Server: Cloud SQL for MySQL (Master)
• Zone 2: us-west2-b
o Application Server: Compute Engine Instance Group with Autoscaling
o Database Server: Cloud SQL for MySQL (Failover Replica)
• Zone 1: europe-west1-a
o Application Server: Compute Engine Instance Group with Autoscaling
o Database Server: Cloud SQL for MySQL (Master)
• Zone 2: europe-west1-b
o Application Server: Compute Engine Instance Group with Autoscaling
o Database Server: Cloud SQL for MySQL (Failover Replica)
• Replication between Master and Failover Replica database servers
• Cloud Network: inter-region communication over the GCP internal network backbone
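The routing and failover behavior implied by this architecture can be sketched in a few lines of Python. This is only a model of the decision, not how Google's global load balancer is implemented: the real service relies on health checks and anycast routing. The zone names are copied from the slide, while the region keys, the `PREFERENCE` table, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Zone:
    name: str
    db_role: str          # "master" or "failover-replica", as on the slide
    healthy: bool = True


@dataclass
class Region:
    name: str
    zones: List[Zone] = field(default_factory=list)

    def serving_zone(self) -> Optional[Zone]:
        """Return a healthy zone, preferring the one that hosts the master."""
        ordered = sorted(self.zones, key=lambda z: z.db_role != "master")
        return next((z for z in ordered if z.healthy), None)


# Zone names copied from the slide; the region keys are hypothetical labels.
REGIONS: Dict[str, Region] = {
    "us": Region("us", [Zone("us-west1-a", "master"),
                        Zone("us-west2-b", "failover-replica")]),
    "europe": Region("europe", [Zone("europe-west1-a", "master"),
                                Zone("europe-west1-b", "failover-replica")]),
}

# Hypothetical preference order: nearest region first, then the other one.
PREFERENCE: Dict[str, List[str]] = {
    "US": ["us", "europe"],
    "UK": ["europe", "us"],
}


def route(customer: str) -> str:
    """Pick the zone that would serve a customer group, with failover."""
    for region_key in PREFERENCE[customer]:
        zone = REGIONS[region_key].serving_zone()
        if zone is not None:
            return zone.name
    raise RuntimeError("no healthy zone available")


if __name__ == "__main__":
    print(route("US"))                      # nearest region, master zone
    REGIONS["us"].zones[0].healthy = False  # simulate a zone outage
    print(route("US"))                      # falls back to the second zone
```

The sketch captures two levels of resilience from the diagram: failover between zones for the same customer group, and failover to the other region when every zone for that group is unhealthy.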
7. 7
Incident Management
• Phase 1: Identify and declare an incident
• Phase 2: Incident Response & Team
• Phase 3: Incident Resolution
• Phase 4: Closure & Root Cause Analysis
Google Cloud Status Dashboard
https://status.cloud.google.com/summary
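The four incident phases can be modeled as a simple linear state machine, which is one way a team might track incidents in tooling. This is a minimal sketch under that assumption. The `Phase` and `Incident` names and the log format are hypothetical and not part of the proposed plan.

```python
from enum import Enum
from typing import List


class Phase(Enum):
    IDENTIFY_AND_DECLARE = 1
    RESPONSE = 2
    RESOLUTION = 3
    CLOSURE_AND_RCA = 4


class Incident:
    """Tracks one incident as it moves through the four phases in order."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.phase = Phase.IDENTIFY_AND_DECLARE
        self.log: List[str] = [f"declared: {title}"]

    def advance(self, note: str) -> Phase:
        """Move to the next phase and record a note; phases cannot be skipped."""
        if self.phase is Phase.CLOSURE_AND_RCA:
            raise ValueError("incident is already in its final phase")
        self.phase = Phase(self.phase.value + 1)
        self.log.append(f"{self.phase.name}: {note}")
        return self.phase


if __name__ == "__main__":
    incident = Incident("Elevated error rate on application servers")
    incident.advance("response team assembled")
    incident.advance("traffic shifted to the healthy zone")
    incident.advance("root cause analysis scheduled")
    print("\n".join(incident.log))
```

Forcing phases to advance one at a time keeps a record of every step, which helps when writing the root cause analysis at closure.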
8. 8
Training
• Self-Paced Training
https://www.coursera.org/googlecloud
• QwikLABS
https://qwiklabs.com
• Multiple Tracks
• GCP Community
https://cloud.google.com/community/
Learning Tracks
• Cloud Infrastructure
Designed for IT professionals responsible for implementing, deploying, migrating and maintaining applications in the cloud.
https://cloud.google.com/training/cloud-infrastructure/
• Application Development
Designed for application programmers and software engineers who develop software programs in the cloud.
https://cloud.google.com/training/application-development/
• Data and Machine Learning
Designed for data professionals who are responsible for designing, building, analyzing, and optimizing big data solutions.
https://cloud.google.com/training/data-ml/
• Your Gateway to GCP
For IT professionals seeking an introduction to Google Cloud Platform.
https://cloud.google.com/training/onramps
• G Suite Administration
Designed for administrators who are responsible for deploying, administering and extending the Google G Suite applications.
https://cloud.google.com/training/admin