When migrating to a cloud and microservices architecture, companies need to invest in foundational capabilities such as a microservices platform, continuous delivery, and immutable infrastructure. In this talk, we will discuss our experience implementing these capabilities at enterprise scale with Google Cloud, Kubernetes, Istio, Envoy, Spinnaker, and the HashiCorp stack. We will also discuss best practices for onboarding to the cloud that facilitate DevOps and SRE without sacrificing quality or control.
2. Maxim Shishkarev
Sr. Solutions Architect @ Grid Dynamics
Cloud Enablement, DevOps and CI/CD automation
15+ years of experience in these areas and still enjoying it ;)
Family, Travel, Photography, Surfing
3. Introducing Grid Dynamics technology services
Digital transformation Big data, real time analytics, ML & AI
Microservices replatforming DevOps & cloud enablement
Open Source Cloud-ready Scalable Automated
6. Datacenter
Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Release
team
7. Datacenter
Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Can I have a VM please?
Release
team
8. Datacenter
Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Can I have a VM please?
Release
team
Sure. Tomorrow.
9. Datacenter
Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Can I have a VM please? Sure. Tomorrow. Probably
Release
team
10. Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Can I have a VM please? Sure. Tomorrow. Probably
Release
team
us-east
Enterprise
Data Centers
us-west
us-central
11. Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Can I have a VM please?
Cloud
Sure. Tomorrow. Probably
Release
team
12. Self-service portal
(as seen by a developer)
Developer
(came to ask for a VM)
Cloud VMs
(carefully managed by infrastructure)
14. Web UI Search Checkout
Infra
team
Self-service portal
Network
team
OS
team
Security
team
Dev
team
QA
team
Can I have a VM please? Sure. Tomorrow. Probably
Cloud
Release
team
15. Web UI Search Checkout
Infra
team
Compute
Network
team
OS
team
Security
team
Dev
team
QA
team
Cloud
Storage Network Other
API API API API
Release
team
16. Web UI Search Checkout
Infra
team
Compute
Network
team
OS
team
Security
team
Dev
team
QA
team
Cloud
Storage Network Other
API API API API
Policy (cost, access, security, other)
Release
team
17. Application teams access
No access
• Cloud projects
• Access policies
• Core networking
• IAM policies
Debatable
• Subnets
• Firewalls
• OS
• Base Images
Has access
• VMs based on pre-approved images
• Storage buckets
• Load balancers
• Firewalls within pre-approved limits
• Other pre-approved cloud services
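The three access tiers above could be encoded as a simple policy lookup. The following is a minimal sketch, not something shown in the talk: the resource-type names (`cloud_project`, `vm`, and so on), the `Access` enum, and the `check_access` helper are all hypothetical, and sending non-pre-approved requests to review is an assumption.

```python
from enum import Enum


class Access(Enum):
    DENIED = "no access"
    REVIEW = "debatable"       # assumption: decided case by case
    ALLOWED = "has access"


# Hypothetical resource names mirroring the bullets on the slide.
POLICY = {
    "cloud_project": Access.DENIED,
    "access_policy": Access.DENIED,
    "core_networking": Access.DENIED,
    "iam_policy": Access.DENIED,
    "subnet": Access.REVIEW,
    "os": Access.REVIEW,
    "base_image": Access.REVIEW,
    "vm": Access.ALLOWED,
    "storage_bucket": Access.ALLOWED,
    "load_balancer": Access.ALLOWED,
    "firewall": Access.ALLOWED,
    "cloud_service": Access.ALLOWED,
}

# Resources that are only allowed when the request uses a pre-approved
# image, limit, or service (as the slide qualifies them).
NEEDS_PREAPPROVAL = {"vm", "firewall", "cloud_service"}


def check_access(resource_type: str, pre_approved: bool = True) -> Access:
    """Return the access tier for an application team's request.

    Unknown resource types are denied by default.
    """
    tier = POLICY.get(resource_type, Access.DENIED)
    if (tier is Access.ALLOWED
            and resource_type in NEEDS_PREAPPROVAL
            and not pre_approved):
        # Assumption: anything outside pre-approved limits goes to review.
        return Access.REVIEW
    return tier
```

For example, `check_access("firewall", pre_approved=False)` lands in the "Debatable" tier, which matches firewalls appearing in both the "Debatable" and "Has access" columns.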
19. .WAR
Web UI Search Checkout
Infra
team
Compute
Network
team
OS
team
Security
team
Dev
team
QA
team
Cloud
Storage Network Other
API API API API
Policy (cost, access, security, other)
Release
team
Monolithic App
25. Packaging Package
repo
Deployment
Logging & monitoring
Provisioning
Load balancing
Lifecycle management
(scaling, failover, etc.)
Service mesh
Service registry & discovery,
secret management
Business configuration
management
Microservices platform
26. Microservices platform reference technology stack
Feature Container-based VM-based
Packaging
Artifact repository
Deployment and provisioning
Load balancing and routing
Service mesh
Service registry and discovery
Secret management
Feature flags management
Resource management
Auto-scaling, self-healing
Logging and monitoring
Registry
27. Web UI Search Checkout
Infra
team
Compute
Network
team
OS
team
Security
team
Dev
team
QA
team
RE
team
Cloud
Storage Network Other
API API API API
Microservices platform
API
Platform
team
Policy (cost, access, security, other)
29. Web UI Search Checkout
Infra
team
Compute
Network
team
OS
team
Security
team
Dev
team
QA
team
RE
team
Cloud
Storage Network Other
API API API API
Microservices platform
API
Platform
team
applications deploy themselves?
Policy (cost, access, security, other)
30. Application deployment package
Environment
Deployable unit
Build-time dependencies
Configuration
Deployment
script
Application artifact
Platform
& infra
teams
Development
engineers
QA
engineers
Deployment
engineers
Application can deploy itself
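One way to picture a package that "can deploy itself" is a single unit that carries its artifact, configuration, build-time dependencies, and deployment script together. The sketch below is only an illustration under that assumption: the `DeploymentPackage` class, its field names, and `example_script` are hypothetical and do not come from the talk or from any specific tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class DeploymentPackage:
    """Hypothetical model of a self-deploying application package."""
    artifact: str                          # the built application artifact
    configuration: Dict[str, str]          # application configuration
    build_time_dependencies: List[str] = field(default_factory=list)
    # The deployment script travels with the package. It receives the
    # package and a target environment name and returns a status line.
    deploy_script: Optional[Callable[..., str]] = None

    def deploy(self, environment: str) -> str:
        """Run the bundled deployment script against an environment."""
        if self.deploy_script is None:
            raise ValueError("package has no deployment script")
        return self.deploy_script(self, environment)


def example_script(pkg: DeploymentPackage, environment: str) -> str:
    # A real script would call the platform API here; this one only reports.
    return (f"deployed {pkg.artifact} to {environment} "
            f"with {len(pkg.configuration)} config keys")
```

Because the script is part of the package, the same `deploy("qa")` and `deploy("production")` calls can be used by development, QA, and deployment engineers alike; only the environment name changes.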
53. Platform & Tooling
Infrastructure
App1 v1.1
Configuration
Data
App2 v2.1
Configuration
Data
App3 v3.1
Configuration
Data
Interfaces
Interfaces
Interfaces
Interfaces
Code is broken
Wrong endpoint
Corrupted Data
Incompatible with App2 v2.1
Incorrect GC Config
Tested v3 only
Manually tweaked OS
Exposes /v2.1/ instead of /v2/
Edge
Forgot rules for App3
Still warming-up
Interfaces
Built on a laptop
Create a ticket to get an environment
Sent package via email
Sent config via chat
Forgot to restart another service after deployment
Get configs from a spreadsheet
Destroyed wrong env
Messed with Firewalls
VPN is down
Suddenly out of quota or capacity
What could possibly go wrong? Everything…
55. All changes to production should be authorized
1. Development lead should sign off
2. Functional QA lead should sign off
3. Performance QA lead should sign off
4. Security lead should sign off
5. Operations lead should sign off
6. Artifact deployed to production should be the same as tested in QA environment
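The rules above can be read as an automated gate: every required sign-off must be present, and the artifact going to production must be byte-for-byte identical to the one tested in QA. The following is a minimal sketch under those assumptions. The role names in `REQUIRED_SIGN_OFFS` and the `may_deploy` helper are hypothetical, and comparing SHA-256 digests is one possible way to check artifact identity, not necessarily the one used in the talk.

```python
import hashlib

# Hypothetical role names for the five sign-offs listed on the slide.
REQUIRED_SIGN_OFFS = frozenset({
    "development",
    "functional_qa",
    "performance_qa",
    "security",
    "operations",
})


def sha256_digest(artifact_bytes: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's contents."""
    return hashlib.sha256(artifact_bytes).hexdigest()


def may_deploy(sign_offs, tested_digest: str, candidate: bytes) -> bool:
    """Allow deployment only if all sign-offs are present and the
    candidate artifact matches the digest recorded in QA."""
    missing = REQUIRED_SIGN_OFFS - set(sign_offs)
    same_artifact = sha256_digest(candidate) == tested_digest
    return not missing and same_artifact
```

Recording the digest when the artifact enters the QA environment and re-checking it just before the production deployment is what makes rule 6 verifiable by a pipeline rather than by a person.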
58. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Code
review
All changes to production
should be authorized
59. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Code
review
Build,
code analysis,
unit testing
All changes to production
should be authorized
60. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Code
review
Build,
code analysis,
unit testing
Service
testing
All changes to production
should be authorized
Small QA
environment
61. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Search
v1.2
Code
review
Build,
code analysis,
unit testing
Service
testing
Deploy
All changes to production
should be authorized
Small QA
environment
62. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Search
v1.2
Code
review
Build,
code analysis,
unit testing
Service
testing
Deploy
All changes to production
should be authorized
Integration testing
Small QA
environment
63. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Search
v1.2
Code
review
Build,
code analysis,
unit testing
Service
testing
Deploy
All changes to production
should be authorized
Integration testing
UAT
Small QA
environment
64. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Search
v1.2
Code
review
Build,
code analysis,
unit testing
Service
testing
Deploy
All changes to production
should be authorized
Integration testing
UAT
Canary release (1% traffic)
Small QA
environment
65. Production deployment sign-offs
Dev lead
QA lead
Perf QA lead
Business (product manager)
Ops lead
Security lead
Artifact wasn’t tampered with
Source
code
Production
Web UI
Search
v1.1
Search
v1.2
Code
review
Build,
code analysis,
unit testing
Service
testing
Deploy
All changes to production
should be authorized
Integration testing
UAT
Canary release (1% traffic)
Full release
Small QA
environment
1 hour
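The "Canary release (1% traffic)" step can be illustrated with a deterministic traffic split between Search v1.1 and v1.2. This is only a sketch of the idea: in a stack like the one in this talk, the split would more likely be configured in the service mesh or load balancer than written in application code. The `route` function and the request-id format are hypothetical.

```python
import hashlib


def route(request_id: str, canary_percent: float = 1.0) -> str:
    """Pick a Search version for a request.

    The request id is hashed into one of 10,000 buckets, so the same id
    always gets the same version and roughly `canary_percent` percent of
    ids are sent to the canary (v1.2).
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000   # 0..9999
    return "v1.2" if bucket < canary_percent * 100 else "v1.1"
```

Moving from the canary to the full release then amounts to raising `canary_percent` from 1 to 100, and rolling back amounts to setting it to 0.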
69. Capabilities for enterprise cloud, DevOps, and SRE
Organization Technology Process
DevOps culture and skills
Site reliability engineering
Service-oriented organization
Infrastructure as a service
Cross-functional teams
Microservices architecture
Continuous delivery platform
Chaos engineering
Immutable infrastructure
AI/ML for operations
Microservices platform
Policy-driven CI/CD
Testing in production
Single environment
Ultra-light change management
Change-driven design
Covered
Not covered