Talk hosted at https://www.jonthebeach.com/schedule/1683849600#85
How do you efficiently build and manage hundreds of Kubernetes clusters that serve modern online analytics databases for different customers? To add to the challenge, what if customers need to run those clusters inside their own private clouds? We are sharing the system design that solves this.
How do you provide fully managed online analytics databases like Apache Pinot to hundreds of customers, while those Pinot clusters run inside each customer’s own virtual private cloud? The answer: combine the power of Kubernetes with an automated, scalable architecture that can fully manage a fleet of Kubernetes clusters.
When companies consider using SaaS (Software as a Service) products, they are often held back by challenges like security considerations and storage compliance regulations. Those concerns often require that the data stay within a virtual cloud owned by the company, which makes managed solutions very hard for companies to adopt.
At StarTree we have built a modern data infrastructure based on Kubernetes so companies can keep their data inside their own infrastructure and, at the same time, get the benefits of a fully managed Apache Pinot cluster deployed in their own cloud environment.
We have designed a scalable, Kubernetes-based system that enables remote creation, maintenance, and monitoring of hundreds of Kubernetes clusters across different companies. It let us grow from a handful of deployments to more than 100 Pinot clusters in a short time span with a team of just over ten engineers.
4. Our Delegated Management Solution
● K8S Owned by Customer
● Data Stays inside Customer’s Virtual Private Cloud
● Fully Managed by Us
5. Design Context throughout the Talk
The 3 Major Constraints
● Cloud Boundaries
● Optimized for Apache Pinot
● Scale to hundreds or more
We will focus on how these 3 constraints make our system special
6. How do we design such a system?
(My job is safe from ChatGPT ... for now)
7. The journey: designing such a system
• We are going to start small, automate, and dive deeper
• Always think about our context: the customer’s cloud, our backend
8. Step 1: Creating the Clusters
• Each customer will be able to create and see their own clusters
• Self-serve provisioning via UI
• Multi-cloud support (AWS, GCP, Azure)
9. Step 1: Provisioning
The Manual Way
● Log into the AWS console with credentials provided by the customer
● Create the account, networking, and Kubernetes cluster
Automate this!
● ❌ Bash-script the aws eks creation
● ✅ Write your own microservice
- Use the AWS client library
- Terraform
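The Terraform path can be sketched as a minimal infrastructure-as-code definition that the provisioning microservice drives per customer. This is an illustrative example, not our actual configuration; the resource name and variables are placeholders:

```hcl
# Minimal EKS cluster definition; role and subnets are provisioned elsewhere.
resource "aws_eks_cluster" "customer" {
  name     = var.cluster_name
  role_arn = var.cluster_role_arn   # IAM role the control plane assumes

  vpc_config {
    subnet_ids = var.subnet_ids     # subnets created in the networking step
  }
}
```

The microservice renders variables like these per customer and applies the plan, instead of a hand-run bash script.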
12. Step 2: Installing Applications
Goal: The customer needs to access their cluster with Pinot running
13. Step 2: Installing Applications
The Manual Way
● ❌ kubectl apply -f all-apps.yaml
Automate this!
● ✅ helm upgrade --install startree-platform …
● Build our own Helm charts
● Run our own private Helm repo (or pay for AWS ECR)
● Deploy all applications via Helm charts
● Call Helm libraries from our code
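Building our own charts lets the whole platform install as one umbrella chart. As a hypothetical sketch (chart names, versions, and the repo URL are illustrative), its Chart.yaml declares the pluggable applications as dependencies:

```yaml
apiVersion: v2
name: startree-platform
version: 1.2.3
dependencies:
  - name: pinot
    version: 1.2.3
    repository: https://charts.example.com   # our private Helm repo
  - name: traefik
    version: 23.1.0
    repository: https://charts.example.com
```

One "helm upgrade --install" of the umbrella chart then brings the cluster to the desired application set.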
14. K8S Cluster Runs as a Platform, Applications are Pluggable
Charts and Docker images owned by separate teams 😍
15. Step 3: Networking
A huge topic worth a dedicated session
Public facing vs. “Internal” facing (VPC Peering)
Kubernetes Has Good Network Modeling and a Rich Ecosystem
● Ingress - We chose Traefik; easy for teams to define ingress rules
● LoadBalancer provided by each cloud provider
● Extra VPC Peering on demand
● Multi-Zone High Availability
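For example, a team can expose a service with a standard Kubernetes Ingress handled by Traefik. Hostnames, service names, and the port below are illustrative placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pinot-broker
spec:
  ingressClassName: traefik          # routed by the Traefik ingress controller
  rules:
    - host: broker.customer1.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pinot-broker   # ClusterIP service in front of the brokers
                port:
                  number: 8099
```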
16. Step 4: TLS and Certificates - Problem
Secure connections are required nearly everywhere
● Even within a VPC/firewall, customers request it
● Manual certificate generation will not scale
Certificates have expiration dates
● Automated renewal is needed
● First-Time Creation == Future Renewal
17. Step 4: TLS and Certificates - Knowledge
Facts about Certificates
- A certificate proves that you properly own a DNS name
- To generate a certificate, we must complete a DNS-based challenge to prove ownership
- Trust is established by a chain of trust
- Certificates are issued by well-known, pre-installed 3rd-party issuers like ZeroSSL
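A common way to automate this inside Kubernetes (an assumption here, not necessarily our exact setup) is cert-manager with an ACME DNS-01 solver, so first-time creation and future renewal share one code path. The email, secret names, and Route 53 region below are placeholders:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: zerossl-dns01
spec:
  acme:
    server: https://acme.zerossl.com/v2/DV90
    email: ops@example.com
    externalAccountBinding:           # ZeroSSL requires EAB credentials
      keyID: "<eab-key-id>"
      keySecretRef:
        name: zerossl-eab
        key: secret
    privateKeySecretRef:
      name: zerossl-account-key
    solvers:
      - dns01:
          route53:
            region: us-east-1         # proves DNS ownership via a TXT record
```

Certificates requested against this issuer are then renewed automatically before expiry.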
18. Step 4: TLS and Certificates: Centralized
Option 1: Centralized solution
✅ Better Security
❌ Harder to Scale
19. Step 4: TLS and Certificates: Distributed
Option 2: Decentralized certificate renewal
❌ Less Secure
✅ Easier to Scale Up
20. Special Part for Delegated Management Solution
Step 5,6,7…
The Usual DevOps stuff
● OIDC for AuthZ/AuthN
● Prometheus + AlertManager for Observability
● Logging, Debugging
● Backup and Disaster Recovery
● Metrics push to centralized monitoring and/or customer’s metrics storage
● Backup to customer’s deep store
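Pushing metrics to both a centralized backend and the customer’s own storage can be done with Prometheus remote_write; the endpoints below are placeholders for illustration:

```yaml
remote_write:
  - url: https://metrics.central.example.com/api/v1/write    # our centralized monitoring
  - url: https://metrics.customer.example.com/api/v1/write   # customer's metrics storage
```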
21. Checkpoint 1: Kubernetes Fleet Management
Architecture So Far: a mini version of a multi-cloud Kubernetes fleet management system, like KubeSphere
23. Configuration/Customization
Templated Environment Creation
● Some customers like to enable Groovy in queries, some don’t
● Customizations/configurations are applied onto templates
● Customizations are applied like the Visitor pattern from the classic Design Patterns book
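The template-plus-visitor flow can be sketched as follows. The class and config key names are hypothetical, for illustration only: a base environment template produces a default config, and each customization "visits" and mutates it.

```python
from dataclasses import dataclass, field


@dataclass
class EnvTemplate:
    """Base environment template producing a default config dict."""
    cluster_name: str
    config: dict = field(default_factory=lambda: {
        "pinot.broker.groovy.enabled": False,  # off by default
        "replicas": 3,
    })


class Customization:
    """Visitor base class: each customization visits and mutates the config."""
    def visit(self, template: EnvTemplate) -> None:
        raise NotImplementedError


class EnableGroovy(Customization):
    def visit(self, template: EnvTemplate) -> None:
        template.config["pinot.broker.groovy.enabled"] = True


class SetReplicas(Customization):
    def __init__(self, replicas: int):
        self.replicas = replicas

    def visit(self, template: EnvTemplate) -> None:
        template.config["replicas"] = self.replicas


def render(template: EnvTemplate, customizations: list[Customization]) -> dict:
    """Apply each customization visitor to the template, in order."""
    for c in customizations:
        c.visit(template)
    return template.config


# Customer A wants Groovy queries and 5 replicas; customer B keeps defaults.
config_a = render(EnvTemplate("customer-a"), [EnableGroovy(), SetReplicas(5)])
config_b = render(EnvTemplate("customer-b"), [])
```

Adding a new per-customer option then means adding one visitor class, without touching the base template.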
24. Are we there yet?
“Ops” part of DevOps!
* Image courtesy https://devopedia.org/devops
26. Version and Upgrades (Cont’d)
The Version Matrix: Lessons Learnt
● Create a good release pipeline with tests
● Discipline: avoid releasing versions with breaking changes
● Keep the Helm chart and image tag the same as the release version
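The chart-version == image-tag == release-version rule is easy to enforce with a pipeline check; a minimal sketch (the release metadata shape is made up for illustration):

```python
def validate_release(release: dict) -> list[str]:
    """Return version-mismatch errors for a release candidate.

    Enforces: Helm chart version == image tag == release version.
    """
    errors = []
    version = release["version"]
    if release["chart_version"] != version:
        errors.append(f"chart version {release['chart_version']} != release {version}")
    if release["image_tag"] != version:
        errors.append(f"image tag {release['image_tag']} != release {version}")
    return errors


good = {"version": "1.4.0", "chart_version": "1.4.0", "image_tag": "1.4.0"}
bad = {"version": "1.4.0", "chart_version": "1.4.0", "image_tag": "1.3.9"}
```

Failing the pipeline on a non-empty error list keeps the version matrix trivial: one version per release, everywhere.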
27. Efficiency and Reliability
Efficiency and Reliability are key to scaling up
● Discipline in DevOps is important
● No architecture is bulletproof
● Fewer outages == better efficiency
● DevOps exists for end-to-end ownership
28. Efficiency and Reliability - Cont’d
Best Practices
● Build good infra integration/regression tests
● Trunk-based release pipelines
○ Always release from master
○ Say no to release branches
● Do not customize via ad-hoc kubectl commands
29. Operations and OnCalls
There is no silver bullet for OnCall
• Discipline and Process
- Root-cause every outage
- Follow up on every outage
• Effective Alerts
- Differentiate alerts from signals
- Review and keep improving
- Build metrics to measure effectiveness
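"Build metrics to measure effectiveness" can be as simple as tracking, per alert rule, how often a page was actionable; a toy sketch (rule names and field names are made up):

```python
def alert_precision(pages: list[dict]) -> dict:
    """Per-alert-rule precision: fraction of pages that were actionable.

    Low-precision rules are candidates for tuning, or for demotion
    from paging alerts to non-paging signals.
    """
    totals: dict = {}
    for page in pages:
        rule = page["rule"]
        fired, actionable = totals.get(rule, (0, 0))
        totals[rule] = (fired + 1, actionable + (1 if page["actionable"] else 0))
    return {rule: actionable / fired
            for rule, (fired, actionable) in totals.items()}


pages = [
    {"rule": "PinotBrokerDown", "actionable": True},
    {"rule": "HighLatency", "actionable": False},
    {"rule": "HighLatency", "actionable": False},
    {"rule": "HighLatency", "actionable": True},
]
precision = alert_precision(pages)
```

Reviewing this table in outage follow-ups shows which alerts page people for noise.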
30. Lessons Learnt
Security design in provisioned clusters is hard
• Centralized control: less scalability
• Decentralized control: harder to protect credentials
• Build good debugging support for TLS certificates
Do not run complicated Terraform configurations
• Bugs when state gets complicated; unwanted resource recreation
• Terraform’s internal state is hard to keep track of
31. Lessons Learnt (cont’d)
A certificate issuer like ZeroSSL may partially go down for half a day
• No new customer can onboard during that downtime
A 3rd-party Helm repo going down can block customer cluster upgrades
• Serve Helm charts from your own repo, e.g. JFrog Artifactory