The document discusses autoscaling in Kubernetes. It describes three levels of scaling: Vertical Pod Autoscaler (VPA), Horizontal Pod Autoscaler (HPA), and Cluster Autoscaler (CA). The HPA scales deployments based on metrics like CPU and memory usage. The VPA can automatically adjust pod resource requests and limits. The CA automatically adjusts the Kubernetes cluster size across availability zones. An example is provided of using these tools to scale a game studio's Trainstation 2 workload based on queue size and database utilization metrics.
www.pixelfederation.com
Autoscaling in Kubernetes
TL;DR Summary
● Autoscaling - do we need it?
● Three levels of scaling - VPA, HPA, CA
● Cluster Autoscaler
● Horizontal Pod Autoscaler
● Vertical Pod Autoscaler
● Custom and External metrics for scaling
● Real life example
Cluster Autoscaler
Cluster Autoscaler automatically adjusts the size of the Kubernetes cluster across AZs:
● Watches for pods stuck in the Pending state due to insufficient resources.
● Periodically checks for underutilized nodes whose pods can be placed on other existing nodes.
● Respects PodDisruptionBudget, Affinity, Annotations, ...
How we use it:
● MinReplicas of 2 with podAntiAffinity to hostname
● Parameters:
○ scale-down-delay-after-add: 10m
○ scale-down-delay-after-delete: 10s
○ scale-down-unneeded-time: 10m
○ scale-down-utilization-threshold: 0.65
● Spot instances
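The parameters above map to cluster-autoscaler command-line flags; a minimal sketch of how they might appear in the autoscaler's Deployment (image version and cloud provider are illustrative assumptions):

```yaml
# Fragment of a cluster-autoscaler Deployment spec (values illustrative)
containers:
  - name: cluster-autoscaler
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.0  # version is an assumption
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      - --scale-down-delay-after-add=10m
      - --scale-down-delay-after-delete=10s
      - --scale-down-unneeded-time=10m
      - --scale-down-utilization-threshold=0.65
```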
Vertical Pod Autoscaler
● Can automatically adjust pod requests and limits
● Calculation based on current and historical metrics
● Modes: Auto, Recreate, Initial, Off
Cons:
● Pod restarts when requests change (Auto/Recreate modes)
● All pod start events go through the VPA admission webhook
● Could conflict with HPA (on CPU and memory)
Pros:
● Recommender
● Can solve under- or over-provisioned pods
How we use it: We don’t.
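The modes above can be sketched in a minimal VerticalPodAutoscaler manifest, assuming a hypothetical Deployment named `backend`; `updateMode: "Off"` keeps only the recommender active, which avoids the restart and HPA-conflict cons listed above:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: backend-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend        # hypothetical target
  updatePolicy:
    updateMode: "Off"    # recommendations only, no pod restarts
```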
Horizontal Pod Autoscaler
● Scales deployments (number of pods) based on metrics:
○ Container resources - CPU/memory
○ Object
○ Custom/External metrics
● Think twice before using it with stateful deployments
Take into consideration:
● Default metrics loop: 15 s
● Metric tolerance: 10%
● Downscale stabilization time window: the default value is 5 minutes (5m0s)
● Up to Kubernetes 1.17, parameters are configured at the cluster level
● Since Kubernetes 1.18, some parameters can be tweaked under HPA .spec.behavior
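Since 1.18 the defaults above can be overridden per HPA via `.spec.behavior`; a sketch assuming a hypothetical Deployment named `backend` (target utilization and window values are illustrative):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend                       # hypothetical target
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600   # override the 5m default
```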
Metrics types (autoscaling/v2beta2)
Resource:
● CPU/memory
● Container request(s) must be set
● API: metrics.k8s.io
External:
● Metrics not related to Kubernetes objects
● AWS SQS, RDS, …
● API: external.metrics.k8s.io
Custom:
● Pod/Object (in same namespace)
● Time series DB required (e.g. Prometheus)
● API: custom.metrics.k8s.io
Example with target average 65%:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
4 * (0.781 / 0.650) = 4.8, ceil = 5 (means +1 pod)
Note: With multiple metrics, the highest resulting value is chosen.
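The replica calculation above can be reproduced directly; a small sketch of the core formula, including the 10% tolerance mentioned earlier (within tolerance the HPA skips scaling):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.10) -> int:
    """HPA core formula: ceil(currentReplicas * currentMetric / targetMetric)."""
    ratio = current_metric / target_metric
    # Within tolerance of 1.0 the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

print(desired_replicas(4, 0.781, 0.650))  # slide example: ceil(4 * 1.2015) = 5
```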
Custom and External metrics
Limitation: only one adapter per metric type (Custom or External), or a single adapter serving both
% kubectl get APIService v1beta1.external.metrics.k8s.io -o yaml
kind: APIService
metadata:
  labels: …
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  service:
    name: k8s-cloudwatch-adapter
    namespace: ...
    port: …
...
Tested Adapters for HPA
● Prometheus adapter - we no longer use it
○ Doesn’t fit our needs any more - redesign of infrastructure needed
○ (https://github.com/kubernetes-sigs/prometheus-adapter)
● K8s-cloudwatch adapter - in use now
○ (https://github.com/awslabs/k8s-cloudwatch-adapter)
● Kube-metrics-adapter - evaluated
○ Collectors: Pod, Prometheus, AWS, HTTP, …
○ (https://github.com/zalando-incubator/kube-metrics-adapter)
● KEDA - evaluating now
○ (https://keda.sh)
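For comparison, the SQS-driven scaling described later can be expressed as a KEDA ScaledObject; a sketch assuming a hypothetical worker Deployment and queue (names, region, URL, and queueLength are illustrative):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ts2-workers
spec:
  scaleTargetRef:
    name: ts2-workers          # hypothetical Deployment
  minReplicaCount: 0
  maxReplicaCount: 100
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123456789/ts2-demo  # illustrative
        queueLength: "300"     # messages per replica
        awsRegion: eu-west-1
```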
Trainstation 2 - Real Life Example
Types of workloads:
● Live traffic - changes over the course of the day
● Start/End of the Event - once per month
● Start/End of the Competitions - twice per week
● Event cleanup - a few days after event ends
Goals:
● Scaling backend - based on live traffic
● Batch/Asynchronous workers scaling - based on queue size
● Limit batch processing under DB pressure
● Maximize off peak hours batch processing
● Start of the Event/Competitions - scale ahead
Trainstation 2 - Real Life Example
Cloudwatch adapter
Common approach:
1. Monitor queue
2. Calculate optimal number of workers
3. Return metrics
Advanced approach:
1. Monitor queue
2. Calculate optimal number of workers
3. Monitor RDS utilization
4. Tune number of workers so as not to overload the live workload
5. Return metrics
Trainstation 2 - Real Life Example
apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: "ts2-numberOfMessagesSent"
spec:
  name: "ts2-numberOfMessagesSent"
  resource:
    resource: "deployment"
  queries:
    - id: queue_metric
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "NumberOfMessagesSent"
          dimensions:
            - name: QueueName
              value: "ts2-demo"
        period: 30
        stat: Sum
        unit: Count
      returnData: false
    - id: db_cpuutilization
      metricStat:
        metric:
          namespace: "AWS/RDS"
          metricName: "CPUUtilization"
          dimensions:
            - name: DBClusterIdentifier
              value: "ts2-demo-cluster"
            - name: Role
              value: WRITER
        period: 300
        stat: Average
        unit: Percent
      returnData: false
    - id: workers_calculated
      expression: "IF((queue_metric / 300) > 100, 100, queue_metric / 300)"
      returnData: false
    - id: workers_desired
      expression: "IF(db_cpuutilization < 80, workers_calculated, IF(db_cpuutilization < 90, workers_calculated * 80 / 100, 0))"
      returnData: true
Desired pods: get the queue size -> calculate the desired number of workers -> cap at 100 -> if DB utilization is between 80% and 90%, reduce by 20%; above 90%, scale to 0.
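The `workers_desired` value returned by the expression is then consumed by an HPA on the worker Deployment; a sketch, where the Deployment name, replica bounds, and the exact metric name exposed by the adapter are assumptions:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ts2-workers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ts2-workers                    # hypothetical worker Deployment
  minReplicas: 1
  maxReplicas: 100
  metrics:
    - type: External
      external:
        metric:
          name: ts2-numberofmessagessent # name exposed by the adapter (assumed)
        target:
          type: AverageValue
          averageValue: "1"              # one pod per desired worker from the expression
```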