Put the ‘Auto’ in Autoscaling – Make Kubernetes VPA and HPA work together for optimal resource provisioning

Niels Roetert
Solutions Architect
Johanna Luetgebrune
Business Development

The Kubernetes Efficiency Challenge
● Configuring applications to run efficiently on
Kubernetes is difficult
● Developers often have to guess at resource
settings or end up using defaults
● Platform teams are left to manage
resources without knowing the needs of the
application
● “Do more with less” is the new reality, but
teams feel they must choose between
reliability and cost

Requests and Limits?
Requests and limits are Kubernetes resource settings for containers that manage CPU and memory allocation.
● Requests: The minimum resources (CPU and memory) a container is guaranteed; the scheduler uses them to place the pod on a node.
● Limits: The maximum resources a container can consume; a container that exceeds its CPU limit is throttled, and one that exceeds its memory limit is terminated.
● Relevance: They ensure efficient resource distribution and prevent resource contention in a cluster.
● Usage: They help maintain application performance and stability, and prevent over-provisioning or under-provisioning of resources.
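As a rough illustration of how requests and limits relate, the sketch below models a container's resource settings as a plain Python dictionary and checks that no request exceeds its limit, which is a combination Kubernetes rejects. The values, the helper names (`parse_cpu`, `parse_memory`, `check_resources`), and the simplified quantity parsing are assumptions made for this example, not part of the talk.

```python
# Hypothetical container resource settings, mirroring the `resources`
# stanza of a Kubernetes container spec. The values are illustrative only.
resources = {
    "requests": {"cpu": "250m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}

# Simplified: only the binary suffixes Ki, Mi, and Gi are handled.
_MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '250m' or '2' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '256Mi' to bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def check_resources(res: dict) -> list[str]:
    """Return a list of problems; an empty list means requests <= limits."""
    problems = []
    if parse_cpu(res["requests"]["cpu"]) > parse_cpu(res["limits"]["cpu"]):
        problems.append("cpu request exceeds limit")
    if parse_memory(res["requests"]["memory"]) > parse_memory(res["limits"]["memory"]):
        problems.append("memory request exceeds limit")
    return problems
```

Calling `check_resources(resources)` on the sample settings returns an empty list, since both requests sit below their limits.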

targetUtilization
Kubernetes HPA automatically scales the number of pod replicas based on observed metrics, such as CPU utilization, to maintain optimal resource usage and application performance.
The target utilization setting in the HPA config (averageUtilization in the autoscaling/v2 API) specifies the desired resource utilization as a percentage of the pods' resource requests. The HPA scales the pod replicas up or down based on the observed resource utilization to keep it close to the target value.
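The scaling decision can be sketched with the replica formula from the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The function below is a minimal sketch of that rule only; the function name is made up for this example, and it deliberately ignores min/max replica bounds, stabilization windows, and pods that are not ready.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Sketch of the HPA replica calculation for a utilization metric.

    Utilization values are percentages of the pods' resource requests.
    """
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, three replicas running at 90% against a 60% target give a ratio of 1.5, so the sketch asks for five replicas; the same three replicas at 63% stay at three because the ratio falls inside the tolerance.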

What are Kubernetes Autoscalers?
The Vertical Pod Autoscaler (VPA), Horizontal Pod Autoscaler (HPA), and Cluster Autoscaler are components in Kubernetes that help manage and scale resources automatically based on workload demands and cluster resource utilization.
● Vertical Pod Autoscaler (VPA): Adjusts container CPU and memory requests and limits based on usage, optimizing resource allocation.
● Horizontal Pod Autoscaler (HPA): Scales pod replicas based on CPU, memory, or custom metrics, maintaining application performance and availability.
● Cluster Autoscaler: Adds or removes nodes in the cluster based on resource demands and utilization, ensuring efficient resource usage and cost-effectiveness.

In a Land of Rainbows and Unicorns
The development team or DevOps engineers are responsible for setting the initial requests and limits for containers in a Kubernetes deployment, because they know the application's performance characteristics, resource usage patterns, and architecture, which allows them to make informed decisions about the appropriate resource allocation for each container.

In a Realm of Chaos and Dragons
Determining the best settings for Pods and the HPA target utilization in Kubernetes environments for new applications can be challenging.
● Analyze application requirements.
● Set up a baseline with default Pod/HPA settings.
● Conduct load testing and adjust Pod/HPA.
● Monitor performance and fine-tune Pod/HPA.
● Review and update Pod/HPA settings regularly.

Without autoscaling, over-provisioning or risk is inevitable
Scenario 1: Conservative & wasteful
Scenario 2: Aggressive & risky

Most teams start with horizontal pod autoscaling (HPA)
● What should my HPA target utilization be set to?
● Are my pods the right size?
● Am I just multiplying my inefficiencies?
● How am I going to configure this across hundreds of services?
● Why is it called “auto” scaling anyway?!

HPA still results in waste
[Chart: CPU cores (axis from 0 to 6) across replica counts 1 to 3, with series for Actual Usage, Replicas, and Wasted Resource]
● CPU requests set lower, but still high enough to reduce risk as actual usage spikes
● Inefficient pods are replicated, multiplying the inefficiency
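The multiplication effect can be made concrete with a small calculation: the capacity reserved for a workload is the replica count times the per-pod request, and anything above actual usage is waste. The sketch below assumes a hypothetical per-pod CPU request of 2 cores and made-up usage figures; neither is taken from the chart in the talk.

```python
def wasted_cores(replicas: int, request_cores: float, usage_cores: float) -> float:
    """Requested-but-unused CPU across all replicas of one workload.

    `usage_cores` is the total actual usage summed over all replicas.
    """
    reserved = replicas * request_cores
    return max(reserved - usage_cores, 0.0)


# Hypothetical samples: (replica count, total actual usage in cores).
samples = [(1, 0.6), (2, 1.5), (3, 2.1)]
request_per_pod = 2.0  # assumed CPU request per pod, in cores

waste_by_replicas = {
    replicas: wasted_cores(replicas, request_per_pod, usage)
    for replicas, usage in samples
}
```

With these assumed numbers the unused reservation grows with every replica the HPA adds, which is the pattern the slide describes: scaling out an oversized pod scales out its waste as well.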

StormForge Optimize Live Overview
● Reduces costs and improves reliability by right-sizing Kubernetes application resources.
● Machine learning analyzes CPU and memory utilization and provides recommendations to adjust resource requests up and down to meet demand as patterns change.
● Reduces toil by automatically applying recommendations, freeing up engineering resources.
● Enables bi-dimensional autoscaling, providing vertical rightsizing and efficient horizontal scaling through recommended target utilization for HPA-enabled workloads.
● Low barrier to entry, making it fast and easy to get started.

StormForge Optimize Live: How does it work?
INSTALL
DISCOVER
● Continuously ingest workloads
● Machine learning analyzes Kubernetes data
RECOMMEND
● CPU and memory
● HPA target utilization
IMPLEMENT
● Automatic or manual
● Route through CI/CD process

Automated Business Impact Across Your Environment
The UI displays immediate optimization results from Optimize Live, including metrics and cost details.

StormForge Optimize Live: Demo

The StormForge difference
Unlike cloud cost management tools that merely provide visibility, StormForge uses:
● VISIBILITY: Show current utilization and identify opportunities for improvement.
● INTELLIGENCE: Actionable recommendations to optimize resources as usage varies.
● AUTOMATION: Proactively and continuously right-size, improving efficiency and eliminating cloud waste.

www.stormforge.io
Niels Roetert Solutions Architect niels@stormforge.io
Johanna Luetgebrune Business Development johanna@stormforge.io
