Autoscaling your Nomad jobs
stackconf online 2021
➔ Used to be a Molecular Biologist,
➔ Then became a Dev,
➔ Now an Ops.
➔ Currently CTO @ Hot Potatoes
Moving it all to the cloud
Vertical Scaling
Horizontal Scaling / Load Balancers
And then stuff got complicated…
Nomad
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    network {
      port "http" {
        to = 80
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "${PRIVATE}.dkr.ecr.us-east-1.amazonaws.com/blog:19"
        ports = ["http"]
      }
    }
  }
}
Deploy the blog
job "blog" {
  group "hugo" {
    count = 2

    service {
      name = "blog"
      tags = ["traefik.enable=true"]
      port = "http"

      check {
        type     = "tcp"
        interval = "10s"
        timeout  = "2s"
      }
    }
    # network and task blocks as shown above
  }
}
1 == None
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    constraint {
      operator = "distinct_hosts"
      value    = "true"
    }
  }
}
Force onto different hardware
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    spread {
      attribute = "${meta.rack}"

      target "his" {
        percent = 50
      }

      target "her" {
        percent = 50
      }
    }
  }
}
Suggest onto different hardware
/etc/nomad.d/config.hcl
client {
  enabled = true

  meta {
    "rack" = "his"
  }
}
Based on custom meta-data
● Introduced with Nomad 0.11
● (Currently) independent release cycle
● Gaining new functionality every release
● Built-in functionality for horizontal and vertical scaling
● Extendable with your own (community) plugins
Nomad-Autoscaler
● Makes decisions based on checks
● Checks are a combination of
  – Data queried from an APM
  – Defined STRATEGY
  – Attempt to approach TARGET value
● Multiple checks can be combined
● The answer that yields the most resources wins!
● ScaleOut and ScaleIn => ScaleOut
● ScaleOut and ScaleNone => ScaleOut
● ScaleOut(10) and ScaleOut(9) => ScaleOut(10)
Nomad-Autoscaler TLDR
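The "most resources wins" rule above can be sketched in a few lines of Python. This is only an illustrative model of the rule as stated on the slide, not the autoscaler's own code; the Action class and the pick_action function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    direction: str  # "out", "none" or "in" (hypothetical labels)
    count: int      # desired count proposed by one check


# Higher rank means the action leaves more resources running.
_RANK = {"in": 0, "none": 1, "out": 2}


def pick_action(actions):
    """Return the proposed action that results in the most resources."""
    return max(actions, key=lambda a: (_RANK[a.direction], a.count))


if __name__ == "__main__":
    # ScaleOut(10) and ScaleOut(9) => ScaleOut(10)
    print(pick_action([Action("out", 10), Action("out", 9)]))
```

Ranking by direction first and by count second reproduces all three examples on the slide.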
job "autoscaler" {
  type = "service"

  group "autoscaler" {
    task "autoscaler" {
      driver = "docker"

      config {
        image   = "hashicorp/nomad-autoscaler:0.3.3"
        command = "nomad-autoscaler"
        args = [
          "agent",
          "-config",
          "${NOMAD_TASK_DIR}/config.hcl",
          "-http-bind-address",
          "0.0.0.0",
        ]
      }
    }
  }
}
Deploy the autoscaler
/etc/nomad.d/config.hcl
nomad {
  address = "http://{{env "attr.unique.network.ip-address" }}:4646"
}

apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

strategy "target-value" {
  driver = "target-value"
}
Config for the autoscaler
Metrics
https://prometheus.io/
group "hugo" {
  count = 3

  scaling {
    enabled = true
    min     = 1
    max     = 20

    policy {
      cooldown = "20s"

      check "avg_instance_sessions" {
        source = "prometheus"
        query  = "scalar(avg(traefik_service_open_connections{service=\"blog@consulcatalog\"}))"

        strategy "target-value" {
          target = 5
        }
      }
    }
  }
}
Enable autoscaling for the blog
Dashboards
https://grafana.com/oss/grafana/
Enable autoscaling
Observe scaling down event
Observe the autoscaler
agent: querying APM: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad
agent: calculating new count: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad
agent: next count outside limits: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad from=3 to=0 min=1 max=10
agent: updated count to be within limits: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad from=3 to=1 min=1 max=10
agent: scaling target: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad target_config="map[group:demo job_id:webapp]" from=3 to=1 reason="capping count to min value of 1"
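The log above shows a proposed count of 0 being capped to the policy minimum of 1. A minimal Python sketch of that clamping step follows; clamp_count is a hypothetical helper name and the code only mirrors what the log lines report, not the autoscaler's implementation.

```python
def clamp_count(desired, minimum, maximum):
    """Keep a proposed count inside the policy's min/max limits."""
    return max(minimum, min(desired, maximum))


if __name__ == "__main__":
    # Values taken from the log: to=0 with min=1 max=10 becomes 1.
    print(clamp_count(0, 1, 10))
```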
Apply load
hey -z 1m -c 30 http://127.0.0.1:8000
Remove load
Logs
https://grafana.com/oss/loki/
group "autoscaler" {
  task "autoscaler" {
    driver = "docker"

    config {
      image   = "hashicorp/nomad-autoscaler:0.3.3"
      command = "nomad-autoscaler"

      logging {
        type = "loki"

        config {
          loki-url = "http://loki.service.consul:3100/api/prom/push"
          tag      = "loki"
        }
      }
    }
  }
}
Directly to loki
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
task "promtail" {
  driver = "docker"

  lifecycle {
    hook    = "prestart"
    sidecar = true
  }

  config {
    image = "grafana/promtail:2.2.1"
    args = [
      "-config.file",
      "${NOMAD_TASK_DIR}/promtail.yaml",
    ]
  }
}
Promtail sidecar
${NOMAD_TASK_DIR}/promtail.yaml
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          task: autoscaler
          __path__: /alloc/logs/autoscaler*
    pipeline_stages:
      - match:
          selector: '{task="autoscaler"}'
          stages:
            - json:
                expressions:
                  policy_id: '"@policy_id"'
                  source: '"@source"'
                  strategy: '"@strategy"'
                  target: '"@target"'
                  group: '"@group"'
                  job: '"@job"'
                  namespace: '"@namespace"'
Promtail sidecar
https://grafana.com/docs/loki/latest/clients/promtail/
Annotate your graphs
Correlate events with metrics
https://learn.hashicorp.com/tutorials/nomad/autoscaler-vagrant-demo?in=nomad/ecosystem
Try it yourself
Moving it all to the cloud *
apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

target "aws-asg" {
  driver = "aws-asg"
  config = {
    aws_region = "{{ $x := env "attr.platform.aws.placement.availability-zone" }}{{ $length := len $x |subtract 1 }}{{ slice $x 0 $length}}"
  }
}
Grow into your platform
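The aws_region template above reads the node's availability zone and drops its last character to obtain the region. The same string operation in Python, as a small sketch; region_from_az and the sample zone "us-east-1a" are illustrative assumptions, not part of the original config.

```python
def region_from_az(availability_zone):
    """Strip the trailing zone letter, e.g. the 'a' in a zone name."""
    length = len(availability_zone) - 1    # len $x | subtract 1
    return availability_zone[0:length]     # slice $x 0 $length


if __name__ == "__main__":
    print(region_from_az("us-east-1a"))
```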
scaling "cluster_policy" {
  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    check "cpu_allocated_percentage" {
      source = "prometheus"
      query  = "scalar(sum(nomad_client_allocated_cpu{node_class=\"hashistack\"}*100/(nomad_client_unallocated_cpu{node_class=\"hashistack\"}+nomad_client_allocated_cpu{node_class=\"hashistack\"}))/count(nomad_client_allocated_cpu{node_class=\"hashistack\"}))"

      strategy "target-value" {
        target = 70
      }
    }

    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "${client_asg_name}"
      node_class          = "hashistack"
      node_drain_deadline = "5m"
    }
  }
}
Grow into your platform
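The cpu_allocated_percentage query above computes, per node, allocated * 100 / (unallocated + allocated) and then averages that over all nodes in the class. A Python sketch of the same arithmetic follows; the function name and the sample node values are made up for illustration and do not come from a real cluster.

```python
def allocated_percentage(nodes):
    """Average per-node allocation percentage.

    nodes is a list of (allocated, unallocated) pairs, one per client node.
    """
    per_node = [alloc * 100 / (unalloc + alloc) for alloc, unalloc in nodes]
    return sum(per_node) / len(per_node)


if __name__ == "__main__":
    # Two hypothetical nodes: one 75% allocated, one 25% allocated.
    print(allocated_percentage([(3000, 1000), (1000, 3000)]))
```

With a target of 70, a cluster averaging well above that value would be asked to grow, and one well below it to shrink.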
Observe the autoscaler again
agent.worker.check_handler: querying source: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value target=aws-asg query=scalar(sum(nomad_client_allocated_memory{node_class="hashistack"}*100/(nomad_client_unallocated_memory{node_class="hashistack"}+nomad_client_allocated_memory{node_class="hashistack"}))/count(nomad_client_allocated_memory))
agent.worker.check_handler: calculating new count: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value target=aws-asg count=1 metric=95.17948717948718
agent.worker.check_handler: scaling target: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value target=aws-asg from=1 to=2 reason="scaling up because factor is 1.359707" meta=map[]
internal_plugin.aws-asg: successfully performed and verified scaling out: action=scale_out asg_name=hashistack-nomad_client desired_count=2
agent.worker.check_handler: successfully submitted scaling action to target: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value target=aws-asg desired_count=2
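The numbers in the log above fit a simple reading of the target-value strategy: the factor is the metric divided by the target, and the new count is the current count times that factor, rounded up. The Python sketch below assumes a target of 70 for the memory check (the slides only show 70 for the CPU check) and assumes the round-up step; target_value_count is a hypothetical name, and this is not the plugin's actual code.

```python
import math


def target_value_count(current, metric, target):
    """Return (new_count, factor) under the assumed target-value rule."""
    factor = metric / target
    return math.ceil(current * factor), factor


if __name__ == "__main__":
    # count=1 and metric=95.17948717948718 come from the log; target=70 is assumed.
    new_count, factor = target_value_count(1, 95.17948717948718, 70)
    print(new_count, round(factor, 6))
```

The same rule also matches the earlier job-level log, where a metric of zero takes the count from 3 to 0 before the policy minimum is applied.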
https://github.com/hashicorp/nomad-autoscaler/tree/master/demo/remote
Try it yourself
Moving it all to the cloud – QED
bram@attachmentgenie.com
@attachmentgenie
slideshare.net/attachmentgenie
Thank You
