Easy Cloud Native Transformation
with HashiCorp Nomad
Bram Vogelaar
@attachmentgenie
$ whoami
• Used to be a Molecular Biologist
• Then became a Dev
• Now an Ops
• Currently Cloud Engineer @ The Factory
• Amsterdam HUG organizer
Moving it all to the cloud
Vertical Scaling
Horizontal Scaling / Load Balancers
And then stuff got complicated…
The story starts with my personal website
Nomad
• Open Source tool for dynamic workload scheduling
• Batch, containerized, and non-containerized applications
• Has native Consul and Vault integrations
• Has token-based access control (ACLs)
• Jobs written in (H)ashiCorp (C)onfiguration (L)anguage
https://www.nomadproject.io/
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    network {
      port "http" {
        to = 80
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "${PRIVATE}.dkr.ecr.us-east-1.amazonaws.com/blog:19"
        ports = ["http"]
      }
    }
  }
}
Deploy the blog
1 == None
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2
    # network and task blocks unchanged
  }
}
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    constraint {
      operator = "distinct_hosts"
      value    = "true"
    }
    # network and task blocks unchanged
  }
}
Force onto different hardware
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    spread {
      attribute = "${node.datacenter}"
    }
    # network and task blocks unchanged
  }
}
Suggest onto different hardware
/etc/nomad.d/config.hcl
client {
  enabled = true

  meta {
    rack = "his"
  }
}
Based on custom meta-data
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 2

    spread {
      attribute = "${meta.rack}"

      target "his" {
        percent = 50
      }

      target "her" {
        percent = 50
      }
    }
    # network and task blocks unchanged
  }
}
Based on custom meta-data
service {
  name     = "blog"
  provider = "nomad"
  port     = "http"
}
Service Definition
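template {
  data = <<EOH
http {
  # One upstream entry per registered "blog" instance; nginx rejects
  # repeated proxy_pass directives inside a single location block.
  upstream blog {
    {{ range nomadService "blog" }}
    server {{ .Address }}:{{ .Port }};
    {{ end }}
  }
  server {
    listen 80;
    location / {
      proxy_pass http://blog;
    }
  }
}
EOH
  destination = "local/api-servers"
}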
Service Usage
Nomad Pack
• Levant
• Templating and packaging tool
• Easily deploy popular applications to Nomad
• Re-use common patterns across internal applications
• Find and share job specifications with the Nomad community
• Nightlies only right now!
https://github.com/hashicorp/nomad-pack-community-registry
Nomad Pack
• nomad-pack registry list
• nomad-pack run hello_world
• nomad-pack run hello_world --var message=hola
https://github.com/hashicorp/nomad-pack
https://github.com/attachmentgenie/vagrant-scheduler
Try it yourself
Consul
• Open-Source Service Discovery Tool
• Built-in KV store
• Service Mesh tool
https://www.consul.io/
service {
  name = "blog"
  port = "http"

  check {
    type     = "tcp"
    interval = "10s"
    timeout  = "2s"
  }
}
Service Definition
check {
  type     = "tcp"
  interval = "10s"
  timeout  = "2s"

  check_restart {
    limit           = 3
    grace           = "10s"
    ignore_warnings = false
  }
}
Stampeding herd
group "hugo" {
  restart {
    interval = "10m"
    attempts = 2
    delay    = "15s"
    mode     = "fail"
  }

  task "nginx" {
    # task definition unchanged
  }
}
Restart failed jobs
group "hugo" {
  count = 2

  reschedule {
    delay          = "30s"
    delay_function = "constant" # constant, exponential, or fibonacci
    unlimited      = true       # max_delay = "1h" would cap exponential and fibonacci delays
  }

  task "nginx" {
    # task definition unchanged
  }
}
Reschedule a job
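To get a feel for how the three `delay_function` values differ, here is a minimal Python sketch of the delay sequences they produce. It is an illustration rather than Nomad's implementation: the function name `reschedule_delays` and its parameters are made up for this example, and the Fibonacci variant assumes the first two delays both equal the base delay.

```python
from typing import List, Optional


def reschedule_delays(base_s: int, function: str, attempts: int,
                      max_delay_s: Optional[int] = None) -> List[int]:
    """Return the delay (in seconds) before each of `attempts` reschedules.

    Hypothetical helper for illustration only; not part of Nomad.
    """
    delays = []
    current, following = base_s, base_s
    for _ in range(attempts):
        # max_delay_s, when given, caps every delay.
        delays.append(current if max_delay_s is None else min(current, max_delay_s))
        if function == "exponential":
            current *= 2  # each delay doubles the previous one
        elif function == "fibonacci":
            # each delay is the sum of the two previous delays
            current, following = following, current + following
        elif function != "constant":
            raise ValueError(f"unknown delay function: {function}")
    return delays


print(reschedule_delays(30, "constant", 4))
print(reschedule_delays(30, "exponential", 4))
print(reschedule_delays(30, "fibonacci", 5))
```

With a 30s base delay, the exponential sequence reaches minutes after a handful of attempts, which is where a cap such as `max_delay` becomes useful.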
• https://github.com/ThomasObenaus/dummy-services
• https://github.com/Shopify/toxiproxy
Testing your assumptions
group "hugo" {
  count = 10

  update {
    max_parallel     = 2
    min_healthy_time = "30s"
    healthy_deadline = "5m"
  }

  task "nginx" {
    # task definition unchanged
  }
}
Updates
group "hugo" {
  count = 10

  update {
    max_parallel     = 1
    canary           = 10
    min_healthy_time = "30s"
    healthy_deadline = "10m"
    auto_revert      = true
    auto_promote     = false
  }

  task "nginx" {
    # task definition unchanged
  }
}
Blue/Green Release
group "hugo" {
  count = 5

  update {
    max_parallel     = 1
    canary           = 1
    min_healthy_time = "30s"
    healthy_deadline = "10m"
    auto_revert      = true
    auto_promote     = true
  }

  task "nginx" {
    # task definition unchanged
  }
}
Canary Release
service {
  name = "blog"
  tags = ["v2"]
}
$version++
group "hugo" {
  count = 5

  update {
    max_parallel     = 1
    canary           = 1
    min_healthy_time = "30s"
    healthy_deadline = "10m"
    auto_revert      = true
    auto_promote     = false
  }

  task "nginx" {
    # task definition unchanged
  }
}
Canary Release++
kind = "service-router"
name = "blog"
routes = [
  {
    match {
      http {
        header = [
          {
            name  = "group"
            exact = "test"
          },
        ]
      }
    }
    destination {
      service        = "blog"
      service_subset = "v2"
    }
  },
]
Consul to the rescue
● Introduced with Nomad 0.11
● (Currently) independent release cycle
● Gaining new functionality every release
● Built-in functionality for horizontal and vertical scaling
● Extensible with your own (community) plugins
Nomad autoscaler
● Makes decisions based on checks
● Checks are a combination of
• Data queried from an APM
• Defined STRATEGY
• Attempt to approach TARGET value
● Multiple Checks can be combined
• Answer with the most resources will win!
• ScaleOut and ScaleIn => ScaleOut
• ScaleOut and ScaleNone => ScaleOut
• ScaleIn and ScaleNone => ScaleNone
• ScaleOut(10) and ScaleOut(9) => ScaleOut(10)
• ScaleIn(3) and ScaleIn(4) => ScaleIn(4)
Auto-scaling TLDR
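The "most resources wins" rule above can be sketched in a few lines of Python. This is an illustration of the rule as listed on the slide, not the autoscaler's code: the `Action` type and `pick_action` function are hypothetical names, and the number attached to each action is read as the desired count after scaling.

```python
from dataclasses import dataclass
from typing import List

# Higher rank wins: scaling out beats doing nothing, which beats scaling in.
_RANK = {"in": 0, "none": 1, "out": 2}


@dataclass(frozen=True)
class Action:
    direction: str  # "out", "in", or "none"
    count: int      # desired instance count after the action


def pick_action(actions: List[Action]) -> Action:
    """Pick the check result that leaves the most resources running."""
    if not actions:
        raise ValueError("at least one check result is required")
    # Compare by direction first, then by desired count within a direction.
    return max(actions, key=lambda a: (_RANK[a.direction], a.count))


print(pick_action([Action("out", 10), Action("out", 9)]))
print(pick_action([Action("in", 3), Action("in", 4)]))
```

Preferring the larger count errs on the side of over-provisioning, so one quiet metric cannot scale the group in while another check still sees load.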
job "autoscaler" {
  type        = "service"
  datacenters = ["aws"]

  group "autoscaler" {
    count = 1

    task "autoscaler" {
      driver = "docker"

      config {
        image   = "hashicorp/nomad-autoscaler:0.3.6"
        command = "nomad-autoscaler"
        args = [
          "agent",
          "-config",
          "${NOMAD_TASK_DIR}/config.hcl",
          "-http-bind-address",
          "0.0.0.0",
        ]
      }
    }
  }
}
Deploy the autoscaler
/etc/nomad.d/config.hcl
nomad {
  address = "http://{{env "attr.unique.network.ip-address" }}:4646"
}

apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

strategy "target-value" {
  driver = "target-value"
}
Config for the autoscaler
Metrics
https://prometheus.io/
job "blog" {
  datacenters = ["aws"]
  type        = "service"

  group "hugo" {
    count = 3

    scaling {
      enabled = true
      min     = 1
      max     = 20

      policy {
        cooldown = "20s"

        check "avg_instance_sessions" {
          source = "prometheus"
          query  = "scalar(avg(traefik_service_open_connections{service=\"blog@consulcatalog\"}))"

          strategy "target-value" {
            target = 5
          }
        }
      }
    }
    # network and task blocks unchanged
  }
}
Enable autoscaling for the blog
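As a rough mental model of the `target-value` strategy used in this policy, the sketch below scales the current count in proportion to how far the metric is from the target, then clamps the result to `min` and `max`. This is a simplification under stated assumptions: the function name `target_value_count` is hypothetical, and the real plugin has additional edge-case handling (such as a tolerance threshold) that is left out here.

```python
import math


def target_value_count(current: int, metric: float, target: float,
                       minimum: int, maximum: int) -> int:
    """Proportional scaling sketch: aim for metric == target per instance.

    Hypothetical helper for illustration; not the plugin's actual code.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    factor = metric / target
    desired = math.ceil(current * factor)
    # The scaling block's min and max bound whatever the strategy proposes.
    return max(minimum, min(maximum, desired))


# 3 instances averaging 10 open connections each, target of 5 per instance
print(target_value_count(current=3, metric=10, target=5, minimum=1, maximum=20))
# No traffic at all: the proposed count is clamped up to min
print(target_value_count(current=3, metric=0, target=5, minimum=1, maximum=20))
```

The second call mirrors the scale-down shown in the autoscaler logs, where the proposed count is pulled back inside the policy limits.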
Dashboards
https://grafana.com/oss/grafana/
Enable autoscaling
Observe scaling down event
agent: querying APM: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad
agent: calculating new count: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad
agent: next count outside limits: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=
agent: updated count to be within limits: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value target=local-nomad from=3 to=1 min=1 max=10
agent: scaling target: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d source=prometheus strategy=target-value
Observe the autoscaler
hey -z 1m -c 30 http://127.0.0.1:8000
Apply load
Remove load
Logs
https://grafana.com/oss/loki/
group "autoscaler" {
  count = 1

  task "autoscaler" {
    driver = "docker"

    config {
      image   = "hashicorp/nomad-autoscaler:0.3.6"
      command = "nomad-autoscaler"

      logging {
        type = "loki"
        config {
          loki-url = "http://loki.service.consul:3100/api/prom/push"
          tag      = "loki"
        }
      }
    }
  }
}
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
Direct to loki
task "promtail" {
  driver = "docker"

  lifecycle {
    hook    = "prestart"
    sidecar = true
  }

  config {
    image = "grafana/promtail:2.5.0"
    args = [
      "-config.file",
      "local/promtail.yaml",
    ]
  }
}
Promtail sidecar
scrape_configs:
  - job_name: system
    entry_parser: raw
    static_configs:
      - targets:
          - localhost
        labels:
          task: autoscaler
          __path__: /alloc/logs/autoscaler*
    pipeline_stages:
      - match:
          selector: '{task="autoscaler"}'
          stages:
            - regex:
                expression: '.*policy_id=(?P<policy_id>[a-zA-Z0-9_-]+).*source=(?P<source>[a-zA-Z0-9_-]+).*strategy=(?P<strategy>[a-zA-Z0-9_-]+).*target=(?P<target>[a-zA-Z0-9_-]+).*Group:(?P<group>[a-zA-Z0-9]+).*Job:(?P<job>[a-zA-Z0-9_-]+).*Namespace:(?P<namespace>[a-zA-Z0-9_-]+)'
https://grafana.com/docs/loki/latest/clients/promtail/
Promtail sidecar
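Before shipping a pipeline regex like the one above, it can help to try it against a sample line. The Python sketch below reuses the same expression (Python's `re` module accepts the same `(?P<name>...)` named-group syntax). The sample log line is an assumption: its `Group:`, `Job:` and `Namespace:` tail is made up for illustration, because the log lines shown in this deck are truncated before those fields.

```python
import re
from typing import Dict, Optional

# Same expression as in the promtail pipeline stage, split for readability.
AUTOSCALER_LINE = re.compile(
    r".*policy_id=(?P<policy_id>[a-zA-Z0-9_-]+)"
    r".*source=(?P<source>[a-zA-Z0-9_-]+)"
    r".*strategy=(?P<strategy>[a-zA-Z0-9_-]+)"
    r".*target=(?P<target>[a-zA-Z0-9_-]+)"
    r".*Group:(?P<group>[a-zA-Z0-9]+)"
    r".*Job:(?P<job>[a-zA-Z0-9_-]+)"
    r".*Namespace:(?P<namespace>[a-zA-Z0-9_-]+)"
)


def extract_labels(line: str) -> Optional[Dict[str, str]]:
    """Return the named groups as a dict, or None if the line does not match."""
    match = AUTOSCALER_LINE.match(line)
    return match.groupdict() if match else None


# Hypothetical log line; the Group/Job/Namespace values are invented.
sample = (
    "agent: scaling target: policy_id=248f6157-ca37-f868-a0ab-cabbc67fec1d "
    "source=prometheus strategy=target-value target=local-nomad "
    "Group:hugo Job:blog Namespace:default"
)
print(extract_labels(sample))
```

Because every field is mandatory in the expression, a line that lacks any one of them yields no labels at all, which is worth keeping in mind when some autoscaler messages omit fields.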
Annotate your graphs
Correlate events with metrics
apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.service.consul:9090"
  }
}

target "aws-asg" {
  driver = "aws-asg"
  config = {
    aws_region = "{{ $x := env "attr.platform.aws.placement.availability-zone" }}{{ $length := len $x | subtract 1 }}{{ slice $x 0 $length }}"
  }
}
Grow into your platform
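The `aws_region` template expression above derives the region by dropping the last character of the node's availability zone. The same idea in Python, as a small sketch: the function name `region_from_az` is hypothetical, and it assumes standard availability zone names that end in a single letter.

```python
def region_from_az(availability_zone: str) -> str:
    """Strip the trailing zone letter, e.g. 'us-east-1a' -> 'us-east-1'.

    Assumes a standard AZ name ending in one letter; illustration only.
    """
    if not availability_zone:
        raise ValueError("availability zone must not be empty")
    return availability_zone[:-1]


print(region_from_az("us-east-1a"))
print(region_from_az("eu-west-1c"))
```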
scaling "cluster_policy" {
  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    check "cpu_allocated_percentage" {
      source = "prometheus"
      query  = "scalar(sum(nomad_client_allocated_cpu{node_class=\"hashistack\"}*100/(nomad_client_unallocated_cpu{node_class=\"hashistack\"}+nomad_client_allocated_cpu{node_class=\"hashistack\"}))/count(nomad_client_allocated_cpu{node_class=\"hashistack\"}))"

      strategy "target-value" {
        target = 70
      }
    }

    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "${client_asg_name}"
      node_class          = "hashistack"
      node_drain_deadline = "5m"
    }
  }
}
Grow into your platform
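The Prometheus query in this policy computes, per node, the share of CPU that is allocated, and then averages those percentages across all nodes in the class. A minimal Python sketch of the same arithmetic, using made-up numbers (the function name and the sample values are assumptions for illustration):

```python
from typing import List, Tuple


def avg_cpu_allocated_pct(nodes: List[Tuple[float, float]]) -> float:
    """Average per-node allocated CPU percentage.

    Each tuple is (allocated, unallocated) CPU for one node, mirroring
    nomad_client_allocated_cpu and nomad_client_unallocated_cpu.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    percentages = [alloc * 100 / (alloc + unalloc) for alloc, unalloc in nodes]
    return sum(percentages) / len(percentages)


# Two hypothetical nodes: one 75% allocated, one 25% allocated
print(avg_cpu_allocated_pct([(3000, 1000), (1000, 3000)]))
```

With a target of 70, an average above that value pushes the strategy toward adding nodes to the auto scaling group, and an average below it toward removing them.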
agent.worker.check_handler: querying source: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value target=a
agent.worker.check_handler: calculating new count: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus strategy=target-value ta
agent.worker.check_handler: scaling target: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-36
internal_plugin.aws-asg: successfully performed and verified scaling out: action=scale_out asg_name=hashistack-n
agent.worker.check_handler: successfully submitted scaling action to target: check=mem_allocated_percentage policy_id=bf68649a-d087-2e69-362e-bbe71b5544f7 source=prometheus
Observe the autoscaler again
https://github.com/hashicorp/nomad-autoscaler/tree/master/demo/remote
Try it yourself
Moving it all to the cloud – QED
Contact
bram@attachmentgenie.com
@attachmentgenie
https://www.slideshare.net/attachmentgenie
https://hashiconf.com/europe/ <= Running Trusted Payloads With Nomad and Waypoint
Questions?
The Floor is yours…
