Cassandra Operator with Yelp PaaSTA
Raghavendra Prabhu & Matthew Mead-Briggs
JetStack Connect Nov 2019
Yelp’s Mission
Connecting people with great local businesses.
Powered by Cassandra
● Your Reviews
● Waitlist for diners
Overview
● Why Kubernetes
● Status quo with Cassandra
● Cassandra Operator
● Pain Points
● Future
What is PaaSTA
github.com/yelp/paasta github.com/yelp/clusterman
Why K8s
● Why Kubernetes
○ Why not persist with plain EC2
● Why DIY the operator
Status Quo of Cassandra @ Yelp
● What is Cassandra
● Roughly a hundred clusters on AWS ASG
● New cluster launches with k8s
○ Batteries included: Good defaults, TLS
● Migration strategy in place
○ Backward compatible discovery mechanism
● K8s clusters deployed on spot fleet as well.
● Local k8s development cluster with https://github.com/kubernetes-sigs/kind
Cassandra Operator
Yelp Cassandra @ 100000 ft
[Diagram: multi-region cluster across us-west-2, us-west-1, us-east-1]
Yelp Cassandra @ 100000 ft + Operator
[Diagram: multi-region cluster across us-west-2, us-west-1, us-east-1, with the operator added]
State of Cassandra Operator at Yelp
● Cassandra cluster specification
● What is in a Cassandra Pod
● Storage aka State
● Reconciliation
○ StatefulSet
○ Core event loop
● Deployment
The Recipe aka the Cluster Spec
Smartstack Seed Provider
[Diagram labels: Client Service, Synapse, HAProxy, Service, Nerve, ZK]
Cassandra Pod
● Cassandra container
● Sidecars
○ Hacheck for Nerve (Smartstack)
○ Cron Jobs
○ Sensu alerting
● Node: metrics collection
○ Puppet
To sidecar or not to sidecar
● Emit data to host/external service
● Collect data from process in host's namespace
● Sidecar collects data
Storage aka State
● Dynamic Provisioning
○ StorageClass per cluster
○ “Compute follows Data”
■ Immediate Volume Binding Mode
■ Stripe cluster across AZs
● EBS for Cassandra
○ Clear separation of stateful and stateless
○ Makes it easy to delete statefulsets
○ Bouncing the cluster is also quite fast
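A minimal sketch of what "StorageClass per cluster" with Immediate volume binding could look like, written as Python dictionaries that mirror Kubernetes manifests. The naming scheme (`cassandra-<cluster>`), the `gp2` volume type, and both helper functions are assumptions for illustration; the slides do not show the actual manifests.

```python
import json


def storage_class_for(cluster_name: str, volume_type: str = "gp2") -> dict:
    """Build a per-cluster StorageClass manifest (hypothetical naming)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": f"cassandra-{cluster_name}"},
        # In-tree EBS provisioner; an assumption, not confirmed by the slides.
        "provisioner": "kubernetes.io/aws-ebs",
        "parameters": {"type": volume_type},
        # Immediate binding provisions the volume as soon as the claim exists,
        # so the pod is later scheduled where its data already lives.
        "volumeBindingMode": "Immediate",
    }


def volume_claim_template(cluster_name: str, size_gib: int) -> dict:
    """Build a StatefulSet volumeClaimTemplate that uses the cluster's class."""
    return {
        "metadata": {"name": "data"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": f"cassandra-{cluster_name}",
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(storage_class_for("example"), indent=2))
```

Because the data sits on EBS volumes owned by PersistentVolumeClaims, deleting and recreating the StatefulSet leaves the volumes in place.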
Storage aka State
Hash-based reconciliation
● Compute hash of Pod Template
● Attach as label to the StatefulSet
● Compare label on existing StatefulSet to newly computed
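The three steps above can be sketched as follows. This is an illustration only: the label key `example.com/pod-template-hash`, the SHA-256 choice, and the 16-character truncation are assumptions, not details taken from the Yelp operator.

```python
import copy
import hashlib
import json

# Hypothetical label key; the real operator's key is not shown in the slides.
HASH_LABEL = "example.com/pod-template-hash"


def template_hash(pod_template: dict) -> str:
    """Return a short, stable hash of a pod template.

    Keys are sorted before hashing so that dictionary ordering does not
    change the result.
    """
    canonical = json.dumps(pod_template, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def with_hash_label(statefulset: dict, pod_template: dict) -> dict:
    """Return a copy of the StatefulSet with the hash attached as a label."""
    updated = copy.deepcopy(statefulset)
    labels = updated.setdefault("metadata", {}).setdefault("labels", {})
    labels[HASH_LABEL] = template_hash(pod_template)
    return updated


def needs_update(existing_statefulset: dict, desired_pod_template: dict) -> bool:
    """True when the stored label differs from the newly computed hash."""
    labels = existing_statefulset.get("metadata", {}).get("labels", {})
    return labels.get(HASH_LABEL) != template_hash(desired_pod_template)
```

Comparing one label avoids a field-by-field diff against values that the API server fills in with defaults.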
Cluster Readiness
● Cluster ready = AND(pod readiness) over all pods
○ Service Readiness
● Readiness per pod: UN (Up/Normal) in Cassandra
● Liveness check: U (Up) for Cassandra
● Hooks
○ Draining
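One way to express these checks is to read the two-letter state codes printed by `nodetool status`. The sketch below assumes the probes parse that output; the slides do not say how the probes are actually implemented, and all function names are hypothetical.

```python
import re
from typing import Dict, Iterable

# A node row starts with a status letter (U/D) and a state letter (N/L/J/M),
# followed by the node address.
_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def node_states(nodetool_status_output: str) -> Dict[str, str]:
    """Map each node address to its two-letter code, e.g. 'UN'."""
    states = {}
    for line in nodetool_status_output.splitlines():
        match = _ROW.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def pod_ready(state: str) -> bool:
    """Readiness: the node must be Up and Normal."""
    return state == "UN"


def pod_live(state: str) -> bool:
    """Liveness: the node only has to be Up (it may still be joining)."""
    return state.startswith("U")


def cluster_ready(states: Iterable[str]) -> bool:
    """Cluster readiness: logical AND of pod readiness over all pods."""
    states = list(states)
    return bool(states) and all(pod_ready(s) for s in states)
```

Keeping liveness looser than readiness means a node that is still joining is not restarted while it streams data.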
Locking
● Clusters are multi-region
● Operators are per-region
● Non-federated setup
● Coordination with etcd leases
● LeaseID stored in Custom Resource Status
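The sketch below is a toy, in-memory model of lease-based locking between per-region operators. It is not an etcd client: `InMemoryLeaseStore`, `try_lock`, the `locks/<cluster>` key layout, and the `leaseID` status field name are all assumptions made to illustrate the idea that a lock key disappears when its lease expires.

```python
import itertools
from typing import Callable, Dict, Optional


class InMemoryLeaseStore:
    """Toy stand-in for etcd: a key vanishes when its lease's TTL lapses."""

    def __init__(self, clock: Callable[[], float]):
        self._clock = clock
        self._ids = itertools.count(1)
        self._ttl: Dict[int, float] = {}       # lease id -> TTL in seconds
        self._deadline: Dict[int, float] = {}  # lease id -> expiry time
        self._keys: Dict[str, int] = {}        # lock key -> lease id

    def _expire(self) -> None:
        now = self._clock()
        for lease_id in [l for l, t in self._deadline.items() if t <= now]:
            self.revoke(lease_id)

    def grant(self, ttl: float) -> int:
        lease_id = next(self._ids)
        self._ttl[lease_id] = ttl
        self._deadline[lease_id] = self._clock() + ttl
        return lease_id

    def keep_alive(self, lease_id: int) -> bool:
        self._expire()
        if lease_id not in self._deadline:
            return False
        self._deadline[lease_id] = self._clock() + self._ttl[lease_id]
        return True

    def revoke(self, lease_id: int) -> None:
        self._ttl.pop(lease_id, None)
        self._deadline.pop(lease_id, None)
        self._keys = {k: l for k, l in self._keys.items() if l != lease_id}

    def put_if_absent(self, key: str, lease_id: int) -> bool:
        self._expire()
        if lease_id not in self._deadline or key in self._keys:
            return False
        self._keys[key] = lease_id
        return True

    def holder(self, key: str) -> Optional[int]:
        self._expire()
        return self._keys.get(key)


def try_lock(store: InMemoryLeaseStore, cluster: str, cr_status: dict, ttl: float) -> bool:
    """Try to take the cluster lock; record the lease ID in the CR status."""
    lease_id = store.grant(ttl)
    if store.put_if_absent(f"locks/{cluster}", lease_id):
        cr_status["leaseID"] = lease_id
        return True
    store.revoke(lease_id)
    return False
```

An operator that holds the lock keeps calling `keep_alive`; if it crashes, the lease lapses and another region's operator can take over without manual cleanup.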
IAM roles
● For Cassandra we need access to S3 and DynamoDB for backups
● https://github.com/uswitch/kiam
● Proxies the EC2 metadata service for Pods
● Allows us to lift and shift IAM profiles from EC2
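With kiam, a pod requests a role through the `iam.amazonaws.com/role` annotation. A minimal sketch of attaching an existing EC2 role name to a pod template follows; the helper name and the role name `cassandra-backup` are hypothetical.

```python
import copy

# Annotation that kiam reads to decide which role a pod may assume.
KIAM_ROLE_ANNOTATION = "iam.amazonaws.com/role"


def with_iam_role(pod_template: dict, role_name: str) -> dict:
    """Return a copy of the pod template annotated with an IAM role name."""
    updated = copy.deepcopy(pod_template)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[KIAM_ROLE_ANNOTATION] = role_name
    return updated


if __name__ == "__main__":
    template = {"metadata": {"labels": {"app": "cassandra"}}, "spec": {}}
    print(with_iam_role(template, "cassandra-backup"))
```

The application code keeps using the standard AWS credential chain, which is why profiles can move from EC2 largely unchanged.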
PaaSTA Secrets Support
● PaaSTA on Mesos already supports secrets
● User friendly cli to “create” secrets
● Use Vault’s transit endpoint to encrypt
● Sync these secrets into kubernetes Secrets
● Cassandra is using these for TLS secrets
$ echo "SOMETHINGSECRET" | paasta secret add -s cassandra_k8s -n secret-name-here -c norcal-devc
Deployment / PaaSTA integration
Migration
● Launching new clusters in k8s is easy
● Migration of existing clusters without downtime is hard
● Unified discovery with smartstack
● How: Add k8s nodes to existing Cassandra cluster
● We have migrated a few already!
Pain Points
Pain points
● Client-side Validation
● Statefulset inflexibility with changes
○ Manual intervention for stuck statefulset deployments
○ Resizing the Persistent Volume
○ Orphaned EBS volumes
● Unready/Dead Nodes, Spot fleet and garbage collection
Heading towards the future
● Load-based autoscaling for Cassandra pods
● EBS snapshotting automation
● Fleet autoscaling with Clusterman
● Better integration tests for the operator and for the clusters!
● Production clusters on AWS spot fleet
● More workloads on kubernetes (just started our Kafka operator)
Conclusion
We're Hiring!
www.yelp.com/careers/
Questions?
Credits
● Apache Cassandra logo
● https://kubernetes.io/
● https://etcd.io/
● https://aws.amazon.com/architecture/icons/
● https://www.yelp.com/brand
● https://thenounproject.com/
● https://commons.wikimedia.org/wiki/File:Back-to-the-future-logo.svg
● https://www.writeups.org/star-trek-brent-spiner-data/
@YelpEngineering
fb.com/YelpEngineers
engineeringblog.yelp.com
github.com/yelp

Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 

Kürzlich hochgeladen (20)

VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

Cassandra Operator with Yelp PaaSTA

  • 1. Cassandra Operator with Yelp PaaSTA Raghavendra Prabhu & Matthew Mead-Briggs JetStack Connect Nov 2019
  • 2. Yelp’s Mission Connecting people with great local businesses.
  • 3. Powered by Cassandra ● Your Reviews ● Waitlist for diners
  • 4. Overview ● Why Kubernetes ● Status quo with Cassandra ● Cassandra Operator ● Pain Points ● Future
  • 5. What is PaaSTA github.com/yelp/paasta github.com/yelp/clusterman
  • 6. Why K8s ● Why Kubernetes ○ Why not persist with plain EC2 ● Why DIY the operator
  • 7. Status Quo of Cassandra @ Yelp ● What is Cassandra ● Roughly a hundred clusters on AWS ASG ● New cluster launches with k8s ○ Batteries included: Good defaults, TLS ● Migration strategy in place ○ Backward compatible discovery mechanism ● K8s clusters deployed on spot fleet as well. ● Local k8s development cluster with https://github.com/kubernetes-sigs/kind
  • 9. us-west-2 us-west-1 us-east-1 Yelp Cassandra @ 100000 ft Multi-region Cluster
  • 10. us-west-2 us-west-1 us-east-1 Yelp Cassandra @ 100000 ft + Operator Multi-region Cluster
  • 11. State of Cassandra Operator at Yelp ● Cassandra cluster specification ● What is in a Cassandra Pod ● Storage aka State ● Reconciliation ○ StatefulSet ○ Core event loop ● Deployment
  • 12. The Recipe aka the Cluster Spec
  • 15. Cassandra Pod ● Cassandra container ● Sidecars ○ Hacheck for Nerve (Smartstack) ○ Cron Jobs ○ Sensu alerting ● Node: metrics collection ○ Puppet
  • 16. To sidecar or not to sidecar ● Emit data to host/external service ● Collect data from process in hosts namespace ● Sidecar collects data
  • 17. Storage aka State ● Dynamic Provisioning ○ StorageClass per cluster ○ “Compute follows Data” ■ Immediate Volume Binding Mode ■ Stripe cluster across AZs ● EBS for Cassandra ○ Clear separation of stateful and stateless ○ Makes it easy to delete StatefulSets ○ Bouncing the cluster is also quite fast
  • 23. Hash-based reconciliation ● Compute hash of Pod Template ● Attach as label to the StatefulSet ● Compare label on existing StatefulSet to newly computed
  • 24. Cluster Readiness ● Cluster ready = AND(pod readiness) over all pods ○ Service Readiness ● Readiness per pod: UN (Up/Normal) in Cassandra ● Liveness check: U (Up) for Cassandra ● Hooks ○ Draining
  • 25. Locking ● Clusters are multi-region ● Operators are per-region ● Non-federated setup ● Coordination with etcd leases ● LeaseID stored in Custom Resource Status
  • 26. IAM roles ● For Cassandra we need access to S3 and DynamoDB for backups ● https://github.com/uswitch/kiam ● Proxies the EC2 metadata service for Pods ● Allows us to lift and shift IAM profiles from EC2
  • 27. PaaSTA Secrets Support ● PaaSTA on Mesos already supports secrets ● User-friendly CLI to “create” secrets ● Use Vault’s transit endpoint to encrypt ● Sync these secrets into Kubernetes Secrets ● Cassandra is using these for TLS secrets
      $ echo "SOMETHINGSECRET" | paasta secret add -s cassandra_k8s -n secret-name-here -c norcal-devc
  • 28. Deployment / PaaSTA integration
  • 31. Migration ● Launching new clusters in k8s is easy ● Migration of existing clusters without downtime is hard ● Unified discovery with Smartstack ● How: Add k8s nodes to existing Cassandra cluster ● We have migrated a few already!
  • 33. Pain points ● Client-side Validation ● StatefulSet inflexibility with changes ○ Manual intervention for stuck StatefulSet deployments ○ Resizing the Persistent Volume ○ Orphaned EBS volumes ● Unready/Dead Nodes, Spot fleet and garbage collection
  • 35. Heading towards > ● Load-based autoscaling for Cassandra pods ● EBS snapshotting automation ● Fleet autoscaling with Clusterman ● Better integration tests for the operator and for the clusters! ● Production clusters on AWS spot fleet ● More workloads on Kubernetes (just started our Kafka operator)
  • 40. Credits ● Apache Cassandra logo ● https://kubernetes.io/ ● https://etcd.io/ ● https://aws.amazon.com/architecture/icons/ ● https://www.yelp.com/brand ● https://thenounproject.com/ ● https://commons.wikimedia.org/wiki/File:Back-to-the-future-logo.svg ● https://www.writeups.org/star-trek-brent-spiner-data/
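
The hash-based reconciliation on slide 23 and the cluster readiness rule on slide 24 can be illustrated with a minimal Python sketch. This is not the operator's actual code: the label key `HASH_LABEL`, the function names, the choice of SHA-256 over canonical JSON, and the truncation to 16 characters are all assumptions made for illustration. Treating an empty pod list as "not ready" is also an assumption.

```python
import hashlib
import json

# Hypothetical label key; the slides do not name the label the operator uses.
HASH_LABEL = "example.com/pod-template-hash"


def template_hash(pod_template: dict) -> str:
    """Compute a stable hash of a Pod template.

    Keys are sorted so that two templates with the same content but a
    different key order produce the same hash.
    """
    canonical = json.dumps(pod_template, sort_keys=True, separators=(",", ":"))
    # Truncated so the value fits comfortably in a Kubernetes label (max 63 chars).
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def needs_update(existing_labels: dict, desired_template: dict) -> bool:
    """Compare the label on the existing StatefulSet to the newly computed hash."""
    return existing_labels.get(HASH_LABEL) != template_hash(desired_template)


def cluster_ready(pod_ready: list) -> bool:
    """Cluster ready = AND(pod readiness) over all pods.

    An empty list counts as not ready (an assumption of this sketch).
    """
    return bool(pod_ready) and all(pod_ready)
```

In this sketch a reconcile loop would call `needs_update` with the labels read from the live StatefulSet and only rewrite the StatefulSet when it returns `True`, attaching the new hash under `HASH_LABEL` at the same time.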