Kafka Excellence at Scale – Cloud, Kubernetes, Infrastructure as Code (Vik Walia, Slower)

Cloud is changing the world; Kubernetes is changing the world; real-time event streaming is changing the world. In this talk we explore some of the best practices for combining the power of these paradigm shifts to achieve a much greater return on your Kafka investments. From declarative deployments, zero-downtime upgrades, and elastic scaling to self-healing and automated governance, learn how you can bring the next level of speed, agility, resilience, and security to your Kafka implementations.

  1. AUGUST 2020
     WE ARE HUMAN
     KAFKA EXCELLENCE AT SCALE: CLOUD, KUBERNETES, INFRASTRUCTURE-AS-CODE
  2. GOAL AND CHALLENGES
     Operations
       • Installation
       • Upgrades
       • Patches
       • Rollbacks
       • Elastic Scaling (up/down)
       • Fault Tolerance
       • Disaster Recovery
       • Security (inflight, at rest)
       • Logging, Monitoring, Alerting
       • Secrets Management
       • …
     Application Development
       • Application Onboarding
       • Creating Topics
       • Increasing Partitions
       • Deleting Topics
       • Security
       • Monitoring
       • Best Practices – Producers, Consumers, KSQL, KStreams
       • …
       • Pager Duty
     Self Healing
     Challenges
     Goal
  3. ARCHITECTURE EVOLUTION
     Event Log, Messaging
     1990s: Monolith
     2000s: Service Oriented Architecture
     2010s: Microservices, Events, Containers, Serverless
     The speed of doing business is increasing…
       • Application delivery acceleration – CI/CD pipelines, but exponential increase in quantity (but not complexity) of operations work.
       • Kubernetes – public release in July 2015; Site Reliability Engineering (SRE) best practices published; massive automation in systems administration tasks (self-healing).
       • An Operator is an automated Site Reliability Engineer.
  4. KUBERNETES OPERATOR PATTERN
       1. Operators are custom controllers watching custom resources.
       2. They allow Infrastructure Engineers and Developers to provide application-specific features to manage their site and software.
       3. The logic needed to maintain, scale, and heal a specific piece of software is encoded into an operator application that runs as a container in the cluster.
       4. The code in the operator is responsible for more targeted and advanced health detection and healing than can be achieved via Kubernetes' generic self-healing.
       5. Vendors are writing custom operators to make cloud-native management of their software easy: https://operatorhub.io/
       6. Confluent has created an Operator for Kafka.
       7. Other Kafka Operators
            1. https://operatorhub.io/operator/banzaicloud-kafka-operator
            2. https://operatorhub.io/operator/strimzi-kafka-operator
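
The operator pattern on slide 4 amounts to a control loop: compare the desired state declared in a custom resource with the observed state, then act on the difference. The following is a minimal Python sketch of that idea, not a real operator. It talks to no Kubernetes API, and `KafkaClusterSpec`, `ClusterState`, and `reconcile` are hypothetical names invented for illustration; they are not part of Confluent Operator or any other operator.

```python
from dataclasses import dataclass, field


@dataclass
class KafkaClusterSpec:
    """Desired state, as a custom resource might declare it (hypothetical fields)."""
    name: str
    brokers: int


@dataclass
class ClusterState:
    """Observed state: broker ordinal -> whether that broker is healthy."""
    healthy: dict = field(default_factory=dict)


def reconcile(spec: KafkaClusterSpec, state: ClusterState) -> list:
    """Drive the observed state toward the spec and return the actions taken."""
    actions = []
    target = max(spec.brokers, 0)  # treat a negative count as zero
    # Self-healing: restart every broker that exists but is unhealthy.
    for broker_id in sorted(state.healthy):
        if not state.healthy[broker_id]:
            state.healthy[broker_id] = True
            actions.append(f"restart broker-{broker_id}")
    # Scale up: add brokers until the declared count is reached.
    while len(state.healthy) < target:
        new_id = max(state.healthy, default=-1) + 1
        state.healthy[new_id] = True
        actions.append(f"create broker-{new_id}")
    # Scale down: remove the highest ordinals first.
    while len(state.healthy) > target:
        victim = max(state.healthy)
        del state.healthy[victim]
        actions.append(f"delete broker-{victim}")
    return actions


if __name__ == "__main__":
    spec = KafkaClusterSpec(name="demo", brokers=3)
    state = ClusterState(healthy={0: True, 1: False})
    print(reconcile(spec, state))  # first pass heals and scales up
    print(reconcile(spec, state))  # second pass finds nothing to do
```

Calling `reconcile` repeatedly is safe: once observed and desired state agree, it returns an empty action list, which is the property that lets a real controller run the same loop continuously.
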
  5. CONFLUENT OPERATOR
     › CLOUD NATIVE DEPLOYMENT ON KUBERNETES
         › DECLARATIVE VS. IMPERATIVE SEMANTICS
         › IMMUTABLE (CONFLUENT CERTIFIED DOCKER IMAGES)
         › SELF-HEALING (CONTINUOUS)
         › INFRASTRUCTURE AS CODE BEST PRACTICES (HELM, YAML)
     › AUTOMATED DEPLOYMENT
         › CERTIFIED IMAGES PULLED FROM CONTAINER REGISTRIES
         › IMAGE SCANNING FOR VULNERABILITY
         › CI/CD BEST PRACTICES (HELM CHARTS, JENKINS)
     › AUTOMATED ROLLING UPGRADES (DOWNGRADES)
         › STOP BROKER
         › UPGRADE BINARIES
         › PARTITION LEADER REASSIGNMENT
         › START BROKER
         › VERIFY ZERO UNDER-REPLICATED PARTITIONS
     › ELASTIC SCALING
         › KUBERNETES METRICS SERVER
         › SPIN UP NEW BROKERS
         › SPIN UP NEW CONNECT WORKERS
     › SECURITY
         › AUTOMATED CONFIGURATION OF TRUSTSTORES & KEYSTORES
         › SECRETS MANAGEMENT
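
The rolling-upgrade steps listed on slide 5 (stop broker, upgrade binaries, partition leader reassignment, start broker, verify zero under-replicated partitions) can be walked through one broker at a time. The sketch below simulates that sequence entirely in memory, in the order the slide gives. `Broker`, `SimulatedCluster`, and `rolling_upgrade` are hypothetical names, the version labels are placeholders, and nothing here calls Kafka or Confluent Operator; in particular, the simulation assumes a restarted broker is back in sync immediately, which a real cluster does not guarantee.

```python
from dataclasses import dataclass


@dataclass
class Broker:
    broker_id: int
    version: str
    running: bool = True


class SimulatedCluster:
    """In-memory stand-in for a Kafka cluster (not a real Kafka or Operator API)."""

    def __init__(self, brokers, partitions):
        self.brokers = {b.broker_id: b for b in brokers}
        # partition name -> {"replicas": [broker ids], "leader": broker id}
        self.partitions = partitions

    def under_replicated(self):
        """Partitions with at least one replica on a stopped broker."""
        return [
            name for name, p in self.partitions.items()
            if any(not self.brokers[r].running for r in p["replicas"])
        ]

    def move_leaders_off(self, broker_id):
        """Hand leadership of each affected partition to another running replica."""
        for name, p in self.partitions.items():
            if p["leader"] != broker_id:
                continue
            candidates = [
                r for r in p["replicas"]
                if r != broker_id and self.brokers[r].running
            ]
            if not candidates:
                raise RuntimeError(f"no running replica can lead {name}")
            p["leader"] = candidates[0]


def rolling_upgrade(cluster, target_version):
    """Upgrade one broker at a time; abort if the cluster is not fully replicated."""
    log = []
    for broker_id in sorted(cluster.brokers):
        broker = cluster.brokers[broker_id]
        if broker.version == target_version:
            continue  # already on the target version
        broker.running = False
        log.append(f"stop broker-{broker_id}")
        broker.version = target_version
        log.append(f"upgrade broker-{broker_id} to {target_version}")
        cluster.move_leaders_off(broker_id)
        log.append(f"reassign leaders away from broker-{broker_id}")
        broker.running = True
        log.append(f"start broker-{broker_id}")
        if cluster.under_replicated():
            raise RuntimeError(f"under-replicated partitions after broker-{broker_id}")
        log.append(f"verified zero under-replicated partitions after broker-{broker_id}")
    return log


if __name__ == "__main__":
    cluster = SimulatedCluster(
        [Broker(0, "v1"), Broker(1, "v1"), Broker(2, "v1")],
        {"orders-0": {"replicas": [0, 1, 2], "leader": 0},
         "orders-1": {"replicas": [1, 2, 0], "leader": 1}},
    )
    for line in rolling_upgrade(cluster, "v2"):
        print(line)
```

The verification step gates the loop: the next broker is only touched after the previous one is back and no partition is under-replicated, so at most one replica of any partition is offline at a time.
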
  6. GOAL AND CHALLENGES
     Operations
       ✓ Installation
       ✓ Upgrades
       ✓ Patches
       ✓ Rollbacks
       ✓ Elastic Scaling (up/down)
       ✓ Fault Tolerance
       • Disaster Recovery
       ✓ Security (inflight, at rest)
       • Logging, Monitoring, Alerting
       ✓ Secrets Management
       • …
     Application Development
       • Application Onboarding
       • Creating Topics
       • Increasing Partitions
       • Deleting Topics
       • Security
       • Monitoring
       • Best Practices – Producers, Consumers, KSQL, KStreams
       • …
       • Pager Duty
     Self Healing
     Challenges
     Goal
  7. CONFIDENTIAL
     SELF-SERVICE, AUTOMATION
       › APPLICATION ONBOARDING
       › TOPIC MANAGEMENT
       › PARTITIONS
       › WORKFLOWS
       › HOUSEKEEPING
       › HEALTHCHECK
       › LIVENESS, READINESS
       › NO OFFLINE PARTITIONS
       › ABILITY TO PRODUCE AND CONSUME
       › AUTOMATED DR
       › ACTIVE-PASSIVE, ACTIVE-ACTIVE OR STRETCH
       › OFFSET SYNCHRONIZATION
       › PROXY SERVICE
       › CI/CD PIPELINES
       › CERTIFIED IMAGES IN CONTAINER REGISTRY
       › HELM CHARTS
       › ZERO DOWNTIME OPERATIONS
       › UPGRADES/PATCHES
       › DOWNGRADES
       › RESTARTS
       › ELASTIC SCALING
     REST API
     Web Page
     Jenkins, Ansible
     Jira Tickets, Manual
     http://kafka/API
     GOVERNANCE
       › BEST PRACTICES
       › TOPICS (REPLICATION FACTOR = 3)
       › PARTITION SIZING
       › PRODUCERS
       › ACKS (1, ALL)
       › ERROR HANDLING (RETRIABLE/NON-RETRIABLE)
       › CONSUMERS (OFFSET MANAGEMENT)
       › BROKERS
       › KSQL, KSTREAMS
       › NAMING CONVENTIONS
       › METADATA MANAGEMENT
       › OWNERSHIP, ATTRIBUTION
       › ENTITLEMENT MANAGEMENT
       › RBAC
       › CAPACITY RESERVATION
       › QUOTA MANAGEMENT
       › LOGGING, MONITORING, ALERTING
       › 2 AM PRODUCTION ISSUE RESOLUTION
       › LONG TERM DATA PIPELINE OPTIMIZATION
       › SLACK, EMAIL, PAGERDUTY INTEGRATION
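
Slide 7 pairs self-service topic management with governance rules such as a replication factor of 3, partition sizing, naming conventions, and ownership. One way to enforce such rules automatically is to validate every topic request before it reaches the cluster. The sketch below assumes a hypothetical request shape (`TopicRequest`), a hypothetical `team.domain.event` naming pattern, and a hypothetical partition ceiling; only the replication factor of 3 comes from the slide.

```python
import re
from dataclasses import dataclass

REQUIRED_REPLICATION_FACTOR = 3  # stated on the slide
MAX_PARTITIONS = 48              # hypothetical ceiling, chosen for illustration
# Hypothetical naming convention: lowercase team.domain.event
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(\.[a-z][a-z0-9-]*){2}$")


@dataclass
class TopicRequest:
    """A self-service request to create a topic (hypothetical shape)."""
    name: str
    partitions: int
    replication_factor: int
    owner: str


def validate(request: TopicRequest) -> list:
    """Return a list of governance violations; an empty list means approved."""
    errors = []
    if not NAME_PATTERN.match(request.name):
        errors.append(f"name '{request.name}' does not match team.domain.event")
    if not 1 <= request.partitions <= MAX_PARTITIONS:
        errors.append(f"partitions must be between 1 and {MAX_PARTITIONS}")
    if request.replication_factor != REQUIRED_REPLICATION_FACTOR:
        errors.append(f"replication factor must be {REQUIRED_REPLICATION_FACTOR}")
    if not request.owner.strip():
        errors.append("owner is required for attribution")
    return errors


if __name__ == "__main__":
    good = TopicRequest("payments.orders.created", 12, 3, "team-payments")
    bad = TopicRequest("Orders", 0, 1, "")
    print(validate(good))
    for problem in validate(bad):
        print(problem)
```

Returning every violation at once, rather than failing on the first, suits a self-service workflow: the requester can fix the whole request in one round trip.
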
  8. GOAL AND CHALLENGES
     Operations
       ✓ Installation
       ✓ Upgrades
       ✓ Patches
       ✓ Rollbacks
       ✓ Elastic Scaling (up/down)
       ✓ Fault Tolerance
       ✓ Disaster Recovery
       ✓ Security (inflight, at rest)
       ✓ Logging, Monitoring, Alerting
       ✓ Secrets Management
       ✓ …
     Application Development
       ✓ Application Onboarding
       ✓ Creating Topics
       ✓ Increasing Partitions
       ✓ Deleting Topics
       ✓ Security
       ✓ Monitoring
       ✓ Best Practices – Producers, Consumers, KSQL, KStreams
       ✓ …
       ✓ Pager Duty
     Self Healing
     Challenges
     Goal
  9. USAGE, ROI BY TENANT
     Snapshot
     Usage Cost = function(Compute, Storage, Network, Human Effort)
     Trend over time
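
Slide 9 gives cost only as "function(Compute, Storage, Network, Human Effort)" without saying what the function is. As an illustration, the sketch below assumes the simplest possible form, a weighted sum with one unit rate per input, and reports it per tenant. The `TenantUsage` fields, the units, and every rate in `RATES` are made-up placeholders, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class TenantUsage:
    """Metered usage for one tenant over a billing period (hypothetical units)."""
    tenant: str
    compute_hours: float
    storage_gb: float
    network_gb: float
    human_hours: float


# Placeholder unit rates in an arbitrary currency; replace with real figures.
RATES = {
    "compute_hours": 0.10,
    "storage_gb": 0.02,
    "network_gb": 0.05,
    "human_hours": 80.0,
}


def usage_cost(usage: TenantUsage, rates: dict = RATES) -> float:
    """Assumed linear model: cost is the sum of each input times its unit rate."""
    return (
        usage.compute_hours * rates["compute_hours"]
        + usage.storage_gb * rates["storage_gb"]
        + usage.network_gb * rates["network_gb"]
        + usage.human_hours * rates["human_hours"]
    )


def cost_by_tenant(usages: list) -> dict:
    """Snapshot of cost per tenant, rounded to two decimals."""
    return {u.tenant: round(usage_cost(u), 2) for u in usages}


if __name__ == "__main__":
    snapshot = cost_by_tenant([
        TenantUsage("payments", 100, 1000, 200, 1),
        TenantUsage("search", 50, 200, 400, 0),
    ])
    for tenant, cost in snapshot.items():
        print(tenant, cost)
```

Running `cost_by_tenant` once per period and storing the results would give the "trend over time" view the slide mentions.
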
  10. AUGUST 2020
      WE ARE HUMAN
      KAFKA EXCELLENCE AT SCALE: CLOUD, KUBERNETES, INFRASTRUCTURE-AS-CODE
