Technical breakout during Confluent’s streaming event in Munich, presented by Sam Julian, Chief Cloud Engineer at E.On SE. This three-day hands-on course covered how to build, manage, and monitor clusters using industry best practices developed by the world’s foremost Apache Kafka™ experts. The sessions explained how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
5. Plug & play data connection
• ready-to-use data connectors and libraries help you get started rapidly (see the sketch below)
• versatile templates for custom data sources can fit any data
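A minimal sketch of how little code the ready-to-use client libraries require to start publishing data, using the confluent-kafka Python client; the broker address, topic name and payload are illustrative placeholders, not values from the platform.

  from confluent_kafka import Producer

  # Placeholder broker address; point this at your own cluster.
  producer = Producer({"bootstrap.servers": "localhost:9092"})

  def on_delivery(err, msg):
      # Report success or failure for each produced record.
      if err is not None:
          print(f"Delivery failed: {err}")
      else:
          print(f"Delivered to {msg.topic()} [{msg.partition()}]")

  # Hypothetical topic and payload, purely for illustration.
  producer.produce("sensor-readings", key=b"device-42",
                   value=b'{"temp": 21.5}', on_delivery=on_delivery)
  producer.flush()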
6. Adaptive capacity
• storage sizes from 50 GB up to 50 TB; extend or downsize to fit your needs or to save costs
• worker node counts from 1 up to 100 with a choice of node sizes, ready to serve any data processing need
7. Self-healing & auto-patching
• worker nodes automatically apply security patches
• probing and self-healing capabilities for worker nodes and deployed workers actively protect against downtime (see the sketch below)
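Kubernetes-style probing usually boils down to the platform periodically calling a cheap health endpoint on each worker and restarting it when the check fails; a minimal sketch of such an endpoint in Python follows, where the port 8080 and the /healthz path are assumptions made for the example.

  from http.server import BaseHTTPRequestHandler, HTTPServer

  class HealthHandler(BaseHTTPRequestHandler):
      def do_GET(self):
          # A liveness/readiness probe only needs a cheap "am I alive?" answer.
          if self.path == "/healthz":
              self.send_response(200)
              self.end_headers()
              self.wfile.write(b"ok")
          else:
              self.send_response(404)
              self.end_headers()

  if __name__ == "__main__":
      # Port and path are illustrative defaults, not fixed by the platform.
      HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()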
9. Emergency restoration pipelines
• rebuild or update configuration in case of an emergency recovery scenario
• recover the whole cluster configuration, or precisely update only the affected parts for the least intrusive operation (see the sketch below)
• fast and reliable guided transport through the pipelines
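The idea of updating only the affected parts can be illustrated with a small configuration diff; the sketch below is generic, and the configuration keys and values are made up for the example.

  def changed_settings(current: dict, desired: dict) -> dict:
      # Return only the settings that differ, so a restore can touch the
      # affected parts instead of replaying the whole configuration.
      return {k: v for k, v in desired.items() if current.get(k) != v}

  current = {"retention.ms": "604800000", "min.insync.replicas": "1"}
  desired = {"retention.ms": "604800000", "min.insync.replicas": "2"}

  print(changed_settings(current, desired))  # {'min.insync.replicas': '2'}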
10. Threat protection
• cloud-native threat advisories help to detect and prevent intrusions
• a secure private network setup behind a firewall, with configuration and rules as code, walls off most network breaches
• code and container vulnerability scans with self-updating CVE* databases contribute to strong governance of threats from inside
• encrypted storage provides defence against data leaks
* https://cve.mitre.org
11. Seamless developer experience
• improved developer productivity with seamlessly integrated developer tools, including repositories, CI/CD and code quality assistance
• always up to date with the latest tool versions
12. Realtime traffic information
• built-in dashboards to monitor data traffic metrics such as latency, CPU load, storage and memory usage (see the sketch below)
• adjustable alerts keep you up to date on important metrics and help to prevent downtime
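Since Prometheus appears in the cluster diagrams later in the deck, the traffic metrics behind these dashboards can also be pulled programmatically from the Prometheus HTTP API; the sketch below assumes a reachable Prometheus instance, and both the URL and the metric name are placeholders.

  import requests

  # Prometheus address and metric name are illustrative assumptions.
  PROMETHEUS = "http://prometheus.example.internal:9090"
  query = "rate(kafka_server_brokertopicmetrics_bytesin_total[5m])"

  resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": query})
  resp.raise_for_status()

  # Print the per-topic bytes-in rate reported by each matching time series.
  for series in resp.json()["data"]["result"]:
      timestamp, value = series["value"]
      print(series["metric"].get("topic", "<all>"), value)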
18. AKS-GKE-Helm
• fast and customisable via the values.yaml
• official top-level CNCF project with a large community
• difficult to know exactly what the chart is changing/deploying without:
  • finding the chart source code
  • making sense of, or rendering, all the templates into k8s resource files, investigating the Docker files, etc. (see the sketch below)
• added complexity: manage and secure the Tiller component in the cluster (local tiller https://github.com/adamreese/helm-local or Helm 3.0 tillerless)
• another tool is still required for managing the cluster outside of Helm
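One way to see what a chart would actually deploy is to render its templates locally with `helm template` and inspect the resulting manifests; the sketch below shells out to Helm (Helm 3 invocation syntax) from Python and lists the rendered resource kinds, with the release name and chart path as placeholders.

  import subprocess
  import yaml  # PyYAML

  # Render the chart locally instead of installing it; release name and
  # chart path are placeholders.
  rendered = subprocess.run(
      ["helm", "template", "my-release", "./cp-helm-charts",
       "--values", "values.yaml"],
      capture_output=True, text=True, check=True,
  ).stdout

  # Walk the rendered manifests and summarise what would be deployed.
  for doc in yaml.safe_load_all(rendered):
      if doc:
          print(doc["kind"], doc["metadata"]["name"])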
19. Apps cluster
• diagram: a Kubernetes cluster with a namespace per service, containing Control Center, Schema registry, Rest Proxy (and its service), an OAuth2 proxy, producers (Kafka connect), consumers (Kafka connect), Confluent replicator, KSQL microservices, config and secrets, plus Prometheus, an Ingress controller and a Let’s encrypt controller
21. Kafka cluster
• diagram: a Kubernetes cluster (namespace cds) containing Zookeeper (StatefulSet, 3 or 5 instances) with its Zookeeper service, Kafka (StatefulSet, 3 - n instances) with its Kafka service, coreDNS, Prometheus, an Ingress controller, a Let’s encrypt controller, the Confluent auto data rebalancer (Kubernetes cron job) and a CA (certificates for brokers and clients)
23. Terraform
• cloud agnostic
• shareable modules (Kafka cluster in a box)
• template all the cloud and supporting infrastructure (clusters, NSGs, DNS, networking, etc.)
25. Replicator
• data is copied from specified topics on Kafka 1 to Kafka 2 via the Replicator (one or more instances)
• the topic configuration (number of partitions, replication factor) is preserved from source to destination
• there must be at least as many brokers in the destination cluster as the maximum replication factor used
• serialization and deserialization is done via Schema Registry, located on Kafka 2 (see the sketch below)
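A rough sketch of the consuming side, assuming Avro-encoded records and the Schema Registry running alongside Kafka 2, using the legacy AvroConsumer from the confluent-kafka Python client; the addresses, group id and topic name are placeholders.

  from confluent_kafka.avro import AvroConsumer

  # All addresses and names below are placeholders for illustration.
  consumer = AvroConsumer({
      "bootstrap.servers": "kafka2.example.internal:9092",  # destination cluster (Kafka 2)
      "schema.registry.url": "http://schema-registry.example.internal:8081",
      "group.id": "replicated-topic-reader",
      "auto.offset.reset": "earliest",
  })
  consumer.subscribe(["replicated-topic"])

  try:
      while True:
          msg = consumer.poll(1.0)
          if msg is None or msg.error():
              continue
          # Values are deserialized against the schema fetched from Schema Registry.
          print(msg.topic(), msg.value())
  finally:
      consumer.close()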