Auto-scaling is the flagship selling point of many Data Engineering technologies. Among today's tools, Kafka-Streams stands out. With its tight integration with the Apache Kafka message bus, it is designed as a distributed framework capable of scaling out. In practice, however, using it on its own is limiting, and orchestrating these applications is usually necessary.
In this talk, we cover containerization, orchestration and monitoring, the key elements that let us take full advantage of the scalability of Kafka-Streams applications, built around technologies such as Kubernetes and Stackdriver.
By Loic Divad, Data Engineer at Xebia
@Xebiconfr #Xebicon18 @LoicMDivad
Kafka-Streams and the consumer protocol
[Diagram: a single APP instance consuming topic-partition-0 through topic-partition-N]
● Every topic in Kafka is split into one or more
partitions
● All the streaming tasks are executed by one or
more threads of the same instance
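The task-to-thread mapping above can be sketched as follows; this is a simplified illustration (the task naming and the round-robin spread are assumptions, not the exact Kafka-Streams assignor):

```python
# Hypothetical sketch: one stream task per input partition, tasks spread
# across the threads of a single Kafka-Streams instance.
def assign_tasks_to_threads(num_partitions, num_threads):
    tasks = [f"task-0_{p}" for p in range(num_partitions)]
    threads = {t: [] for t in range(num_threads)}
    for i, task in enumerate(tasks):
        # spread tasks over the available threads of this instance
        threads[i % num_threads].append(task)
    return threads

print(assign_tasks_to_threads(4, 2))
```

With 4 partitions and 2 threads, each thread ends up running two tasks.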
Kafka-Streams and the consumer protocol
[Diagram: two APP instances sharing the partitions of the same topic]
● Consumers from the same consumer group
cooperate to consume data from topics.
● Each instance that joins the group triggers a
partition rebalance.
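The effect of a rebalance can be sketched with a toy assignment function; this round-robin view is a simplification (the real consumer protocol uses pluggable partition assignors):

```python
def rebalance(partitions, members):
    # Simplified round-robin view of a consumer-group rebalance:
    # each time the member list changes, partitions are redistributed.
    assignment = {m: [] for m in members}
    for i, p in enumerate(partitions):
        assignment[members[i % len(members)]].append(p)
    return assignment

parts = [f"topic-partition-{i}" for i in range(4)]
print(rebalance(parts, ["app-1"]))           # one instance owns every partition
print(rebalance(parts, ["app-1", "app-2"]))  # a joining member takes over half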
Kafka-Streams and the consumer protocol
[Diagram: four APP instances, one per partition]
● The maximum parallelism is determined by the
number of partitions of the input topic(s).
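A small helper makes the parallelism limit concrete: since a partition is consumed by at most one instance of the group, any instance beyond the partition count stays idle (helper name is illustrative):

```python
def idle_instances(num_partitions, num_instances):
    # A partition is consumed by at most one instance of a consumer group,
    # so instances beyond the partition count receive no work.
    return max(0, num_instances - num_partitions)

print(idle_instances(4, 6))  # with 4 partitions, 2 of 6 instances sit idle
```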
K8s: Horizontal Pod Autoscaler
- A Kubernetes resource
- Periodically adjusts the number of replicas
- Based on CPU usage in autoscaling/v1
- Memory and custom metrics are covered by
autoscaling/v2beta1
- Uses the metrics.k8s.io API through a metrics server
➔ Source: Kubernetes.io Documentation
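A minimal HPA manifest for such a deployment could look like this; the resource names (`kafka-streams-app`) are placeholders, and `maxReplicas` is capped at the input topic's partition count, per the parallelism limit above:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kafka-streams-app-hpa   # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafka-streams-app     # hypothetical deployment
  minReplicas: 1
  maxReplicas: 4                # bounded by the input topic partition count
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
```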
CONCLUSION
State migration, changelog compaction, topology
upgrades and the adoption of k8s StatefulSets are the
next challenges to ease auto-scaling
BUILD THE FUTURE
1. Kafka-Streams exposes relevant metrics
related to stream processing
2. Consumer lag is one of the key metrics to
monitor in real-time applications
3. The cloud-native trend brings a set of
powerful tools on which the Kafka
community keeps a close eye
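Consumer lag, the key metric of point 2, can be sketched as a per-partition difference (the function and offset names are illustrative, not a Kafka API):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    # Lag per partition: latest offset in the log minus the group's last
    # committed offset (a partition with no commit counts from offset 0).
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}

print(consumer_lag({"p0": 120, "p1": 45}, {"p0": 100}))
```

A growing lag signals that the instances cannot keep up with the input rate, the typical trigger for scaling out.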
HPA & thrashing: “Should I stay or should I go?”
The Horizontal Pod Autoscaler algorithm depends on the current metric value and the current replica count:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
➢ A ratio of two doubles the number of instances, within the limit of maxReplicas
➢ With targetAverageValue, the metric is computed as the average of the given metric across all Pods
Because the metrics are dynamic, the number of replicas may fluctuate frequently; this is called thrashing
➢ --horizontal-pod-autoscaler-downscale-delay (default 5m0s)
➢ --horizontal-pod-autoscaler-upscale-delay (default 3m0s)
Note: both Kafka-Streams topology modifications and the HPA make rolling updates impossible
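The formula above translates directly into code; the `maxReplicas` clamp is included to mirror the HPA behaviour described in the bullets (function name is illustrative):

```python
import math

def desired_replicas(current_replicas, current_metric, desired_metric, max_replicas):
    # desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)],
    # clamped to maxReplicas as the HPA does
    return min(max_replicas,
               math.ceil(current_replicas * current_metric / desired_metric))

print(desired_replicas(2, 200, 100, 10))  # a ratio of two doubles the replicas → 4
print(desired_replicas(3, 200, 100, 4))   # the doubling is capped by maxReplicas → 4
```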
Apache Kafka now supports more than 200K partitions per cluster
https://www.confluent.io/blog/apache-kafka-supports-200k-partitions-per-cluster
Use Case - King Of Fighters: combo sessionization
[Diagram: Streaming App pipeline — Correlate, Flatten, Decode, Group, Produce Back]
Key => {"ts": 1542609460412, "machine": "903071", "zone": "AU"}
Value => {"bytes": ["c3ff8ab19d00d9e5", "e3ff8c72b600d9e5"]}
[{"impact": 0, "key": "X", "direction": "DOWN", "type": "Missed", "level": "Pro", "game": "Neowave"}, ...]
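Two of the pipeline stages (Flatten and Group) can be sketched on records shaped like the samples above; the logic is invented for illustration and is not the talk's actual Kafka-Streams topology:

```python
from collections import defaultdict

# Sample input shaped like the Key/Value records above.
records = [
    {"key": {"ts": 1542609460412, "machine": "903071", "zone": "AU"},
     "value": {"bytes": ["c3ff8ab19d00d9e5", "e3ff8c72b600d9e5"]}},
]

def flatten(records):
    # emit one event per raw frame, keyed by the machine that produced it
    for record in records:
        for frame in record["value"]["bytes"]:
            yield record["key"]["machine"], frame

def group_by_machine(events):
    # collect all frames of the same machine into one session candidate
    sessions = defaultdict(list)
    for machine, frame in events:
        sessions[machine].append(frame)
    return dict(sessions)

print(group_by_machine(flatten(records)))
```

Decoding each hex frame into a combo event (impact, key, direction, ...) would then happen per frame before the correlation step.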