Making your Kubernetes-based log collection reliable & durable with Vector
MAKSIM NABOKIKH
Platform Lead
DISCLAIMER
During this talk preparation,
no Kubernetes clusters were hurt
Just kidding, in reality,
there were ple-e-e-enty of outages
ABOUT PALARK
We offer all-in-one DevOps-as-a-Service and pick the best Open Source projects to fulfill our clients' goals
16 70
Years in Linux,
DevOps & Kubernetes
Managed
Kubernetes clusters
15 90
Awesome
engineers
Tech posts at
blog.palark.com
PLAN
1. LOGS IN KUBERNETES: Let's recall what to collect in Kubernetes
2. WHAT IS VECTOR: And in which way it is applicable
3. PRACTICAL USE: Exciting operating (Ops) experience cases
LOGS IN KUBERNETES
LOGS IN KUBERNETES: POD LOGS
The log file path consists of the pod name, container name, and pod UID
The format and location of the files depend on the CRI settings
Max size and rotation depend on the kubelet settings
kubernetes.io/docs/concepts/cluster-administration/logging/
[Diagram: stdout and stderr of pod-1 and pod-2, kubelet, /var/log/pods]
LOGS IN KUBERNETES: NODE SERVICES
Files in the /var/log directory (probably)
Max size and rotation are configured by journald
Format can be anything…
kubernetes.io/docs/concepts/cluster-administration/logging/
Examples: containerd, kubelet, audit logs, syslog
LOGS IN KUBERNETES: EVENTS
Can only be collected from the Kubernetes API
Can be collected as either logs, metrics, or traces
kubernetes.io/docs/concepts/cluster-administration/logging/
apiVersion: v1
kind: Event
count: 1
metadata:
  name: standard-worker-1.178264e1185b006f
  namespace: default
reason: RegisteredNode
firstTimestamp: '2023-09-06T19:08:47Z'
lastTimestamp: '2023-09-06T19:08:47Z'
involvedObject:
  apiVersion: v1
  kind: Node
  name: standard-worker-1
  uid: 50fb55c5-d97e-4851-85c6-187465154db6
message: 'Registered Node standard-worker-1 in Controller'
LOGS IN KUBERNETES
kubernetes.io/docs/concepts/cluster-administration/logging/
What can we collect?    Source
Pod logs                Files
Node services logs      Files
Events                  Kubernetes API
WHAT IS VECTOR
A lightweight, ultra-fast tool for building observability pipelines
vector.dev
WHAT IS VECTOR
An open source, efficient tool for building log-collecting pipelines
Vendor agnostic
You do not need to rewrite Vector in Rust
Performance by design and continuous benchmarking
Flexible building block
vector.dev
VECTOR'S ARCHITECTURE
Collect -> Transform -> Send
Collect: File, K8s, Socket, … (40 in total)
Transform: Remap, Filter, Aggregate, … (9 in total)
Send: … (52 in total)
Remap uses the Vector Remap Language (VRL)
VECTOR REMAP LANGUAGE
# Keep only events whose severity is not "info"
[transforms.filter_severity]
type = "filter"
inputs = ["logs"]
condition = '.severity != "info"'

# Remove pod labels that change on every rollout
[transforms.sanitize_kubernetes_labels]
type = "remap"
inputs = ["logs"]
source = '''
  if exists(.pod_labels."controller-revision-hash") {
    del(.pod_labels."controller-revision-hash")
  }
  if exists(.pod_labels."pod-template-hash") {
    del(.pod_labels."pod-template-hash")
  }
'''

# Merge lines that end with a backslash into a single event;
# a group ends at the first line that does not end with a backslash
[transforms.backslash_multiline]
type = "reduce"
inputs = ["logs"]
group_by = ["file", "stream"]
merge_strategies."message" = "concat_newline"
ends_when = '''
  matched, err = match(.message, r'[^\\]$');
  if err != null {
    false;
  } else {
    matched;
  }
'''
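The transforms above all read from a component named logs, which the slides do not show. The following is a minimal sketch of a complete Collect, Transform, Send pipeline. It assumes a kubernetes_logs source named logs and a console sink named debug_out. Both are illustrative choices and are not taken from the talk.

```toml
# Collect: read pod logs on the node. The name "logs" is an assumption,
# chosen to match the `inputs` used by the transforms in the slides.
[sources.logs]
type = "kubernetes_logs"

# Transform: drop events whose severity is "info".
[transforms.filter_severity]
type = "filter"
inputs = ["logs"]
condition = '.severity != "info"'

# Send: print the remaining events as JSON.
# The console sink is only a stand-in for a real storage sink.
[sinks.debug_out]
type = "console"
inputs = ["filter_severity"]
encoding.codec = "json"
```

Each component refers to its upstream by name through inputs, so a transform can be added or removed without touching the rest of the pipeline.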
LOG COLLECTING TOPOLOGIES
Distributed: log-shippers send logs directly to storage
Centralized: log-shippers send logs to aggregators, and aggregators send them to storage
Stream: log-shippers send logs to a queue, and the queue feeds storage
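As an illustration of the Centralized topology, one Vector instance can forward events to another through the vector sink and the vector source. This is a sketch under stated assumptions: the address aggregator.example:6000, the port, and the component names are hypothetical, and the two halves belong in two separate configuration files.

```toml
# log-shipper side (runs on every node)
[sinks.to_aggregator]
type = "vector"
inputs = ["logs"]                    # assumed name of the local log source
address = "aggregator.example:6000"  # hypothetical aggregator address

# aggregator side (a separate Vector instance with its own config file)
[sources.from_shippers]
type = "vector"
address = "0.0.0.0:6000"             # listen address, assumed port
```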
VECTOR IN KUBERNETES
github.com/deckhouse/deckhouse/blob/main/modules/460-log-shipper/templates/daemonset.yaml
# Excerpts from the DaemonSet manifest
apiVersion: apps/v1
kind: DaemonSet
volumes:
- name: var-log
  hostPath:
    path: /var/log/
- name: vector-data-dir
  hostPath:
    path: /mnt/vector-data
- name: localtime
  hostPath:
    path: /etc/localtime
volumeMounts:
- name: var-log
  mountPath: /var/log/
  readOnly: true
terminationGracePeriodSeconds: 120
shareProcessNamespace: true
[Diagram: the log-shipper pod on a Node, with the paths /var/log, /vector-data, and /etc/vector, and three containers:]
Vector: collects logs
Reloader: validates config and reloads
Kube RBAC proxy: protects metrics
PRACTICAL USE
CASE #1: NO SPACE LEFT ON THE DEVICE
$ lsof -nP | grep '(deleted)'
vector 6331 root 25r REG 253,3 10602 72738831 /var/log/.../1.log (deleted)
vector 6331 root 44r REG 253,3 10239 33665268 /var/log/.../1.log (deleted)
vector 6331 6628 vector-wo root 25r REG 253,3 10602 72738831 /var/log/.../1.log (deleted)
vector 6331 6628 vector-wo root 44r REG 253,3 10239 33665268 /var/log/.../1.log (deleted)
vector 6331 6629 vector-wo root 25r REG 253,3 10602 72738831 /var/log/.../1.log (deleted)
CASE #1: NO SPACE LEFT ON THE DEVICE
[Diagram: Vector reads /var/log/pods/{uid}/1.log, which grows from 10 MB to 50 MB until kubelet rotates it. Loki answers Vector with 429, so Vector keeps the rotated files open, and 50 MB copies of /var/log/pods/{uid}/1.log (DELETED) pile up next to the new 10 MB file.]
CASE #1: NO SPACE LEFT ON THE DEVICE
HOW TO SOLVE?
1. Tune buffer settings
   Blocking (default)          or  Drop Newest
   In Memory (default)         or  Disk buffer
   Max events 1000 (default)   or  10000
2. Rule of thumb
   Get logs off the node as quickly as possible
3. If you are brave enough
   sysctl -w fs.file-max=1000 (unsafe)
vector.dev/docs/about/under-the-hood/architecture/buffering-model/
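The buffer settings from item 1 are configured per sink. The following is a sketch only: the sink name, the endpoint loki.example, the label, and the chosen values are assumptions for illustration, not the configuration used in the talk. A disk buffer also relies on Vector's data directory being writable and persistent.

```toml
# Illustrative sink; the endpoint and label are placeholders.
[sinks.loki]
type = "loki"
inputs = ["logs"]                    # assumed name of the log source
endpoint = "http://loki.example:3100"
encoding.codec = "json"
labels.source = "vector"

# Buffer tuning for this sink.
[sinks.loki.buffer]
type = "disk"              # the default is "memory"
max_size = 268435488       # bytes, roughly 256 MiB
when_full = "drop_newest"  # the default is "block"
```

With when_full set to drop_newest, a full buffer discards new events instead of applying backpressure to the source, which trades completeness for keeping the node healthy.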
CASE #2: PROMETHEUS EXPLODED
[Diagram: pods come and go (uid=a and b, then c and d, then e and f). Vector exposes a new series for every log file, and Prometheus keeps all of them:]
vector_checkpoints_total{file="/var/log/pods/a/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/b/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/c/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/d/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/e/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/f/{container}/1.log"}
metric_relabel_configs:
- regex: 'file'
  action: labeldrop
CASE #2: PROMETHEUS EXPLODED
HOW TO SOLVE? expire_metrics_secs=60
[Diagram: with expiration enabled, the series for pods a, b, c, and d disappear step by step; only the series of the current pods (uid=e and f) remain:]
vector_checkpoints_total{file="/var/log/pods/e/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/f/{container}/1.log"}
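A sketch of where expire_metrics_secs goes in a Vector configuration. It is a global (top-level) option, so it sits outside any component. The component names and the listen address below are assumptions for illustration.

```toml
# Global option: stop exposing metric series that have not been
# updated for 60 seconds.
expire_metrics_secs = 60

# Vector's own metrics, exposed for Prometheus to scrape.
[sources.internal]
type = "internal_metrics"

[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["internal"]
address = "0.0.0.0:9598"   # assumed listen address
```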
CASE #2: PROMETHEUS EXPLODED
HOW TO SOLVE? expire_metrics_secs=60
[Chart: vector_component_errors_total over time: 3 errors, then 4 more errors (7 in total), then expiration is triggered and the series is empty, then 3 errors again.]
This behavior makes the result of the rate PromQL function equal to zero.
CASE #2: PROMETHEUS EXPLODED
HOW TO SOLVE? expire_metrics_secs=60
Patch for Vector to remove the file label
CASE #3: KUBERNETES CONTROL PLANE OUTAGE
[Graphs: control-plane node memory consumption; etcd memory consumption]
[Diagram: every Vector instance sends the request below to Kubernetes and receives 110 pods. To answer it, ALL pods are read from etcd (/registry/<kind>/<namespace>/<name>), and RAM consumption rises (RAM↑ RAM↑).]
LIST /api/v1/pods?fieldSelector=spec.nodeName=$NODE_NAME
CASE #3: KUBERNETES CONTROL PLANE OUTAGE
HOW TO SOLVE?
1. Cache read (resourceVersion=0)
LIST /api/v1/pods?fieldSelector=spec.nodeName=$NODE_NAME&resourceVersion=0
use_apiserver_cache=true
2. Limit concurrent requests (Priority and Fairness API)
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: PriorityLevelConfiguration
metadata:
  name: limit-list-custom
spec:
  type: Limited
  limited:
    assuredConcurrencyShares: 5
    limitResponse:
      queuing:
        handSize: 4
        queueLengthLimit: 50
        queues: 16
      type: Queue
---
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: FlowSchema
metadata:
  name: limit-list-custom
spec:
  priorityLevelConfiguration:
    name: limit-list-custom
  distinguisherMethod:
    type: ByUser
  rules:
  - resourceRules:
    - apiGroups: [""]
      clusterScope: true
      namespaces: ["*"]
      resources: ["pods"]
      verbs: ["list", "get"]
    subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: ***
        namespace: ***
3. Use kubelet API instead of Kubernetes
Pods metadata can be fetched by requesting the /pods endpoint
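For the cache read option above, the use_apiserver_cache setting belongs to the kubernetes_logs source. A minimal sketch, assuming the source is named logs:

```toml
[sources.logs]
type = "kubernetes_logs"
# Ask the API server to answer list requests from its cache
# (resourceVersion=0) instead of reading the data from etcd.
use_apiserver_cache = true
```

Cached reads may return slightly stale pod metadata, which is usually an acceptable trade-off for a log collector.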
CONCLUSION
1. Great to build platforms
2. Vector is awesome, seriously, deploy it today
3. Share practical cases and learn together
THANK YOU!
Q&A
MAKSIM NABOKIKH, Platform Lead
@nabokihms
maksim.nabokikh@palark.com
OPEN SOURCE TOOLS: github.com/werf, github.com/palark
OUR BLOGS AND SOCIAL MEDIA: twitter.com/palark_com
CONTACT US: palark.com

OSMC 2023 | Making your Kubernetes-based log collection reliable & durable with Vector by Maksim Nabokikh