Making your Kubernetes-based log collection reliable & durable with Vector
MAKSIM NABOKIKH
Platform Lead
DISCLAIMER
During this talk preparation,
no Kubernetes clusters were hurt
Just kidding, in reality,
there were ple-e-e-enty of outages
ABOUT
PALARK
We offer all-in-one DevOps-as-a-Service and pick
the best Open Source projects to fulfill our client goals
16 70
Years in Linux,
DevOps & Kubernetes
Managed
Kubernetes clusters
15 90
Awesome
engineers
Tech posts at
blog.palark.com
PLAN
1. LOGS IN KUBERNETES: Let’s recall what to collect in Kubernetes
2. WHAT IS VECTOR: And in which way it is applicable
3. PRACTICAL USE: Exciting operating (Ops) experience cases
LOGS
IN KUBERNETES
LOGS IN KUBERNETES: POD LOGS
The log file path consists of the pod name, container name, and UID
Format and location of files depend on the CRI settings
Max size and rotation depend on the kubelet settings
kubernetes.io/docs/concepts/cluster-administration/logging/
[Diagram: stdout and stderr of pod-1 and pod-2, kubelet, /var/log/pods]
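The path layout described on this slide can be sketched in Python. This is a minimal sketch, assuming the layout used by recent Kubernetes versions, /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/<restart-count>.log; the helper name and the example pod name are hypothetical, and rotated files with other suffixes are deliberately rejected.

```python
from typing import NamedTuple


class PodLogPath(NamedTuple):
    namespace: str
    pod_name: str
    pod_uid: str
    container: str
    restart_count: int


def parse_pod_log_path(path: str) -> PodLogPath:
    """Split a pod log path into its parts (hypothetical helper, assumed layout)."""
    prefix = "/var/log/pods/"
    if not path.startswith(prefix):
        raise ValueError(f"not a pod log path: {path}")
    parts = path[len(prefix):].split("/")
    if len(parts) != 3 or not parts[2].endswith(".log"):
        raise ValueError(f"unexpected layout: {path}")
    pod_dir, container, filename = parts
    # Namespaces and pod names cannot contain underscores, so "_" is a safe separator.
    namespace, pod_name, pod_uid = pod_dir.split("_")
    return PodLogPath(namespace, pod_name, pod_uid, container, int(filename[:-4]))


example = parse_pod_log_path(
    "/var/log/pods/default_nginx-0_50fb55c5-d97e-4851-85c6-187465154db6/nginx/1.log"
)
print(example)
```

A log collector needs exactly this mapping to attach namespace, pod, and container metadata to each line it reads from disk.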
LOGS IN KUBERNETES: NODE SERVICES
Files in the /var/log directory (probably)
Max size and rotation configured by journald
Format can be anything…
kubernetes.io/docs/concepts/cluster-administration/logging/
Examples: containerd, kubelet, audit logs, syslog
LOGS IN KUBERNETES: EVENTS
Can only be collected from the Kubernetes API
Can be collected as either logs, metrics, or traces
kubernetes.io/docs/concepts/cluster-administration/logging/
apiVersion: v1
kind: Event
count: 1
metadata:
  name: standard-worker-1.178264e1185b006f
  namespace: default
reason: RegisteredNode
firstTimestamp: '2023-09-06T19:08:47Z'
lastTimestamp: '2023-09-06T19:08:47Z'
involvedObject:
  apiVersion: v1
  kind: Node
  name: standard-worker-1
  uid: 50fb55c5-d97e-4851-85c6-187465154db6
message: 'Registered Node standard-worker-1 in Controller'
LOGS IN KUBERNETES
kubernetes.io/docs/concepts/cluster-administration/logging/
What can we collect?    Source
Pod logs                Files
Node services logs      Files
Events                  Kubernetes API
WHAT IS
VECTOR
WHAT IS VECTOR
A lightweight, ultra-fast tool
for building observability pipelines
vector.dev
WHAT IS VECTOR
Vendor agnostic
You do not need to rewrite Vector in Rust
Performance by design and continuous benchmarking
Flexible building block
vector.dev
An open source, efficient tool
for building log collecting pipelines
VECTOR’S ARCHITECTURE
Collect: File, K8s, Socket, … (40 in total)
Transform: Remap, Filter, Aggregate, … (9 in total)
Send: … (52 in total)
Vector Remap Language (VRL)
VECTOR REMAP LANGUAGE
[transforms.filter_severity]
type = "filter"
inputs = ["logs"]
condition = '.severity != "info"'

[transforms.sanitize_kubernetes_labels]
type = "remap"
inputs = ["logs"]
source = '''
  if exists(.pod_labels."controller-revision-hash") {
    del(.pod_labels."controller-revision-hash")
  }
  if exists(.pod_labels."pod-template-hash") {
    del(.pod_labels."pod-template-hash")
  }
'''

[transforms.backslash_multiline]
type = "reduce"
inputs = ["logs"]
group_by = ["file", "stream"]
merge_strategies."message" = "concat_newline"
ends_when = '''
  matched, err = match(.message, r'[^\\]$');
  if err != null {
    false;
  } else {
    matched;
  }
'''
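What the backslash_multiline transform is meant to do can be illustrated outside Vector. The following is a rough Python sketch, not Vector code: it assumes the regex is meant to match a message whose last character is not a backslash, it ignores the group_by keys (a single file and stream is assumed), and the function name is made up for illustration.

```python
import re
from typing import Iterable, Iterator, List

# A record ends on a line whose last character is not a backslash.
ENDS_RECORD = re.compile(r"[^\\]$")


def merge_backslash_lines(lines: Iterable[str]) -> Iterator[str]:
    """Join lines that end with a backslash with the lines that follow them."""
    pending: List[str] = []
    for line in lines:
        pending.append(line)
        if ENDS_RECORD.search(line):
            # Mirrors the concat_newline merge strategy: parts are joined with "\n".
            yield "\n".join(pending)
            pending = []
    if pending:
        # Flush an unfinished record at the end of input.
        yield "\n".join(pending)


merged = list(merge_backslash_lines(["step 1 \\", "step 2 \\", "done", "single"]))
print(merged)
```

As in the transform, the trailing backslashes stay in the merged message; only the event boundaries change.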
LOG COLLECTING TOPOLOGIES
[Diagram: three topologies]
Distributed: log-shippers -> storage
Centralized: log-shippers -> aggregator -> storage
Stream: log-shippers -> queue -> storage
VECTOR IN KUBERNETES
github.com/deckhouse/deckhouse/blob/main/modules/460-log-shipper/templates/daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
volumes:
- name: var-log
  hostPath:
    path: /var/log/
- name: vector-data-dir
  hostPath:
    path: /mnt/vector-data
- name: localtime
  hostPath:
    path: /etc/localtime
volumeMounts:
- name: var-log
  mountPath: /var/log/
  readOnly: true
terminationGracePeriodSeconds: 120
shareProcessNamespace: true

[Diagram: log-shipper pod with three containers (Vector, Reloader, Kube RBAC proxy); mounts /var/log, /vector-data, /etc/vector; node file system]
Vector: collects logs
Reloader: validates config and reloads
Kube RBAC proxy: protects metrics
PRACTICAL
USE
CASE #1: NO SPACE LEFT ON THE DEVICE
$ lsof -nP | grep '(deleted)'
vector 6331 root 25r REG 253,3 10602 72738831 /var/log/.../1.log (deleted)
vector 6331 root 44r REG 253,3 10239 33665268 /var/log/.../1.log (deleted)
vector 6331 6628 vector-wo root 25r REG 253,3 10602 72738831 /var/log/.../1.log (deleted)
vector 6331 6628 vector-wo root 44r REG 253,3 10239 33665268 /var/log/.../1.log (deleted)
vector 6331 6629 vector-wo root 25r REG 253,3 10602 72738831 /var/log/.../1.log (deleted)
[Diagram: Vector reads /var/log/pods/{uid}/1.log and sends to Loki. The file grows from 10Mb to 50Mb and the kubelet rotates it. Loki answers 429, so Vector keeps the rotated files open: several copies of "/var/log/pods/{uid}/1.log (DELETED) 50Mb" pile up next to the current 10Mb file.]
HOW TO SOLVE?
1. Tune buffer settings
   Blocking (default) vs. Drop Newest
   In Memory (default) vs. Disk buffer
   Max events: 1000 (default) vs. 10000
2. Rule of thumb
   Get logs off the node as quickly as possible
3. If you are brave enough
   sysctl -w fs.file-max=1000 (unsafe)
vector.dev/docs/about/under-the-hood/architecture/buffering-model/
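The lsof output in this case lists files that are marked (deleted) yet still held open by Vector. A minimal Python sketch of that mechanism, assuming a POSIX system: a file that is unlinked while a descriptor is open keeps its data, and its disk space, until the descriptor is closed. The function name is made up for illustration and nothing here is Vector code.

```python
import os
import tempfile


def deleted_but_open_demo(size: int = 1024) -> dict:
    """Unlink a file while holding it open and report what is still visible."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)  # the directory entry is gone, as after log rotation cleanup
        stat = os.fstat(fd)  # the inode is still alive because fd references it
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, size)
        return {
            "path_exists": os.path.exists(path),
            "links": stat.st_nlink,
            "bytes_still_readable": len(data),
        }
    finally:
        # Only closing the descriptor lets the kernel free the blocks.
        os.close(fd)


print(deleted_but_open_demo())
```

This is why a collector that stops reading, for example while the storage keeps answering 429, can fill the disk with files that no directory listing shows anymore.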
CASE #2: PROMETHEUS EXPLODED
[Diagram: pods come and go (uid=a through uid=f); Vector keeps exporting one series per log file to Prometheus]
vector_checkpoints_total{file="/var/log/pods/a/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/b/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/c/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/d/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/e/{container}/1.log"}
vector_checkpoints_total{file="/var/log/pods/f/{container}/1.log"}

metric_relabel_configs:
  - regex: 'file'
    action: labeldrop
HOW TO SOLVE? expire_metrics_secs=60
[Diagram: with expire_metrics_secs=60, the series for uid a and b, and later c and d, expire; only e and f remain]
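[Chart: vector_component_errors_total over time. 3 errors bring the counter to 3, 4 more errors bring it to 7, then expiration is triggered and the series is empty; after 3 new errors the counter shows 3 again.]
This behavior makes the result of the rate PromQL function equal to zero.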
Patch for Vector
to remove the file label
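The cardinality problem in this case, and what expiration changes, can be sketched as a toy simulation. This is a simplified model, not Vector's implementation: it assumes a series is dropped once it has not been updated for expire_secs seconds, and the function name and timestamps are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple


def exported_series(
    updates: List[Tuple[float, str]],
    now: float,
    expire_secs: Optional[float],
) -> List[str]:
    """Return the file labels still exported at time `now` (toy model)."""
    last_seen: Dict[str, float] = {}
    for ts, file_label in updates:
        if ts <= now:
            last_seen[file_label] = max(ts, last_seen.get(file_label, ts))
    if expire_secs is None:
        # Without expiration every label ever seen stays exported forever.
        return sorted(last_seen)
    return sorted(f for f, ts in last_seen.items() if now - ts < expire_secs)


# Pods a and b are replaced by c and d, and those by e and f.
updates = [
    (0, "/var/log/pods/a/{container}/1.log"),
    (0, "/var/log/pods/b/{container}/1.log"),
    (100, "/var/log/pods/c/{container}/1.log"),
    (100, "/var/log/pods/d/{container}/1.log"),
    (200, "/var/log/pods/e/{container}/1.log"),
    (200, "/var/log/pods/f/{container}/1.log"),
]
print(len(exported_series(updates, now=210, expire_secs=None)))
print(len(exported_series(updates, now=210, expire_secs=60)))
```

The model also shows the trade-off from the chart: an expired series starts again from scratch when it reappears, which is what confuses rate calculations.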
CASE #3: KUBERNETES CONTROL PLANE OUTAGE
[Charts: control-plane node memory consumption; etcd memory consumption]
[Diagram: each Vector instance sends LIST /api/v1/pods?fieldSelector=spec.nodeName=$NODE_NAME to Kubernetes and gets back 110 pods. etcd keys have the form /registry/<kind>/<namespace>/<name>, so ALL pods are read to answer each request, and RAM usage rises (RAM↑, RAM↑).]
HOW TO SOLVE?
1. Cache read (resourceVersion=0)
LIST /api/v1/pods?fieldSelector=spec.nodeName=$NODE_NAME&resourceVersion=0
use_apiserver_cache=true
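The difference between the two LIST requests is a single query parameter. A small Python sketch of how such a request path could be built (the function name is hypothetical; this is not how Vector constructs its requests). With resourceVersion=0 the API server is allowed to answer from its own cache instead of reading from etcd.

```python
from urllib.parse import urlencode


def pod_list_path(node_name: str, use_apiserver_cache: bool) -> str:
    """Build the LIST path for the pods scheduled on one node."""
    params = {"fieldSelector": f"spec.nodeName={node_name}"}
    if use_apiserver_cache:
        # resourceVersion=0 accepts any cached version of the data.
        params["resourceVersion"] = "0"
    # Keep "=" unescaped so the selector stays readable.
    return "/api/v1/pods?" + urlencode(params, safe="=")


print(pod_list_path("standard-worker-1", use_apiserver_cache=False))
print(pod_list_path("standard-worker-1", use_apiserver_cache=True))
```

The price of the cached read is that the answer may be slightly stale, which is usually acceptable for enriching logs with pod metadata.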
2. Limit concurrent requests (Priority and Fairness API)
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: PriorityLevelConfiguration
metadata:
  name: limit-list-custom
spec:
  type: Limited
  limited:
    assuredConcurrencyShares: 5
    limitResponse:
      queuing:
        handSize: 4
        queueLengthLimit: 50
        queues: 16
      type: Queue
---
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: FlowSchema
metadata:
  name: limit-list-custom
spec:
  priorityLevelConfiguration:
    name: limit-list-custom
  distinguisherMethod:
    type: ByUser
  rules:
  - resourceRules:
    - apiGroups: [""]
      clusterScope: true
      namespaces: ["*"]
      resources: ["pods"]
      verbs: ["list", "get"]
    subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: ***
        namespace: ***
3. Use kubelet API instead of Kubernetes
Pods metadata can be fetched by requesting the /pods endpoint
CONCLUSION
1. Great to build platforms
2. Vector is awesome, seriously, deploy it today
3. Share practical cases and learn together
THANK YOU!
Q&A
MAKSIM NABOKIKH, Platform Lead
@nabokihms
maksim.nabokikh@palark.com
OPEN SOURCE TOOLS: github.com/werf, github.com/palark
OUR BLOGS AND SOCIAL MEDIA: twitter.com/palark_com
CONTACT US: palark.com

OSMC 2023 | Making your Kubernetes-based log collection reliable & durable with Vector by Maksim Nabokikh