A high-level view of container monitoring and its challenges in data sourcing (short-lived connections) and visualization (hairballs), all in the context of Kubernetes, Prometheus, and Weave Cloud.
3. Rogue waves present considerable danger for several reasons:
• unpredictable
• may appear suddenly or without warning
• and can impact with tremendous force.
4. Performance Methodologies
• For system engineers
- ways to analyse unfamiliar systems
• For app developers
- guidance for metric and dashboard design
- Brendan Gregg’s Systems Methodology
5. Traffic Light Anti-Method
1. Turn all metrics into traffic lights
2. Everything green? No worries, mate.
- Brendan Gregg’s Systems Methodology
🚦
12. Stop sampling, start listening
eBPF
• user-defined sandboxed kernel programs
• live instrumentation on a vanilla kernel
• listen to connection events the same way conntrack does, but with the PID(!)
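
To illustrate why listening beats sampling for short-lived connections, here is a minimal simulation in plain Python. It is not eBPF code: the `Connection` record, the timings, and both helper functions are hypothetical and only model the idea that a periodic poll can miss a connection that opens and closes between two polls, while an event listener sees every open event together with its PID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical connection record; times are in milliseconds."""
    pid: int
    opened_ms: int
    closed_ms: int


def seen_by_sampling(conns, period_ms, horizon_ms):
    """Return the connections that are open at one or more polling instants."""
    seen = set()
    for t in range(0, horizon_ms + 1, period_ms):
        for c in conns:
            if c.opened_ms <= t < c.closed_ms:
                seen.add(c)
    return seen


def seen_by_listening(conns):
    """Return every connection: each open event reaches the listener."""
    return set(conns)


conns = [
    Connection(pid=101, opened_ms=0, closed_ms=5000),     # long-lived
    Connection(pid=202, opened_ms=1200, closed_ms=1250),  # falls between two polls
    Connection(pid=303, opened_ms=2990, closed_ms=3010),  # happens to span a poll
]

sampled = seen_by_sampling(conns, period_ms=1000, horizon_ms=5000)
listened = seen_by_listening(conns)

# PIDs whose connections the poller never observed.
missed_pids = sorted(c.pid for c in listened - sampled)
print(missed_pids)
```

With a one-second poll, the 50 ms connection from PID 202 is never observed, whereas the event-driven view contains all three connections.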
19. USE vs RED
USE: For every resource, check:
• Utilisation
• Saturation
• Errors
RED: For every service, check:
• Request Rate
• Error rate
• Duration (latency distribution)
* http://www.brendangregg.com/usemethod.html
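
As a rough sketch of what the RED checks look like for one service, the snippet below computes request rate, error rate, and two latency percentiles from a list of request records. The record layout `(duration_s, is_error)`, the function name `red_summary`, and the nearest-rank percentile method are assumptions made for illustration and do not come from the slides.

```python
import math


def red_summary(requests, window_s):
    """Summarise RED metrics for one service over a window of window_s seconds.

    requests is a list of (duration_s, is_error) tuples (hypothetical layout).
    """
    if not requests:
        return {"rate": 0.0, "error_rate": 0.0, "p50": None, "p99": None}

    durations = sorted(d for d, _ in requests)
    errors = sum(1 for _, is_error in requests if is_error)

    def percentile(q):
        # Nearest-rank percentile: smallest value with at least q of the data at or below it.
        rank = max(1, math.ceil(q * len(durations)))
        return durations[rank - 1]

    return {
        "rate": len(requests) / window_s,   # Request rate (requests per second)
        "error_rate": errors / window_s,    # Error rate (errors per second)
        "p50": percentile(0.50),            # Duration: median latency
        "p99": percentile(0.99),            # Duration: tail latency
    }


# Ten requests observed in a 5-second window, two of them failed.
sample = [(0.1, False)] * 8 + [(0.5, True), (2.0, True)]
summary = red_summary(sample, window_s=5)
print(summary)
```

Reporting percentiles rather than a mean keeps the tail visible, which is the point of "latency distribution" in the Duration check.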
20. New data sources
• plugins can add metadata and metrics
• eBPF (ongoing)
• custom instrumentation
22. Prometheus & K8s
• Kubernetes is already instrumented for Prometheus
• Application-level metrics from instrumentation
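
To make "application-level metrics from instrumentation" concrete, here is a minimal sketch of a hand-rolled counter that renders itself in the Prometheus text exposition format. The `Counter` class and the metric name `http_requests_total` are hypothetical and written with the standard library only. A real application would more likely use an official Prometheus client library, and this sketch does not escape special characters in label values.

```python
class Counter:
    """Hypothetical labelled counter that renders Prometheus text exposition lines."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}

    def inc(self, amount=1, **labels):
        # Sort label pairs so the same label set always maps to the same series.
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0) + amount

    def render(self):
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


requests_total = Counter("http_requests_total", "Total HTTP requests handled.")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="POST", code="500")

# This text is what a /metrics endpoint would return for Prometheus to scrape.
exposition = requests_total.render()
print(exposition)
```

Labelling the counter by method and status code lets one metric feed both the request-rate and error-rate views.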