SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Downloaden Sie, um offline zu lesen
Prometheus and Docker
Monitoring and Management
Brian Brazil
Founder
Who am I?
Engineer passionate about running software reliably in production.
● TCD CS Degree
● Google SRE for 7 years, working on high-scale reliable systems such as
Adwords, Adsense, Ad Exchange, Billing, Database
● Boxever TL Systems&Infrastructure, applied processes and technology to let
allow company to scale and reduce operational load
● Contributor to many open source projects, including Prometheus, Ansible,
Python, Aurora and Zookeeper.
● Founder of Robust Perception, making scalability and efficiency available to
everyone
Why monitor?
● Know when things go wrong
○ To call in a human to prevent a business-level issue, or prevent an issue in advance
● Be able to debug and gain insight
● Trending to see changes over time, and drive technical/business decisions
● To feed into other systems/processes (e.g. QA, security, automation)
Common Monitoring Challenges
Themes common among companies I’ve talk to:
● Monitoring tools are limited, both technically and conceptually
● Tools don’t scale well and are unwieldy to manage
● Operational practices don’t align with the business
For example:
Your customers care about increased latency and it’s in your SLAs. You can only
alert on individual machine CPU usage.
Result: Engineers continuously woken up for non-issues, get fatigued
Fundamental Challenge is Limited Visibility
Prometheus
Inspired by Google’s Borgmon monitoring system.
Started in 2012 by ex-Googlers working in Soundcloud as an open source project,
mainly written in Go. Publically launched in early 2015, and continues to be
independent of any one company.
Over 100 companies have started relying on it since then.
What does Prometheus offer?
● Inclusive Monitoring
● Manageable and Reliable
● Easy to integrate with
● Dashboards
● Powerful data model
● Powerful query language
● Efficient
● Scalable
Services have Internals
Monitor the Internals
Monitor as a Service, not as Machines
Inclusive Monitoring
Don’t monitor just at the edges:
● Instrument client libraries
● Instrument server libraries (e.g. HTTP/RPC)
● Instrument business logic
Library authors get information about usage.
Application developers get monitoring of common components for free.
Dashboards and alerting can be provided out of the box, customised for your
organisation!
Python Instrumentation Example
pip install prometheus_client
from prometheus_client import Summary, start_http_server
REQUEST_DURATION = Summary('request_duration_seconds',
'Request duration in seconds')
@REQUEST_DURATION.time()
def my_handler(request):
pass # Your code here
start_http_server(8000)
Manageable and Reliable
Core Prometheus server is a single binary.
Doesn’t depend on Zookeeper, Consul, Cassandra, Hadoop or the Internet.
Only requires local disk (SSD recommended). No potential for cascading failure.
Pull based, so easy to on run a workstation for testing and rogue servers can’t
push bad metrics.
Advanced service discovery finds what to monitor.
Running Prometheus with Docker
Docker images at https://hub.docker.com/r/prom/
To run Prometheus:
docker run -p 9090:9090 prom/prometheus:latest
This monitors itself.
http://localhost:9090/consoles/prometheus.html has a console.
Easy to integrate with
Many existing integrations: Java, JMX, Python, Go, Ruby, .Net, Machine,
Cloudwatch, EC2, MySQL, PostgreSQL, Haskell, Bash, Node.js, SNMP, Consul,
HAProxy, Mesos, Bind, CouchDB, Django, Mtail, Heka, Memcached, RabbitMQ,
Redis, RethinkDB, Rsyslog, Meteor.js, Minecraft...
Graphite, Statsd, Collectd, Scollector, Munin, Nagios integrations aid transition.
It’s so easy, most of the above were written without the core team even knowing
about them!
Cadvisor natively supports Prometheus
Run Cadvisor:
docker run -v /var/run/:/var/run -v /sys:/sys -p 8080:8080
google/cadvisor
http://localhost:8080/metrics will have metrics for all your containers
Service Discovery
It’s vital to know where each of your instances is meant to be running.
Systems such as Docker Swarm spread applications across machines, so we
need some form of service discovery to find them.
Both Swarm and Prometheus support Consul, so let’s try that.
Setting up Swarm on localhost (1)
MY_IP=1.2.3.4
# - Run consul
docker run -d --name=consul --net=host gliderlabs/consul-server 
-bootstrap
# - Pass the docker daemon -H tcp://0.0.0.0:2375 so it’ll do TCP
# - Register services in Consul
docker run -d --net=host -v /var/run/docker.sock:/tmp/docker.sock 
gliderlabs/registrator consul://localhost:8500
# - Join the swarm
docker run -d swarm join --addr=$MY_IP:2375 
consul://$MY_IP:8500/swarm
Setting up Swarm on localhost (2)
# - Run some machine monitoring
docker run -d -v /var/run/:/var/run -v /sys:/sys -p 8080:8080 
google/cadvisor
docker run -d -v /:/rootfs:ro -p 9100:9100 --net=host 
-e SERVICE_NAME=node prom/node-exporter
# - Start the Swarm Manager
docker run -d -p 2376:2375 swarm manage consul://$MY_IP:8500/swarm
# - Check it’s working
docker -H tcp://127.0.0.1:2376 info
Prometheus configuration to use Consul
scrape_configs:
- job_name: 'consul'
consul_sd_configs:
- server: 127.0.0.1:8500
relabel_configs:
- source_labels: [__meta_consul_service]
regex: (.*) # Default next release
replacement: $1 # Default next release
target_label: job
Run Prometheus
docker run -d --net=host -p 9090:9090 
-e SERVICE_NAME=prometheus 
robustperception/prometheus_local_consul
Visit http://localhost:9090/status to see it running
Automatic service discovery!
Dashboards
Many Dashboarding Options
● Grafana
○ Latest release has full Prometheus support
● Promdash
○ Prometheus-specific Ruby-on-Rails dashboarding solution
○ Direction is moving towards Grafana instead
● Console templates
○ Templating language inside Prometheus
○ Good for having your dashboards checked in, and for fully-custom dashboards
● Expression Browser
○ Included in Prometheus, good for ad-hoc debugging
● Roll your own
○ JSON API
Powerful Data Model
All metrics have arbitrary multi-dimensional labels.
No need to force your model into dotted strings.
Can aggregate, cut, and slice along them.
Supports any double value, labels support full unicode.
Example: Labels from Node Exporter
Powerful Query Language
Can multiply, add, aggregate, join, predict, take quantiles across many metrics in
the same query. Can evaluate right now, and graph back in time.
Answer questions like:
● What’s the 95th percentile latency in the European datacenter?
● How full will the disks be in 4 hours?
● Which services are the top 5 users of CPU?
Can alert based on any query.
Example: Top 5 Docker images by CPU
topk(5,
sum by (image)(
rate(container_cpu_usage_seconds_total{
id=~"/system.slice/docker.*"}[5m]
)
)
)
Efficient
Instrumenting everything means a lot of data.
Prometheus is best in class for lossless storage efficiency, 3.5 bytes per datapoint.
A single server can handle:
● millions of metrics
● hundreds of thousands of datapoints per second
Scalable
Prometheus is easy to run, can give one to each team in each datacenter.
Federation allows pulling key metrics from other Prometheus servers.
When one job is too big for a single Prometheus server, can use
sharding+federation to scale out. Needed with thousands of machines.
Final Thought: Instrument Once, Use Everywhere
When an application is instrumented, tends to be limited in which monitoring
systems it supports.
You’re left with a choice: Have N monitoring systems, or run extra services to act
as shims to translate.
Prometheus clients aren’t just limited to outputting metrics in the Prometheus
format, can also output to Graphite and other monitoring systems. Prometheus
clients can also take in metrics from other monitoring systems.
Prometheus clients can be your instrumentation and metrics interchange, even
when not using Prometheus itself!
What do we do?
Robust Perception provides consulting and training to give you confidence in your
production service's ability to run efficiently and scale with your business.
We can help you:
● Decide if Prometheus is for you
● Manage your transition to Prometheus and resolve issues that arise
● With capacity planning, debugging, infrastructure etc.
We are proud to be among the core contributors to the Prometheus project.
Resources
Official Project Website: prometheus.io
Official Mailing List: prometheus-developers@googlegroups.com
Demo: demo.robustperception.io
Dockerfiles: https://github.com/RobustPerception/docker_examples
Robust Perception Website: www.robustperception.io
Queries: prometheus@robustperception.io

Weitere ähnliche Inhalte

Was ist angesagt?

Monitoring_with_Prometheus_Grafana_Tutorial
Monitoring_with_Prometheus_Grafana_TutorialMonitoring_with_Prometheus_Grafana_Tutorial
Monitoring_with_Prometheus_Grafana_Tutorial
Tim Vaillancourt
 

Was ist angesagt? (20)

Monitoring with prometheus
Monitoring with prometheusMonitoring with prometheus
Monitoring with prometheus
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)
 
Infrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using PrometheusInfrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using Prometheus
 
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Prometheus - Intro, CNCF, TSDB,PromQL,Grafana
Prometheus - Intro, CNCF, TSDB,PromQL,GrafanaPrometheus - Intro, CNCF, TSDB,PromQL,Grafana
Prometheus - Intro, CNCF, TSDB,PromQL,Grafana
 
End to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenEnd to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max Inden
 
Cloud Monitoring with Prometheus
Cloud Monitoring with PrometheusCloud Monitoring with Prometheus
Cloud Monitoring with Prometheus
 
Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)
 
MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
Monitoring_with_Prometheus_Grafana_Tutorial
Monitoring_with_Prometheus_Grafana_TutorialMonitoring_with_Prometheus_Grafana_Tutorial
Monitoring_with_Prometheus_Grafana_Tutorial
 
Improved alerting with Prometheus and Alertmanager
Improved alerting with Prometheus and AlertmanagerImproved alerting with Prometheus and Alertmanager
Improved alerting with Prometheus and Alertmanager
 
Grafana optimization for Prometheus
Grafana optimization for PrometheusGrafana optimization for Prometheus
Grafana optimization for Prometheus
 
Prometheus and Grafana
Prometheus and GrafanaPrometheus and Grafana
Prometheus and Grafana
 
Prometheus 101
Prometheus 101Prometheus 101
Prometheus 101
 
How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?
 
Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)
Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)
Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 

Andere mochten auch

Andere mochten auch (20)

Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
 
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
 
Daniel Steigerwald - Este.js - konec velkého Schizma
Daniel Steigerwald - Este.js - konec velkého SchizmaDaniel Steigerwald - Este.js - konec velkého Schizma
Daniel Steigerwald - Este.js - konec velkého Schizma
 
Ansible at FOSDEM (Ansible Dublin, 2016)
Ansible at FOSDEM (Ansible Dublin, 2016)Ansible at FOSDEM (Ansible Dublin, 2016)
Ansible at FOSDEM (Ansible Dublin, 2016)
 
An Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQLAn Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQL
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum Japan
 
So You Want to Write an Exporter
So You Want to Write an ExporterSo You Want to Write an Exporter
So You Want to Write an Exporter
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)
 
Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
 
Open Source Monitoring for Java with JMX and Graphite (GeeCON 2013)
Open Source Monitoring for Java with JMX and Graphite (GeeCON 2013)Open Source Monitoring for Java with JMX and Graphite (GeeCON 2013)
Open Source Monitoring for Java with JMX and Graphite (GeeCON 2013)
 
Docker cluster with swarm, consul, registrator and consul-template
Docker cluster with swarm, consul, registrator and consul-templateDocker cluster with swarm, consul, registrator and consul-template
Docker cluster with swarm, consul, registrator and consul-template
 
Processing Kafka Topics for Monitoring with Prometheus
Processing Kafka Topics for Monitoring with PrometheusProcessing Kafka Topics for Monitoring with Prometheus
Processing Kafka Topics for Monitoring with Prometheus
 
Cassandra Day Atlanta 2016 - Monitoring Cassandra
Cassandra Day Atlanta 2016  - Monitoring CassandraCassandra Day Atlanta 2016  - Monitoring Cassandra
Cassandra Day Atlanta 2016 - Monitoring Cassandra
 
Monitoring in a scalable world
Monitoring in a scalable worldMonitoring in a scalable world
Monitoring in a scalable world
 
Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)
 
Native container monitoring
Native container monitoringNative container monitoring
Native container monitoring
 
Life of a Label (PromCon2016, Berlin)
Life of a Label (PromCon2016, Berlin)Life of a Label (PromCon2016, Berlin)
Life of a Label (PromCon2016, Berlin)
 
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
 

Ähnlich wie Prometheus and Docker (Docker Galway, November 2015)

Ähnlich wie Prometheus and Docker (Docker Galway, November 2015) (20)

Monitoring using Prometheus and Grafana
Monitoring using Prometheus and GrafanaMonitoring using Prometheus and Grafana
Monitoring using Prometheus and Grafana
 
System monitoring
System monitoringSystem monitoring
System monitoring
 
From nothing to Prometheus : one year after
From nothing to Prometheus : one year afterFrom nothing to Prometheus : one year after
From nothing to Prometheus : one year after
 
BP101 - 10 Things to Consider when Developing & Deploying Applications in Lar...
BP101 - 10 Things to Consider when Developing & Deploying Applications in Lar...BP101 - 10 Things to Consider when Developing & Deploying Applications in Lar...
BP101 - 10 Things to Consider when Developing & Deploying Applications in Lar...
 
Cloud Native Development
Cloud Native DevelopmentCloud Native Development
Cloud Native Development
 
DockerCon Europe 2018 Monitoring & Logging Workshop
DockerCon Europe 2018 Monitoring & Logging WorkshopDockerCon Europe 2018 Monitoring & Logging Workshop
DockerCon Europe 2018 Monitoring & Logging Workshop
 
Cloud Monitoring tool Grafana
Cloud Monitoring  tool Grafana Cloud Monitoring  tool Grafana
Cloud Monitoring tool Grafana
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
 
Introduction to PaaS and Heroku
Introduction to PaaS and HerokuIntroduction to PaaS and Heroku
Introduction to PaaS and Heroku
 
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
IRJET- Real Time Monitoring of Servers with Prometheus and Grafana for High A...
 
DevOps Braga #15: Agentless monitoring with icinga and prometheus
DevOps Braga #15: Agentless monitoring with icinga and prometheusDevOps Braga #15: Agentless monitoring with icinga and prometheus
DevOps Braga #15: Agentless monitoring with icinga and prometheus
 
PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at Yelp
 
Devops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftDevops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShift
 
Prometheus-Grafana-RahulSoni1584KnolX.pptx.pdf
Prometheus-Grafana-RahulSoni1584KnolX.pptx.pdfPrometheus-Grafana-RahulSoni1584KnolX.pptx.pdf
Prometheus-Grafana-RahulSoni1584KnolX.pptx.pdf
 
Copy of Silk performer - KT.pptx
Copy of Silk performer - KT.pptxCopy of Silk performer - KT.pptx
Copy of Silk performer - KT.pptx
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
 
Dev ops for big data cluster management tools
Dev ops for big data  cluster management toolsDev ops for big data  cluster management tools
Dev ops for big data cluster management tools
 
202107 - Orion introduction - COSCUP
202107 - Orion introduction - COSCUP202107 - Orion introduction - COSCUP
202107 - Orion introduction - COSCUP
 
Prometheus for Monitoring Metrics (Percona Live Europe 2017)
Prometheus for Monitoring Metrics (Percona Live Europe 2017)Prometheus for Monitoring Metrics (Percona Live Europe 2017)
Prometheus for Monitoring Metrics (Percona Live Europe 2017)
 

Mehr von Brian Brazil

Mehr von Brian Brazil (11)

OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)
 
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
 
Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)
 
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
 
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
Evolution of the Prometheus TSDB  (Percona Live Europe 2017)Evolution of the Prometheus TSDB  (Percona Live Europe 2017)
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
 
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
 
Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)
 
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)
 
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
 
Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)
 

Kürzlich hochgeladen

Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
soniya singh
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
ellan12
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
@Chandigarh #call #Girls 9053900678 @Call #Girls in @Punjab 9053900678
 
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Kürzlich hochgeladen (20)

WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
 
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort ServiceBusty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
 
Russian Call Girls in %(+971524965298 )# Call Girls in Dubai
Russian Call Girls in %(+971524965298  )#  Call Girls in DubaiRussian Call Girls in %(+971524965298  )#  Call Girls in Dubai
Russian Call Girls in %(+971524965298 )# Call Girls in Dubai
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
Trump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts SweatshirtTrump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts Sweatshirt
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
 
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
 
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
 

Prometheus and Docker (Docker Galway, November 2015)

  • 1. Prometheus and Docker Monitoring and Management Brian Brazil Founder
  • 2. Who am I? Engineer passionate about running software reliably in production. ● TCD CS Degree ● Google SRE for 7 years, working on high-scale reliable systems such as Adwords, Adsense, Ad Exchange, Billing, Database ● Boxever TL Systems&Infrastructure, applied processes and technology to let allow company to scale and reduce operational load ● Contributor to many open source projects, including Prometheus, Ansible, Python, Aurora and Zookeeper. ● Founder of Robust Perception, making scalability and efficiency available to everyone
  • 3. Why monitor? ● Know when things go wrong ○ To call in a human to prevent a business-level issue, or prevent an issue in advance ● Be able to debug and gain insight ● Trending to see changes over time, and drive technical/business decisions ● To feed into other systems/processes (e.g. QA, security, automation)
  • 4. Common Monitoring Challenges Themes common among companies I’ve talk to: ● Monitoring tools are limited, both technically and conceptually ● Tools don’t scale well and are unwieldy to manage ● Operational practices don’t align with the business For example: Your customers care about increased latency and it’s in your SLAs. You can only alert on individual machine CPU usage. Result: Engineers continuously woken up for non-issues, get fatigued
  • 5. Fundamental Challenge is Limited Visibility
  • 6. Prometheus Inspired by Google’s Borgmon monitoring system. Started in 2012 by ex-Googlers working in Soundcloud as an open source project, mainly written in Go. Publically launched in early 2015, and continues to be independent of any one company. Over 100 companies have started relying on it since then.
  • 7. What does Prometheus offer? ● Inclusive Monitoring ● Manageable and Reliable ● Easy to integrate with ● Dashboards ● Powerful data model ● Powerful query language ● Efficient ● Scalable
  • 10. Monitor as a Service, not as Machines
  • 11. Inclusive Monitoring Don’t monitor just at the edges: ● Instrument client libraries ● Instrument server libraries (e.g. HTTP/RPC) ● Instrument business logic Library authors get information about usage. Application developers get monitoring of common components for free. Dashboards and alerting can be provided out of the box, customised for your organisation!
  • 12. Python Instrumentation Example pip install prometheus_client from prometheus_client import Summary, start_http_server REQUEST_DURATION = Summary('request_duration_seconds', 'Request duration in seconds') @REQUEST_DURATION.time() def my_handler(request): pass # Your code here start_http_server(8000)
  • 13. Manageable and Reliable Core Prometheus server is a single binary. Doesn’t depend on Zookeeper, Consul, Cassandra, Hadoop or the Internet. Only requires local disk (SSD recommended). No potential for cascading failure. Pull based, so easy to on run a workstation for testing and rogue servers can’t push bad metrics. Advanced service discovery finds what to monitor.
  • 14. Running Prometheus with Docker Docker images at https://hub.docker.com/r/prom/ To run Prometheus: docker run -p 9090:9090 prom/prometheus:latest This monitors itself. http://localhost:9090/consoles/prometheus.html has a console.
  • 15. Easy to integrate with Many existing integrations: Java, JMX, Python, Go, Ruby, .Net, Machine, Cloudwatch, EC2, MySQL, PostgreSQL, Haskell, Bash, Node.js, SNMP, Consul, HAProxy, Mesos, Bind, CouchDB, Django, Mtail, Heka, Memcached, RabbitMQ, Redis, RethinkDB, Rsyslog, Meteor.js, Minecraft... Graphite, Statsd, Collectd, Scollector, Munin, Nagios integrations aid transition. It’s so easy, most of the above were written without the core team even knowing about them!
  • 16. Cadvisor natively supports Prometheus Run Cadvisor: docker run -v /var/run/:/var/run -v /sys:/sys -p 8080:8080 google/cadvisor http://localhost:8080/metrics will have metrics for all your containers
  • 17. Service Discovery It’s vital to know where each of your instances is meant to be running. Systems such as Docker Swarm spread applications across machines, so we need some form of service discovery to find them. Both Swarm and Prometheus support Consul, so let’s try that.
  • 18. Setting up Swarm on localhost (1) MY_IP=1.2.3.4 # - Run consul docker run -d --name=consul --net=host gliderlabs/consul-server -bootstrap # - Pass the docker daemon -H tcp://0.0.0.0:2375 so it’ll do TCP # - Register services in Consul docker run -d --net=host -v /var/run/docker.sock:/tmp/docker.sock gliderlabs/registrator consul://localhost:8500 # - Join the swarm docker run -d swarm join --addr=$MY_IP:2375 consul://$MY_IP:8500/swarm
  • 19. Setting up Swarm on localhost (2) # - Run some machine monitoring docker run -d -v /var/run/:/var/run -v /sys:/sys -p 8080:8080 google/cadvisor docker run -d -v /:/rootfs:ro -p 9100:9100 --net=host -e SERVICE_NAME=node prom/node-exporter # - Start the Swarm Manager docker run -d -p 2376:2375 swarm manage consul://$MY_IP:8500/swarm # - Check it’s working docker -H tcp://127.0.0.1:2376 info
  • 20. Prometheus configuration to use Consul scrape_configs: - job_name: 'consul' consul_sd_configs: - server: 127.0.0.1:8500 relabel_configs: - source_labels: [__meta_consul_service] regex: (.*) # Default next release replacement: $1 # Default next release target_label: job
  • 21. Run Prometheus docker run -d --net=host -p 9090:9090 -e SERVICE_NAME=prometheus robustperception/prometheus_local_consul Visit http://localhost:9090/status to see it running
  • 24. Many Dashboarding Options ● Grafana ○ Latest release has full Prometheus support ● Promdash ○ Prometheus-specific Ruby-on-Rails dashboarding solution ○ Direction is moving towards Grafana instead ● Console templates ○ Templating language inside Prometheus ○ Good for having your dashboards checked in, and for fully-custom dashboards ● Expression Browser ○ Included in Prometheus, good for ad-hoc debugging ● Roll your own ○ JSON API
  • 25. Powerful Data Model All metrics have arbitrary multi-dimensional labels. No need to force your model into dotted strings. Can aggregate, cut, and slice along them. Supports any double value, labels support full unicode.
  • 26. Example: Labels from Node Exporter
  • 27. Powerful Query Language Can multiply, add, aggregate, join, predict, take quantiles across many metrics in the same query. Can evaluate right now, and graph back in time. Answer questions like: ● What’s the 95th percentile latency in the European datacenter? ● How full will the disks be in 4 hours? ● Which services are the top 5 users of CPU? Can alert based on any query.
  • 28. Example: Top 5 Docker images by CPU topk(5, sum by (image)( rate(container_cpu_usage_seconds_total{ id=~"/system.slice/docker.*"}[5m] ) ) )
  • 29. Efficient Instrumenting everything means a lot of data. Prometheus is best in class for lossless storage efficiency, 3.5 bytes per datapoint. A single server can handle: ● millions of metrics ● hundreds of thousands of datapoints per second
  • 30. Scalable Prometheus is easy to run, can give one to each team in each datacenter. Federation allows pulling key metrics from other Prometheus servers. When one job is too big for a single Prometheus server, can use sharding+federation to scale out. Needed with thousands of machines.
  • 31. Final Thought: Instrument Once, Use Everywhere When an application is instrumented, tends to be limited in which monitoring systems it supports. You’re left with a choice: Have N monitoring systems, or run extra services to act as shims to translate. Prometheus clients aren’t just limited to outputting metrics in the Prometheus format, can also output to Graphite and other monitoring systems. Prometheus clients can also take in metrics from other monitoring systems. Prometheus clients can be your instrumentation and metrics interchange, even when not using Prometheus itself!
  • 32. What do we do? Robust Perception provides consulting and training to give you confidence in your production service's ability to run efficiently and scale with your business. We can help you: ● Decide if Prometheus is for you ● Manage your transition to Prometheus and resolve issues that arise ● With capacity planning, debugging, infrastructure etc. We are proud to be among the core contributors to the Prometheus project.
  • 33. Resources Official Project Website: prometheus.io Official Mailing List: prometheus-developers@googlegroups.com Demo: demo.robustperception.io Dockerfiles: https://github.com/RobustPerception/docker_examples Robust Perception Website: www.robustperception.io Queries: prometheus@robustperception.io