Full Stack Monitoring with Prometheus and Grafana

Date: 2018-02-10, Taiwan Data Engineering Association (TDEA) 2018 Q1 Technical Workshop
Talk title: Building Full Stack Monitor and Notification with Prometheus

As a hybrid cloud operator, are you tired of collecting metrics from several different monitoring services? As a developer, do you lack the historical application and infrastructure metrics you need to debug or tune application performance? In this talk, I'll first explain why we should build a full stack monitoring and notification platform with Prometheus and Grafana. I'll then share how, over the past quarter, I used them to monitor network devices, physical machines, virtual machines, Docker containers, middleware (e.g. Apache Cassandra, Apache Kafka, CNCF Fluentd), and application metrics, and to build a dashboard for an end-to-end data pipeline. Because I cannot show a real company environment, the demonstration uses Docker Compose examples, which also serve to introduce the different Prometheus exporters used for each kind of monitoring target.


  1. Build Full Stack Monitoring and Notification with Prometheus. Jazz Yao-Tsung Wang, Initiator of Taiwan Data Engineering Association, Co-Founder of Taiwan Hadoop User Group. Shared at 2018-02-10 <TDEA Workshop 2018 Q1>
  2. Hello! I am Jazz Wang, Co-Founder of Hadoop.TW and Initiator of Taiwan Data Engineering Association (TDEA). Hadoop Evangelist since 2008. Open Source Promoter. System Admin (Ops).
     - 11 years (2002/08 ~ 2014/02): Researcher in HPC field.
     - 2 years (2014/03 ~ 2016/04): Assistant Vice President (AVP), Product Management of 'Big Data Platform Management Product'.
     - 1.8 years (2016/04 ~ Now): Data Architect of Real-Time Bidding.
     You can find me at @jazzwang_tw or https://fb.com/groups/dataengineering.tw and https://slideshare.net/jazzwang
  3. Part 1: Why do I need Full Stack Monitoring and Notification? Let's start with Jazz's Jobs / Pains / Gains
  4. (Diagram) AWS, Azure, GCP, VM, Hybrid ….
  5. (Diagram) NetAdmin, Research, Developer, Security, Cloud Ops, SysAdmin, Data Engineer
  6. (Diagram) NetAdmin, Research, Developer, Security; Cacti, NewRelic Server, OpsCenter, Kafka Manager, NewRelic Synthetic / APM, Status Cake, DataDog
  7. Pain ▷ Data Fragments ▷ Data Retention ▷ 7 ▷ Black Box ▷ (Metrics) ▷ Metrics ▷ Vendor Lock-in
  8. Gain
     ▷ Centralized Time-Series Database
     ▷ Support Alert Notification: Slack, E-mail, SMS …
     ▷ Self-defined Data Retention Rate
     ▷ White Box: Metrics = (Metrics)
     ▷ Self-defined Dashboard, e.g. Data Pipeline
  9. Inspired by Outlyer … https://www.outlyer.com/
  10. Part 2: Introduction to Prometheus Ecosystem. Features / Pain Relievers / Gain Creators
  11. Concepts
  12. Common Building Blocks (diagram): Target, Exporter, Collector, Time-Series Database, Rule, Dashboard, Alert, Message, Annotation; data moves by Push or Pull
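
To make the Exporter and Pull blocks on slide 12 concrete, here is a minimal sketch of a pull-style exporter using only the Python standard library. It is not taken from the talk or from any official exporter: the metric names (`demo_up`, `demo_queue_depth`), the `collect` function, and port 8000 are all hypothetical. It assumes the collector scrapes `/metrics` over HTTP and expects the Prometheus text exposition format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def escape_label_value(value):
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(samples):
    """Render (name, labels, value) tuples as text exposition lines."""
    lines = []
    for name, labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(str(val))}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect():
    """Hypothetical target state; a real exporter would read it from the target."""
    return [("demo_up", {}, 1), ("demo_queue_depth", {"queue": "ingest"}, 3)]


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; the collector pulls it on its own schedule.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape job at port 8000 would complete the pull path. In practice the client libraries and exporters listed later in the deck handle this rendering for you.
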
  13. Ranking of Time Series DBMS: https://db-engines.com/en/ranking/time+series+dbms
  14. Comparison of Common Monitor and Notification System
     Column headings on the slide: Target / Exporter, DBMS, Dashboard, Alert
     ▷ snmpd | Pull | Cacti Device (snmpwalk) | RRDTool | Cacti Graph | Plugin*
     ▷ gmond | Pull | Ganglia gmetad | RRDTool | Ganglia | Nagios
     ▷ newrelic-agent | Push (?) | NewRelic | ?? | NewRelic | NewRelic Alert
     ▷ statsD | Push | Carbon / whisper | Graphite | Grafana | Grafana
     ▷ Telegraf | Push | Telegraf | InfluxDB | Grafana | Grafana
     ▷ snmp_exporter, node_exporter, jmx_exporter … | Pull, Push* | Prometheus | Grafana | AlertManager
  15. About Prometheus
     ▷ https://prometheus.io/
     ▷ Started at SoundCloud in November 2012
     ▷ Written in Go, Apache 2.0 license
     ▷ Joined the Cloud Native Computing Foundation in 2016, following Kubernetes (K8S)
     ▷ v1.0.0 / 2016-07-18, v2.0.0 / 2017-11-08
     ▷ PromQL
     ▷ Grafana
     ▷ AlertManager
     ▷ v2.0
  16. Components of Prometheus (diagram): Push, Pull, Query
  17. Comparison of Time-Series DBMS: Prometheus HA, Prometheus Data Model
  18. Client Libraries
     ▷ Official Prometheus client libraries: Go, Java or Scala, Python, Ruby
     ▷ Unofficial 3rd-party client libraries: Bash, C++, Common Lisp, Elixir, Erlang, Haskell, Lua for Nginx, Lua for Tarantool, .NET / C#, Node.js, PHP, Rust
  19. Part 3: Docker Compose Full Stack
  20. Show me the source code!!
     ○ https://github.com/jazzwang/prometheus-labs
     ○ Docker Compose
  21. Data Pipeline: in_dummy → Fluentd → out_kafka → Kafka → in_kafka_group → Fluentd → out_file
  22. Network Layer
     ▷ snmp_exporter
       ○ https://github.com/prometheus/snmp_exporter
       ○ Exposes SNMP data as Prometheus metrics
       ○ MIB, OID
       ○ The snmp_exporter generator produces snmp.yml
     ▷ blackbox_exporter
       ○ https://github.com/prometheus/blackbox_exporter
       ○ Probes over HTTP, HTTPS, DNS, TCP, and ICMP
       ○ Web Service, SSH, DNS, and Ping checks via blackbox_exporter
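
The black-box probing idea on slide 22 can be sketched in a few lines. This is an illustration only and not how blackbox_exporter is implemented: the `tcp_probe` helper is hypothetical, and it covers just a TCP connect check. The result keys `probe_success` and `probe_duration_seconds` mirror metric names that blackbox_exporter exposes.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Try one TCP connection and report success (1 or 0) plus elapsed seconds."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            success = 1
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        success = 0
    duration = time.monotonic() - start
    return {"probe_success": success, "probe_duration_seconds": duration}
```

For example, `tcp_probe("example.org", 22)` would check whether an SSH port accepts connections. The real exporter adds protocol-aware modules (HTTP status checks, DNS answers, ICMP) on top of this basic pattern.
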
  23. System Layer
     ▷ node_exporter
       ○ https://github.com/prometheus/node_exporter
       ○ OS Level Metrics
  24. Middleware Layer
     ▷ jmx_exporter
       ○ https://github.com/prometheus/jmx_exporter
       ○ Java, YAML, Prometheus Metrics
       ○ ■ Apache Kafka ■ Apache Cassandra ■ Apache Flink ■ Apache Spark ■ Apache Tomcat ■ Apache ZooKeeper ■ Apache ActiveMQ Artemis 2.x ■ WebLogic ■ WildFly 10
  25. Kafka
     ▷ `jmx_exporter` for Kafka, Cassandra
       ○ Docker - https://github.com/RobustPerception/docker_examples
     ▷ kafka_topic_exporter
       ○ Java, Jetty
       ○ https://github.com/ogibayashi/kafka-topic-exporter
     ▷ kafka_zookeeper_exporter
       ○ ZK topic_partition
       ○ https://github.com/cloudflare/kafka_zookeeper_exporter
     ▷ prometheus-kafka-consumer-group-exporter
       ○ Python; Metrics: consumer_group_offset, topic_highwater, Lag
       ○ https://github.com/braedon/prometheus-kafka-consumer-group-exporter
     ▷ burrow_exporter
       ○ LinkedIn Kafka Lag, Burrow (Go, sliding window)
       ○ https://github.com/jirwin/burrow_exporter
  26. Kafka
     ▷ kafka-consumer-group-exporter
       ○ Go, kafka-consumer-groups.sh
       ○ https://github.com/kawamuray/prometheus-kafka-consumer-group-exporter
     ▷ kafka-prometheus-exporter
       ○ Go, consumergoup_lag metrics
       ○ Kafka 0.8 (ZK)
       ○ https://github.com/ogibayashi/kafka-topic-exporter
     ▷ kafka_exporter
       ○ Go, Metrics
       ○ Kafka 0.9 (KF)
       ○ https://github.com/danielqsj/kafka_exporter
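
Several of the Kafka exporters above report consumer lag alongside offsets and high-water marks. As a rough sketch of the arithmetic, assuming lag is defined per partition as the high-water mark minus the consumer group's committed offset, the calculation looks like this. The function names and the dictionary layout are hypothetical and do not come from any of the listed exporters.

```python
def consumer_lag(highwater, committed):
    """Return per-partition lag for one consumer group.

    Both arguments map (topic, partition) to an offset. Partitions with no
    committed offset are skipped rather than guessed at.
    """
    lag = {}
    for topic_partition, end_offset in highwater.items():
        offset = committed.get(topic_partition)
        if offset is None:
            continue
        # Clamp at zero: the two offsets may be sampled at slightly different times.
        lag[topic_partition] = max(end_offset - offset, 0)
    return lag


def total_lag_by_topic(lag):
    """Sum per-partition lag into one number per topic."""
    totals = {}
    for (topic, _partition), value in lag.items():
        totals[topic] = totals.get(topic, 0) + value
    return totals
```

A per-topic total is often what ends up on a dashboard panel or in an alert rule, while the per-partition values help locate a stuck consumer.
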
  27. Fluentd
     ▷ fluent-agent-lite_exporter
       ○ For tagomoris's fluent-agent-lite [1]
       ○ https://github.com/matsumana/fluent-agent-lite_exporter
       ○ [1] https://github.com/tagomoris/fluent-agent-lite
     ▷ fluent-plugin-prometheus
       ○ fluentd → monitor_agent → fluent-plugin-prometheus
       ○ http://prometheus:9090/metrics → `fluent-plugin-prometheus` → fluentd
       ○ https://github.com/fluent/fluent-plugin-prometheus
     ▷ fluentd_exporter
       ○ Release
       ○ https://github.com/wyukawa/fluentd_exporter
     ▷ fluentd_exporter
       ○ http://fluentd:9224/metrics → `fluentd_exporter` (by V3ckt0r) → prometheus
       ○ https://github.com/wyukawa/fluentd_exporter
  28. Application Layer ▷ https://prometheus.io/docs/instrumenting/clientlibs/
  29. Application Layer ▷ http://metrics.dropwizard.io/4.0.0/
  30. Part 4: Lesson Learned
  31. Lesson Learned
     ▷ Lesson #1: Prometheus
     ▷ Lesson #2: Metrics exporter
       ○ exporter: https://prometheus.io/docs/instrumenting/exporters/
       ○ Port: https://github.com/prometheus/prometheus/wiki/Default-port-allocations
       ○ exporter Metrics
  32. Lesson Learned
     ○ github
     ○ exporter Metrics
     ○ http://prometheus:9090/graph
     ○ Grafana Dashboard
     ○ Grafana Alert
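
Slide 32 points at the Prometheus expression browser (http://prometheus:9090/graph) as a place to check exporter metrics before building a Grafana dashboard or alert. The same check can be scripted against the Prometheus HTTP API's instant query endpoint, `/api/v1/query`. The sketch below only builds the request URL and parses a response body. The helper names are hypothetical, the `job="node"` label is an assumed example, and the host `prometheus:9090` is taken from the slide.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against /api/v1/query."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_vector(payload):
    """Turn an instant-vector JSON response into (labels, value) pairs."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError("query failed: " + str(doc.get("error")))
    data = doc["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got " + data["resultType"])
    # Each sample is [unix_timestamp, "value-as-string"].
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]
```

In an environment where Prometheus is reachable, the URL from `instant_query_url("http://prometheus:9090", "up")` could be fetched with `urllib.request.urlopen` and the body passed to `parse_vector`. A value of 0 for `up` indicates a target whose last scrape failed.
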
  33. Thanks! Any questions? You can find me at @jazzwang_tw or https://fb.com/groups/dataengineering.tw, https://slideshare.net/jazzwang, and on GitHub at https://github.com/jazzwang
