Collecting Metrics
With Ganglia and Friends

Cambridge Geek Night 28th March 2011


gareth rushgrove | morethanseven.net   http://www.flickr.com/photos/memestate/45986749
Gareth Rushgrove


Work at FreeAgent
freeagentcentral.com

Blog at morethanseven.net


Curate devopsweekly.com


Covering (Business Version)

- Capacity planning metrics
- Metrics for your application
- Business analytics
- Having everything in one place

Covering (Tech Version)

- Ganglia: store metrics and view graphs
- Logster: get log files into Ganglia
- Gmetric: get anything into Ganglia
- Syslog: using Loggly to view individual log items

Everyone Uses Something Like?


Use Something Like This Too


What is Ganglia?

“Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.”
ganglia.sourceforge.net

Example: vagrantbox.es


Load Averages


CPU


Aggregate Graphs


Across Entire Cluster


Predicting When Your System Will Fail

“A strategy for anticipating future workloads of your computers, with the aim of creating a computing environment that can handle future workload”
IBM

Disk Space
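
Predicting when a disk will fill can be as simple as fitting a straight line through recent usage samples and extrapolating to the disk's capacity. The sketch below is only an illustration and was not part of the talk: the function name `days_until_full` and the sample numbers are made up, and it assumes usage grows roughly linearly.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days left until usage reaches capacity, using a least-squares line.

    samples: list of (day, used_gb) pairs, e.g. read off a Ganglia disk graph.
    Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        raise ValueError("samples must span more than one day")
    # Slope of the fitted line, in GB per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / var_x
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    last_day = max(x for x, _ in samples)
    day_full = (capacity_gb - intercept) / slope
    return max(0.0, day_full - last_day)
```

With hypothetical samples of 40, 42, 44 and 46 GB on days 0 to 3 and a 100 GB disk, the fitted growth is 2 GB per day, so the estimate is 27 days from the last sample.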


Monitoring Your Application


Web Server Logs

86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.1" 200 2081 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"

Logster from Etsy


Logster

Tail a log file and filter each line to generate metrics that can be sent to
common monitoring packages.

Options:
  -p METRIC_PREFIX, --metric-prefix=METRIC_PREFIX
                        Add prefix to all published metrics. This is for
                        people that may multiple instances of same service on
                        same host.
  --gmetric-options=GMETRIC_OPTIONS
                        Options to pass to gmetric such as -d 180 -c
                        /etc/ganglia/gmond.conf (default). These are passed
                        directly to gmetric.
  --graphite-host=GRAPHITE_HOST
                        Hostname and port for Graphite collector, e.g.
                        graphite.example.com:2003
  -s STATE_DIR, --state-dir=STATE_DIR
                        Where to store the logtail state file. Default
                        location /var/run
  -d, --dry-run         Parse the log file but send stats to standard output.
  -D, --debug           Provide more verbose logging for debugging.

Logster Command Line

logster SampleGangliaLogster /../access.log

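
To give a feel for what "filter each line to generate metrics" means, here is a rough standalone sketch that counts responses per status class from access log lines. It does not use Logster's own parser classes; the name `count_status_classes` and the `http_2xx` style metric names are hypothetical, and the regular expression assumes Apache-style lines like the ones shown earlier.

```python
import re
from collections import Counter

# Matches the quoted request and the three-digit status that follows it.
STATUS_RE = re.compile(r'"[A-Z]+ \S+ HTTP/\d\.\d" (?P<status>\d{3}) ')

def count_status_classes(lines):
    """Return a Counter such as {'http_2xx': 12, 'http_4xx': 1}."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            # Group by the first digit of the status code (2xx, 3xx, ...).
            counts["http_%sxx" % match.group("status")[0]] += 1
    return counts
```

Each resulting count could then be published as its own metric, which is how a graph of 2xx responses comes about.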
HTTP Responses with a 2xx Status Code


Gmetric

The Ganglia Metric Client (gmetric) announces a metric
on the list of defined send channels defined in a configuration file

Usage: gmetric [OPTIONS]...
  -V, --version       Print version and exit
  -c, --conf=STRING   The configuration file to use for finding send channels
                        (default='/etc/ganglia/gmond.conf')
  -n, --name=STRING   Name of the metric
  -v, --value=STRING  Value of the metric
  -t, --type=STRING   Either
                        string|int8|uint8|int16|uint16|int32|uint32|float|double
  -u, --units=STRING  Unit of measure for the value e.g. Kilobytes, Celcius
                        (default='')
  -s, --slope=STRING  Either zero|positive|negative|both (default='both')
  -x, --tmax=INT      The maximum time in seconds between gmetric calls
                        (default='60')
  -d, --dmax=INT      The lifetime in seconds of this metric (default='0')
  -S, --spoof=STRING  IP address and name of host/device (colon separated) we
                        are spoofing (default='')
  -H, --heartbeat     spoof a heartbeat message (use with spoof option)

Gmetric Scripts for Common Applications


Gmetric Command Line

gmetric -n sales -v 200 -t float

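
The same call can be made from application code. The sketch below wraps the command line in two small helpers; the names `gmetric_command` and `publish` are made up for illustration, and only the `-n`, `-v`, `-t` and `-u` flags from the gmetric help text are used. It assumes `gmetric` is on the PATH of the process that calls `publish`.

```python
import subprocess

def gmetric_command(name, value, metric_type="float", units=None):
    """Build the argument list for one gmetric call."""
    cmd = ["gmetric", "-n", str(name), "-v", str(value), "-t", metric_type]
    if units:
        cmd += ["-u", units]
    return cmd

def publish(name, value, **kwargs):
    """Run gmetric; raises CalledProcessError on a non-zero exit status."""
    # An argument list (no shell) keeps metric names and values from being
    # interpreted as shell syntax.
    subprocess.check_call(gmetric_command(name, value, **kwargs))
```

Calling `publish("sales", 200)` would then run the command shown on the slide.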
Our Custom Metric in Ganglia


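Gmetric HTTP Interface

import subprocess

from bottle import route, abort, default_app

# GET /<name>/<value> publishes a float metric through the gmetric CLI.
# '<name>' is the current Bottle wildcard syntax (older releases used ':name').
@route('/<name>/<value>')
def index(name, value):
    # Pass an argument list instead of a shell string so that text taken
    # from the URL cannot inject extra shell commands.
    cmd = ['gmetric', '-n', name, '-v', value, '-t', 'float']
    try:
        subprocess.check_call(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except (subprocess.CalledProcessError, OSError):
        abort(500, "Error")
    return "Success: %s" % ' '.join(cmd)

# WSGI application object for a WSGI server to load.
app = default_app()
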
Gmetric URL

http://../sales/200

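
A client for this HTTP interface only needs to build the `/name/value` path and issue a GET. The following is a minimal sketch using the standard library; `metric_url` and `send_metric` are hypothetical helper names, and the base URL is whatever host the interface is deployed on (the slide leaves it elided).

```python
from urllib.parse import quote
from urllib.request import urlopen

def metric_url(base_url, name, value):
    """Build the /name/value URL, percent-encoding both path segments."""
    return "%s/%s/%s" % (
        base_url.rstrip("/"),
        quote(str(name), safe=""),
        quote(str(value), safe=""),
    )

def send_metric(base_url, name, value, timeout=5):
    """GET the metric URL and return the response body as text."""
    with urlopen(metric_url(base_url, name, value), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Encoding the segments matters because a metric name containing a slash or a space would otherwise change the route that gets matched.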
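Gmetric TCP Interface

import socketserver  # named SocketServer on Python 2
import subprocess

class GmetricTCPHandler(socketserver.BaseRequestHandler):
    """Accept one "name value" line per connection and publish it with gmetric."""

    def handle(self):
        # recv() returns bytes; decode before splitting on whitespace.
        items = self.request.recv(1024).decode('utf-8', 'replace').split()
        try:
            name, value = items  # ValueError unless exactly two fields arrive
            # Argument list, no shell: client input cannot inject commands.
            cmd = ['gmetric', '-n', name, '-v', value, '-t', 'float']
            subprocess.check_call(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            reply = "Success: %s\n" % ' '.join(cmd)
        except (ValueError, OSError, subprocess.CalledProcessError):
            reply = "Error\n"
        # A return value from handle() is discarded, so write the reply back.
        self.request.sendall(reply.encode('utf-8'))

if __name__ == "__main__":
    HOST, PORT = "0.0.0.0", 8001
    server = socketserver.TCPServer((HOST, PORT), GmetricTCPHandler)
    server.serve_forever()
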
Gmetric TCP

sales 200

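
Sending that line from code takes a few lines of socket handling. This is a sketch rather than part of the talk: `send_metric_tcp` is a made-up name, and the host and port are whatever the TCP interface listens on (8001 in the server code). It returns the server's reply if one is sent, or an empty string otherwise.

```python
import socket

def send_metric_tcp(host, port, name, value, timeout=5):
    """Send one "name value" line and return whatever the server replies."""
    message = "%s %s\n" % (name, value)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))
        # Half-close the connection to signal that the request is complete.
        sock.shutdown(socket.SHUT_WR)
        return sock.recv(1024).decode("utf-8", "replace")
```

For example, `send_metric_tcp("localhost", 8001, "sales", 200)` sends the `sales 200` line shown on the slide.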
Syslog

“Syslog is a standard for logging program messages. It allows separation of the software that generates messages from the system that stores them and the software that reports and analyzes them.”
Wikipedia

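
From an application's point of view, writing to syslog can be done with Python's standard `logging` module. The sketch below is an illustration under assumptions: `make_syslog_logger` is a hypothetical helper, and the default address of `localhost` on UDP port 514 assumes a local syslog daemon is listening there; forwarding from that daemon to a hosted service is configured separately and is not shown.

```python
import logging
import logging.handlers

def make_syslog_logger(name, address=("localhost", 514)):
    """Return a logger that forwards INFO and above to syslog over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.handlers.SysLogHandler(address=address)
        # Prefix each message with the logger name so it can be filtered on.
        handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

A call such as `make_syslog_logger("myapp").info("new signup")` then hands the message to syslog, and the application does not need to know where it is eventually stored or viewed.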
Loggly - Logging as a Service


View logs


Logstash


Graylog2


Other Things You Could Monitor

- Database table sizes
- Cache hits
- Time taken for test runs
- Codebase size
- Signups, sales, subscriptions
- Twitter followers

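
Taking codebase size as an example, such a metric is just a number computed on a schedule and handed to gmetric. The sketch below is one possible way to do it, not something from the talk: `count_lines`, `codebase_size_command` and the metric name `codebase_lines` are all made up, and counting only `.py` files is an arbitrary choice.

```python
import os

def count_lines(root, extensions=(".py",)):
    """Count lines in files under root whose names end with one of extensions."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(tuple(extensions)):
                # Binary mode avoids decoding errors on unusual files.
                with open(os.path.join(dirpath, filename), "rb") as handle:
                    total += sum(1 for _ in handle)
    return total

def codebase_size_command(root):
    """Build a gmetric call reporting the line count as a uint32 metric."""
    return ["gmetric", "-n", "codebase_lines", "-v", str(count_lines(root)),
            "-t", "uint32", "-u", "lines"]
```

Running the resulting command from a scheduled job (for instance via `subprocess.check_call`) would produce a graph of codebase size over time.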
What Next?

- Wikipedia: http://ganglia.wikimedia.org/
- Install Ganglia: deb and rpm packages available
- Add system metrics: web servers, databases
- Add business metrics: users, sales, tweets
- Try Loggly, or at least investigate syslog

Reading


CBGN11



2 months free on FreeAgent


Questions?


http://flickr.com/photos/psd/102332391/

Metrics with Ganglia

  • 1. Collecting Metrics With Ganglia and Friends Cambridge Geek Night 28th March 2011 gareth rushgrove | morethanseven.net http://www.flickr.com/photos/memestate/45986749
  • 2. Gareth Rushgrove gareth rushgrove | morethanseven.net
  • 3. freeagentcentral.com Work at FreeAgent gareth rushgrove | morethanseven.net
  • 4. Blog at morethanseven.net gareth rushgrove | morethanseven.net
  • 6. - Capacity planning metrics - Metrics for your application - Business analytics - Having everything in one place Covering (Business Version) gareth rushgrove | morethanseven.net
  • 7. - Ganglia Store metrics and view graphs - Logster Get log files into Ganglia - Gmetric Get anything into Ganglia - Syslog Using Loggly to view individual log items Covering (Tech Version) gareth rushgrove | morethanseven.net
  • 8. Everyone Uses Something Like? gareth rushgrove | morethanseven.net
  • 9. Use Something Like This Too gareth rushgrove | morethanseven.net
  • 10. “Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. ganglia.sourceforge.net What is Ganglia? gareth rushgrove | morethanseven.net
  • 12. Load Averages gareth rushgrove | morethanseven.net
  • 13. CPU gareth rushgrove | morethanseven.net
  • 14. Aggregate Graphs gareth rushgrove | morethanseven.net
  • 15. Across Entire Cluster gareth rushgrove | morethanseven.net
  • 16. “A strategy for anticipating future workloads of your computers, with the aim of creating a computing environment that can handle future workload IBM Predicting When Your System Will Fail gareth rushgrove | morethanseven.net
  • 17. Disk Space gareth rushgrove | morethanseven.net
  • 18. Monitoring Your Application gareth rushgrove | morethanseven.net
  • 19. 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.1" 200 2081 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 
+0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" Web Server Logs 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0" 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0" gareth rushgrove | morethanseven.net 86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
  • 20. Logster from Etsy
  • 21. Logster
      Tail a log file and filter each line to generate metrics that can be sent to common monitoring packages.
      Options:
        -p METRIC_PREFIX, --metric-prefix=METRIC_PREFIX
            Add prefix to all published metrics. This is for people that may run multiple instances of the same service on the same host.
        --gmetric-options=GMETRIC_OPTIONS
            Options to pass to gmetric such as -d 180 -c /etc/ganglia/gmond.conf (default). These are passed directly to gmetric.
        --graphite-host=GRAPHITE_HOST
            Hostname and port for Graphite collector, e.g. graphite.example.com:2003
        -s STATE_DIR, --state-dir=STATE_DIR
            Where to store the logtail state file. Default location /var/run
        -d, --dry-run
            Parse the log file but send stats to standard output.
        -D, --debug
            Provide more verbose logging for debugging.
  • 22. Logster Command Line
      logster SampleGangliaLogster /../access.log
  • 23. HTTP Responses with a 2xx Status Code
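The graph on this slide comes from filtering access log lines by status code. As a rough illustration of that idea (not Logster's own parser API), the sketch below counts responses by status class from lines in the format shown on the Web Server Logs slide. The function name `count_status_classes` and the regular expression are assumptions made for this example.

```python
import re
from collections import Counter

# Matches the quoted request and the three-digit status that follows it,
# e.g. "GET / HTTP/1.0" 200 5970 (illustrative pattern, not Logster's own).
LOG_LINE = re.compile(r'"[A-Z]+ \S+ HTTP/[\d.]+" (?P<status>\d{3}) ')


def count_status_classes(lines):
    """Return a Counter keyed by status class ('2xx', '4xx', ...)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        counts[match.group("status")[0] + "xx"] += 1
    return counts


sample = [
    '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"',
    '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"',
]
print(count_status_classes(sample))
```

Each count could then be published as its own metric, one per status class.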
  • 24. Gmetric
      The Ganglia Metric Client (gmetric) announces a metric on the list of defined send channels defined in a configuration file
      Usage: gmetric [OPTIONS]...
        -V, --version          Print version and exit
        -c, --conf=STRING      The configuration file to use for finding send channels (default='/etc/ganglia/gmond.conf')
        -n, --name=STRING      Name of the metric
        -v, --value=STRING     Value of the metric
        -t, --type=STRING      Either string|int8|uint8|int16|uint16|int32|uint32|float|double
        -u, --units=STRING     Unit of measure for the value e.g. Kilobytes, Celsius (default='')
        -s, --slope=STRING     Either zero|positive|negative|both (default='both')
        -x, --tmax=INT         The maximum time in seconds between gmetric calls (default='60')
        -d, --dmax=INT         The lifetime in seconds of this metric (default='0')
        -S, --spoof=STRING     IP address and name of host/device (colon separated) we are spoofing (default='')
        -H, --heartbeat        spoof a heartbeat message (use with spoof option)
  • 25. Gmetric Scripts for Common Applications
  • 26. Gmetric Command Line
      gmetric -n sales -v 200 -t float
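A small sketch of calling the same command from Python, assuming `gmetric` is installed and on the PATH. The helper names `build_gmetric_command` and `send_metric` are made up for this example; only the `-n`, `-v`, `-t` and `-u` flags listed in the usage text above are used.

```python
import subprocess


def build_gmetric_command(name, value, metric_type="float", units=None):
    """Build the gmetric argument list for one metric (hypothetical helper)."""
    cmd = ["gmetric", "-n", str(name), "-v", str(value), "-t", metric_type]
    if units:
        cmd += ["-u", units]
    return cmd


def send_metric(name, value, **kwargs):
    """Run gmetric; raises CalledProcessError if it exits non-zero."""
    # Arguments are passed as a list, so no shell is involved.
    subprocess.check_call(build_gmetric_command(name, value, **kwargs))


if __name__ == "__main__":
    # Shows the command that would be run for the example on the slide.
    print(" ".join(build_gmetric_command("sales", 200)))
```

Keeping the command builder separate from the call that runs it makes the builder easy to check without a Ganglia install.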
  • 27. Our Custom Metric in Ganglia
  • 28. Gmetric HTTP Interface
      import subprocess
      from bottle import route, abort, default_app

      @route('/:name/:value')
      def index(name, value):
          # Pass the arguments as a list (no shell), so values taken from
          # the URL cannot inject extra shell commands.
          cmd = ['gmetric', '-n', name, '-v', value, '-t', 'float']
          try:
              subprocess.check_call(
                  cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
              return "Success: %s" % ' '.join(cmd)
          except subprocess.CalledProcessError:
              abort(500, "Error")

      app = default_app()
  • 30. Gmetric TCP Interface
      import subprocess
      import socketserver  # named SocketServer in Python 2

      class GmetricTCPHandler(socketserver.BaseRequestHandler):
          def handle(self):
              # Expect a single message of the form "<name> <value>"
              items = self.request.recv(1024).decode().split()
              try:
                  cmd = ['gmetric', '-n', items[0], '-v', items[1], '-t', 'float']
                  subprocess.check_call(
                      cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                  reply = "Success: %s" % ' '.join(cmd)
              except Exception:
                  reply = "Error"
              # A value returned from handle() is discarded, so send the
              # reply back over the socket instead.
              self.request.sendall(reply.encode())

      if __name__ == "__main__":
          HOST, PORT = "0.0.0.0", 8001
          server = socketserver.TCPServer((HOST, PORT), GmetricTCPHandler)
          server.serve_forever()
  • 31. Gmetric TCP
      sales 200
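A possible client for the TCP interface, sketched under the assumption that the server from the previous slide is listening on port 8001 and expects one `name value` message per connection. The function names and the default host are illustrative and not part of the talk.

```python
import socket


def format_metric(name, value):
    """Return the wire format shown on the slide, e.g. 'sales 200'."""
    return "%s %s" % (name, value)


def send_metric(name, value, host="127.0.0.1", port=8001, timeout=5.0):
    """Open a connection, send one metric, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(format_metric(name, value).encode("ascii"))
```

With the server running, `send_metric("sales", 200)` would send the same message as the slide. The sketch does not wait for a reply.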
  • 32. Syslog
      “Syslog is a standard for logging program messages. It allows separation of the software that generates messages from the system that stores them and the software that reports and analyzes them.” (Wikipedia)
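To make that separation concrete, here is a minimal sketch of an application handing its messages to a syslog daemon using Python's standard `logging` module. The host, port, logger name and message are assumptions for illustration. A hosted service such as Loggly would be configured on the syslog side, not in the application.

```python
import logging
import logging.handlers


def make_syslog_logger(name, host="localhost", port=514):
    """Return a logger that forwards records to a syslog daemon over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler sends UDP datagrams by default when given (host, port).
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

Usage would look like `make_syslog_logger("myapp").info("signup user_id=42")`. Where the message is stored and how it is analysed is then decided entirely by the syslog configuration.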
  • 33. Loggly - Logging as a Service
  • 34. View logs
  • 35. Logstash
  • 36. Graylog2
  • 37. Other Things You Could Monitor
      - Database table sizes
      - Cache hits
      - Time taken for test runs
      - Codebase size
      - Signups, sales, subscriptions
      - Twitter followers
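As one example from this list, codebase size can be reduced to a single number that is then published like any other metric. The sketch below counts lines in source files under a directory. The function name and the default extension tuple are assumptions for illustration.

```python
import os


def count_source_lines(root, extensions=(".py",)):
    """Count lines in files under root whose names end with the given extensions."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extensions):  # extensions must be a tuple
                path = os.path.join(dirpath, filename)
                with open(path, "r", errors="ignore") as handle:
                    total += sum(1 for _ in handle)
    return total
```

The result could be passed as the value in a call such as `gmetric -n codebase_size -v <count> -t uint32`, where the metric name is made up for this example.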
  • 38. What Next?
      - Wikipedia: http://ganglia.wikimedia.org/
      - Install Ganglia: deb and rpm packages available
      - Add system metrics: web servers, databases
      - Add business metrics: users, sales, tweets
      - Try Loggly, or at least investigate syslog
  • 39. Reading
  • 40. CBGN11 2 months free on FreeAgent
  • 41. Questions? http://flickr.com/photos/psd/102332391/