This document discusses using sFlow for scalable, unified monitoring of networks, systems, and applications. sFlow exports standard performance counters from network devices, hosts, and applications. It uses a lightweight "push" protocol over UDP that is scalable and cloud-friendly. In addition to counters, sFlow also exports random packet samples which provide insight into issues like top URLs, clients, and servers without high overhead. Tagged, a social networking company, uses sFlow for comprehensive monitoring across their infrastructure through integration with tools like Ganglia.
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and Applications
1. The sFlow standard:
scalable, unified
monitoring of networks,
systems and applications
Dave Mangot (Tagged Inc.)
tech.mangot.com
Peter Phaal (InMon Corp.)
blog.sflow.com
3. Tagged Inc.
Social Networking
5 billion page views a month
4 TB of main memcached
Heavy use of Apache/PHP and Java
Ganglia critical to business function
Puppet for configuration management
4. InMon Corp.
Performance management software developer
Originators of the sFlow standard
Founding member of sFlow.org
Initial implementation and contributor to Host sFlow and related projects
- Memcached sFlow patch
- Apache mod_sflow
- NGINX sFlow module
- sFlow Java Agent
Contributed sFlow support to Ganglia project
5. Challenge: Monitoring large, scale-out, multi-tiered sites
(Diagram: load balancers → web servers → application servers → memcached and database servers, tiers connected by the network)
Large number of servers in each pool
Servers constantly being added/removed
Network performance is critical
- scale-out applications dependent on network performance
- potential for propagating failures between tiers
7. sFlow is the industry standard for monitoring switches
10. sFlow’s scalable “push” protocol
Simple
- standard structures - densely packed blocks of counters
- extensible (tag, length, value)
- RFC 1832: XDR encoded (big endian, quad-aligned, binary) - simple to encode/decode
- unicast UDP transport
Minimal configuration
- collector address
- polling interval
Cloud friendly
- flat, two tier architecture: many embedded agents → central “smart” collector
- sFlow agents automatically start sending metrics on startup and are automatically discovered
- eliminates complexity of maintaining polling daemons (and their associated configurations)
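The encoding described above can be sketched in a few lines: big-endian, 32-bit-aligned tag/length/value records pushed over unicast UDP. This is a simplified illustration, not the full sFlow v5 datagram layout (which adds a version number, agent address, sequence numbers, etc.); the tag number and collector address below are made up.

```python
import socket
import struct

def xdr_record(tag, counters):
    """Pack one tag/length/value record XDR-style: big-endian,
    32-bit-aligned unsigned ints (illustrative only, not the
    complete sFlow v5 datagram structure)."""
    payload = b"".join(struct.pack(">I", c) for c in counters)
    return struct.pack(">II", tag, len(payload)) + payload

# Hypothetical counter block: tag 2005 carrying four counter values.
record = xdr_record(2005, [10, 20, 30, 40])
assert len(record) % 4 == 0  # quad-aligned, as the slide notes

# Push it to a collector over unicast UDP (address is illustrative).
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(record, ("127.0.0.1", 6343))
```

Because the records are fixed-width, big-endian, and self-describing (tag + length), a collector can decode them with the mirror-image `struct.unpack` calls and skip any tags it does not recognize.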
11. Example
Collect 50 metrics per server
Every 30 seconds
From 100,000 servers
100,000 / 30 ≈ 3,333 sFlow datagrams per second
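The arithmetic above checks out as a quick back-of-the-envelope calculation (assuming, as the protocol allows, that each server packs all of its metrics into one datagram per polling interval):

```python
servers = 100_000
metrics_per_server = 50
interval_s = 30

# Each server sends one counter datagram per polling interval.
datagrams_per_s = servers / interval_s
metrics_per_s = servers * metrics_per_server / interval_s

print(round(datagrams_per_s))  # ~3,333 datagrams/s
print(round(metrics_per_s))    # ~166,667 metrics/s
```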
12. Example (cont.)
Single sFlow analyzer can monitor the entire data center!
13. Counters aren’t enough
Counters tell you there is a problem, but not why.
Counters summarize performance by dropping high-cardinality attributes:
- IP addresses
- URLs
- Memcache keys
Need to be able to efficiently disaggregate counters by these attributes in order to understand the root cause of performance problems.
Why the spike in traffic? How do you get this data when there are millions of transactions per second?
(100Gbit link carrying 14,000,000 packets/second)
14. sFlow also exports random samples
Random sampling is lightweight
- critical path roughly the cost of maintaining one counter:
  if(--skip == 0) sample();
- sampling is easy to distribute among modules, threads, processes without any synchronization
- minimal resources required to capture attributes of sampled transactions
Easily identify top keys, connections, clients, servers, URLs etc.
Unbiased results with known accuracy
Break out traffic by client, server and port
(graph based on samples from 100Gbit link carrying 14,000,000 packets/second)
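The skip-count test on this slide can be fleshed out into a runnable sketch. The transaction data is hypothetical, and the next skip is drawn uniformly so its mean equals the configured 1-in-N rate, as in packet samplers:

```python
import random

class Sampler:
    """Sketch of sFlow-style skip-count sampling: the hot path is one
    decrement and one compare; on average 1-in-N transactions are kept."""
    def __init__(self, rate):
        self.rate = rate
        self.samples = []
        self.skip = random.randint(1, 2 * rate - 1)  # mean skip = rate

    def record(self, txn):
        # Critical path: roughly the cost of maintaining one counter.
        self.skip -= 1
        if self.skip == 0:
            self.samples.append(txn)
            self.skip = random.randint(1, 2 * self.rate - 1)

random.seed(42)
s = Sampler(rate=100)
for i in range(1_000_000):
    s.record(("memcached_get", f"key{i % 7}"))  # hypothetical transactions

# Scale the sampled count back up by the rate to estimate the total.
estimated_total = len(s.samples) * s.rate
print(estimated_total)  # close to 1,000,000, within sampling error
```

No lock or shared state is needed: each module, thread, or process can keep its own skip counter, and the scaled-up estimates simply add, which is why the slide calls sampling easy to distribute.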
15. Big Picture: Comprehensive, multi-layer visibility
(Diagram: layered stack - Applications (Apache/PHP, Tomcat/Java, Memcached), Virtual Servers, Virtual Network, Servers, Network)
Embedded monitoring of all switches, all servers, all applications, all the time
Consistent measurements shared between multiple management tools
16. Tagged Uses sFlow!
Apache via mod_sflow
Java via sflowagent (-javaagent:sflowagent.jar)
Memcached via source patches
Host sFlow
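Each of these components needs very little setup; as an illustration of the "minimal configuration" point, a Host sFlow `/etc/hsflowd.conf` amounts to a collector address, polling interval, and sampling rate. The values below are illustrative, and exact directives can vary between hsflowd versions:

```
sflow {
  polling = 30
  sampling = 400
  collector {
    ip = 10.0.0.1
    udpport = 6343
  }
}
```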
17. sFlow + Ganglia
integration with Ganglia
deployed via Puppet
34. Thanks!
The Ganglia Team
The SiteOps team @ Tagged & Tagged Inc.
Bay Area LSPE Meetup - actually meeting tonight!
TubeMogul
PayPal
O’Reilly
http://clipart-for-free.blogspot.com/2008/06/free-truck-clipart.html
35. Questions?
We are also doing office hours today @ 2:30 in the exhibit hall!
Editor's Notes
* sounds funny to say “standard”
* repeatable & consistent, transport and approach, apply to each instrumented protocol

* met Peter giving a talk on Graphite, needed metrics for graphs
* “Who here has networking gear by a vendor not named Cisco? Who here has used sFlow on their switches or hosts?”
* my history with sFlow
* super easy to integrate in Graphite talk with a little Perl
* Tagged fanatical about monitoring, Peter wanted to validate approach so it is a good match (he is also fanatical about sFlow); so much so, I asked him if he’d thought about supporting node.js, had it working the next day!
* You can ask me about the 404s in the hall or at office hours

* been relying on OPEN SOURCE Host sFlow almost 1 year
1) automatic visibility into applications
2) network more efficient (PPS)
* Welcome Peter

* one of the authors of the sFlow standard
* InMon develops performance management software
* contributes to sFlow related projects
* introduction to sFlow, put context behind examples Dave will present
* diagram is typical of scale-out, multi-tier “cloud” architectures like Tagged’s
* server pools ensure high availability and allow capacity to be adjusted to demand
* size and dynamic nature of cloud architecture makes it a challenge to monitor
* unusual to show network: often ignored, complexity hidden behind APIs
* scale-out application performance tightly coupled to network
* network shared between tiers, can propagate failures
* a basic problem is lack of network visibility - request timeout, congestion vs. failure
* network visibility reveals dependencies and congested resources
* switch vendors embed instrumentation in their hardware
* sFlow standard was developed by switch vendors to ensure interoperability
* today, most vendors support sFlow
* network visibility is a matter of selecting devices with sFlow support
* recently, sFlow standard extended to include server and application performance

* Host sFlow is an open source agent that exports server metrics
* core of an ecosystem of related open source projects
* integrate monitoring into an increasing range of applications
* seen current scope of sFlow implementations
* let’s take a look at types of measurement that sFlow provides

* don’t worry - I don’t expect you to read this slide
* counters are a staple of network and system management
* counters are maintained by switch hardware, operating systems and applications
* counters aren’t useful if they are stranded within each device
* sFlow provides an efficient way to collect counters from large numbers of devices
* makes performance information actionable

* sFlow is a simple protocol
* each of the blocks of counters from previous slide are efficiently encoded using XDR
* sent as UDP datagrams to an sFlow analyzer
* each datagram can carry hundreds of individual metrics
* minimal configuration: IP address of the collector and a polling interval
* cloud environments: hosts constantly added, removed, started and stopped
* challenge: maintaining lists of devices to poll for statistics
* sFlow: each device automatically sends metrics as soon as it starts up
* devices immediately detected and continuously monitored

* example: 50 metrics, every 30 seconds from 100,000 servers
* three thousand sFlow datagrams per second
* easily decoded and processed
* storing and querying takes a little more effort
* easily managed by a single server

* metrics are extremely useful for characterizing performance
* operations dashboards covered with trend charts
* trend chart summarizes vast amounts of information
* example: chart for link carrying over 14 million packets per second
* nearly 1 billion packets are summarized in each data point shown on the graph
* detect a spike - where do you go next?

* random sampling is an integral part of sFlow monitoring
* overhead of maintaining one additional counter
* details of transaction attributes, data volumes, response times and status codes
* example: 3 network connections showing client, server, protocol and traffic
* understand the increase in traffic and plan actions
* sampling applies equally to HTTP requests, Memcache operations etc.
* Dave will be presenting additional examples later in this talk
* stepping back: sFlow allows pervasive instrumentation of the data center

* embedding instrumentation reduces operational complexity
* deploys with services
* ensures all resources continuously monitored
* integrated view of applications and server/network resources they depend on
* e.g. drop in Memcache throughput: misconfigured client, swapping, packet loss
* standardizing metrics breaks the dependency between agents and tools
* consistent reporting across analysis tools
* consistent metrics across agents: e.g. web statistics from Apache, Tomcat or NGINX
* Dave will describe Tagged’s experiences with deploying and using sFlow
* Cisco switches/routers with NetFlow
* Some SNMP done by polling, but polling for metrics sucks

* Questions about this diagram or your own diagram, find me after, be happy to go over it with you.
* integration with later versions of Ganglia, scale Ganglia normally
* deployed via Puppet, all ERB templates fed by CMDB
* can send data to as many places as you want, can send it to a collector and then into a message bus like Kafka, db, whatever.
* UDP joke

* Our first example is with HTTP.
* HTTP can be from anything that speaks HTTP: Nginx, Node.js, Tomcat, Apache - same metrics.
* tool that consumes your HTTP metrics: write it once; standard, repeatable information flow even if you switch from Apache to NGINX.
* No text log parsing, all streamed in realtime to you

* entire stack: cpu, network, application
* I/O wait on CPU for storage, fronted by CDN.
* can see network traffic and HTTP metrics
* some 404s, banned content?
* static assets tier, ALL GET requests, no POSTs.

* Lots of GETs and POSTs
* Easy to make rollup graph in Ganglia, few lines of JSON
* Updated every 15 seconds, comprehend, refresh

* individual URI performance
* top 15 URI paths by time
* UPLOAD longest, makes sense, upload pictures
* Pets, most popular game
* Work with devs: faster pages, more revenue, happier pointy-haired people
* DevOps collaboration

* Not just duration: ops/sec, bytes/sec
* graphs on bottom updated every minute
* can see prevalence of URIs in graph, can even click in this tool

* previously only STATS command
* STATS SIZES locks entire cache
* non-invasive granular instrumentation used to require Gear6 Advanced Reporter
* Some memcache patches, get streamed to us: hits, misses, etc., also individual keys

* cold cache ramp, top-to-bottom view of instances
* could have CPU, file descriptors, whatever
* Cache rapidly achieves steady state
* GET hits rapidly overwhelm GET misses

Numbers in legend
* 6 instances
* throughput on startup
* after steady state, orders of magnitude more reads
* saving the database
* what memcache adds to your db

* Not just metrics from STATS command, actual data
* # ops/sec on top 15 hottest keys
* Just like HTTP: durations, throughput as well
* look at MISSES, try to explain

* Anyone monitor Java apps?
* Tomcat example, could be any Java process
* Used to be jstat -gc or a poller
* Drop in JAR, restart, visible in Ganglia or wherever
* Elasticsearch and Logstash (JRuby)
* drop-in visibility

* used to get Nagios alerts every few days
* heap builds over days

* Not just heap: easy to make a rollup of any metrics with some JSON

* “Has anyone here used tcpdump?”
* “How many people know Perl, Ruby, Python?”
* Have all the tools you need to utilize sflowtool

* take raw data from network, do what you want
* familiar if used tcpdump
* reads from network, presents in human- or computer-consumable form
* get data you see in Ganglia charts, plus URLs, memcache keys, etc.

* drop data into MongoDB like you can see on my GitHub account, or a CEP like Esper, write an input plugin for Logstash - up to you
* aggregate and send to statsd or graphite? No problem
* the example building block good tools give you to allow you to do what YOU imagine
* would encourage you to join the community and take advantage
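The sflowtool workflow described in these notes is easy to script against. Below is a hedged sketch of parsing sflowtool's default line-oriented output, which groups whitespace-separated key/value lines between startDatagram/endDatagram markers; the field names and sample snippet are illustrative, not a real capture. In practice you would feed the parser from something like `subprocess.Popen(["sflowtool"], stdout=subprocess.PIPE)`.

```python
def parse_sflowtool(lines):
    """Group sflowtool's key/value output lines into one dict per
    sFlow datagram (assumes the default human-readable format)."""
    datagrams, current = [], None
    for line in lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue
        key = parts[0]
        if key == "startDatagram":
            current = {}
        elif key == "endDatagram":
            if current is not None:
                datagrams.append(current)
            current = None
        elif current is not None and len(parts) == 2:
            current[key] = parts[1]
    return datagrams

# Illustrative output snippet (not a real capture).
sample_output = """\
startDatagram =================================
datagramSourceIP 10.0.0.5
agent 10.0.0.5
samplesInPacket 1
endDatagram   =================================
"""
dgrams = parse_sflowtool(sample_output.splitlines())
print(dgrams[0]["agent"])  # 10.0.0.5
```

From here the dicts can go wherever you want: MongoDB, a CEP engine, a Logstash input, or rolled up and forwarded to statsd/graphite, as the notes suggest.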