Mike Guthrie's presentation on distributed monitoring solutions for Nagios. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
8. In 1 second: about 16 scripts or binary processes are launched, and about 16 sets of results come in, are processed by Nagios, and are written to disk.
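A per-second figure like this comes from dividing the number of service checks by the check interval. The following is a minimal sketch; the 5,000 services and 5-minute interval are hypothetical values chosen for illustration and are not taken from the slide.

```python
def checks_per_second(num_services: int, interval_seconds: float) -> float:
    """Average number of active checks that must start each second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return num_services / interval_seconds


# Hypothetical figures, for illustration only.
rate = checks_per_second(5000, 5 * 60)
print(f"{rate:.1f} checks per second")
```

Each of those checks also produces a result that has to be read, processed, and written to disk, so the work per second is roughly double the launch count.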
9. When the check schedule exceeds CPU limitations, you get "check latency": checks start later than they were scheduled to run.
12. A 30-minute task on 1 server = 5 hours on 10 servers.
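The arithmetic behind this slide is linear scaling of per-server work. A small sketch, assuming the task must be repeated by hand on every server with no automation (the function name is hypothetical):

```python
def total_admin_minutes(task_minutes: float, server_count: int) -> float:
    """Total hands-on time when the same task is repeated on every server."""
    if server_count < 0:
        raise ValueError("server_count must not be negative")
    return task_minutes * server_count


minutes = total_admin_minutes(30, 10)
print(f"{minutes / 60:g} hours")
```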
13. Consider how to effectively view information across multiple machines
14. As data quantity increases, discerning useful information from it becomes more important
15. Viewing 10,000 hosts and 50,000 services on a page is too much raw data to be effective information
16. The Classic Distributed Model: distributed servers run active checks and forward results to a central server (passive only) after every check. [Diagram: a "Central Server (Passive Only)" receiving results from several distributed servers, each labeled "Active Checks"; arrow label: "Forward Results After Every Check"]
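On the central server, a forwarded result ends up as a passive check result. The sketch below only formats the Nagios external command line for a passive service result (`PROCESS_SERVICE_CHECK_RESULT`); the slide does not say which transport carries it, and the host name, service name, and helper function are hypothetical.

```python
import time
from typing import Optional


def passive_service_result(host: str, service: str, return_code: int,
                           output: str, timestamp: Optional[int] = None) -> str:
    """Format one passive service check result as a Nagios external command."""
    # Plugin return codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2, or 3")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{return_code};{output}"


# Hypothetical host and service names, for illustration only.
print(passive_service_result("web01", "HTTP", 0, "HTTP OK - 0.012s response time"))
```

Because one such line is produced for every check on every distributed server, the central server still handles the full result volume even though it runs no active checks itself.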
84. An environment with 250k services overseen by a single server took almost an entire year of planning and implementation to do right.