Mike Weber's presentation on Nagios rapid deployment options. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference.
2. Why: Rapid Deployment
● Installation Size
5 years ago: 1,000 service checks
today: 150,000 to 200,000 service checks
● Admin Time Resources
● Admin Skill Levels
3. Current Options
• Core: Multiple Scripts
focus on importing not design
• XI: Auto Discovery Wizard
discover and import hosts and ports
• XI: Bulk Host Import Wizard
clone current host with services
five field options which must be unique
10. Bulk Host Import Wizard
• Clone Host Settings
host templates
host local settings
• Import Hostgroups
• Import Description
• Import Host Parent
• No Multiple Hostgroups
• No Designated Host Template
• No Exception Management
11. Rapid Deployment Project
Efficient Design
using object inheritance
service check selection
attention to triggers
notification
troubleshooting
solutions for exceptions
Efficient Check Selection
searchable by type, os, version
scripts for download
examples in text and image
Efficient Deployment
scripted implementation for 100s of hosts and 1000s of checks
12. Design Principles
• Create Leverage
Managing from one service template per group
• Maintain Simplicity
One template equals easier troubleshooting
• Manage Exceptions
No Local Settings
13. Create Leverage
Skeleton Structure
Host → Host Template (provides all necessary settings to all similar hosts)
→ Hostgroup ← Service Template ← Services
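The skeleton can be made concrete with a minimal Python sketch that renders the five objects as Nagios object definitions. The object names and the address follow the naming used on the other slides; the `render_object` helper and the `check_command` value are illustrative assumptions, and the output is deliberately incomplete rather than a configuration that would pass verification.

```python
def render_object(kind, directives):
    """Render one Nagios object definition as plain config text."""
    lines = [f"define {kind} {{"]
    for key, value in directives.items():
        lines.append(f"    {key:<24} {value}")
    lines.append("}")
    return "\n".join(lines)


# Five objects: host template, hostgroup, service template, service, host.
skeleton = [
    ("host", {"name": "lx_prod_ht", "check_period": "24x7", "register": "0"}),
    ("hostgroup", {"hostgroup_name": "lx_prod_hg", "members": "lx67prodmt"}),
    ("service", {"name": "lx_os_prod_st", "check_interval": "5", "register": "0"}),
    # The service attaches to the hostgroup, not to individual hosts.
    ("service", {
        "use": "lx_os_prod_st",
        "hostgroup_name": "lx_prod_hg",
        "service_description": "lx_cpu_os",
        "check_command": "check_cpu_placeholder",  # hypothetical command name
    }),
    # The host carries no local settings beyond its identity.
    ("host", {
        "use": "lx_prod_ht",
        "host_name": "lx67prodmt",
        "address": "192.168.5.157",
    }),
]

config_text = "\n\n".join(render_object(kind, d) for kind, d in skeleton)
print(config_text)
```

Adding another similar host then means adding one short host object and one name in the hostgroup's members; templates and services stay untouched.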
14. Create Leverage
#1: Create Host Templates: Core
define host {
name lx_dev_ht
alias Linux Dev Host Template
check_command check_icmp!200.0,30%!300.0,40%!!!!!!
max_check_attempts 5
check_interval 5
retry_interval 1
check_period 24x7
contact_groups lx_dev_cg
notification_interval 300
notification_period workhours
register 0
}
16. Create Leverage
#2: Create Hostgroups: Core
define hostgroup {
hostgroup_name lx_mysql_prod_hg
alias Linux MySQL Prod Hostgroup
members lx56prodmt,lx67prodmt
}
define hostgroup {
hostgroup_name lx_prod_hg
alias Linux Prod Hostgroup
members lx56prodmt,lx67prodmt
}
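Because both hostgroups list the same members, each host receives the checks of every group it belongs to. A minimal Python sketch of that effect follows; the hostgroup membership is taken from the definitions above, while the assignment of services to hostgroups is an assumption made for illustration (the service names come from the services slide).

```python
# Hostgroup membership as defined on the slide.
hostgroups = {
    "lx_mysql_prod_hg": ["lx56prodmt", "lx67prodmt"],
    "lx_prod_hg": ["lx56prodmt", "lx67prodmt"],
}

# Assumed assignment of services to hostgroups (illustrative only).
services_by_hostgroup = {
    "lx_prod_hg": ["lx_cpu_os", "lx_mem_os", "lx_load_os"],
    "lx_mysql_prod_hg": ["lx_connect_time_mysql", "lx_uptime_mysql"],
}


def services_for_host(host):
    """Collect every service a host receives through its hostgroups."""
    result = []
    for group, members in hostgroups.items():
        if host in members:
            result.extend(services_by_hostgroup.get(group, []))
    return sorted(result)


print(services_for_host("lx67prodmt"))
```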
24. Maintain Simplicity
• 5 Objects Model
Host Template
Hostgroup
Service Template
Services
Hosts
• Troubleshooting
No Nesting
• Training
25. Manage Exceptions
• Local Settings
Since local settings are empty, you can use those settings to override the template.
• 2 Objects Model
Service Template
Services
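The override rule can be sketched in a few lines of Python: a value set locally on a service wins over the value inherited from its service template. The `effective_settings` helper and the template values are assumptions for illustration, not settings taken from the slides.

```python
def effective_settings(template, local):
    """Merge template and local directives; a local value overrides the template."""
    merged = dict(template)
    merged.update(local)
    return merged


# Assumed service template values, for illustration only.
lx_os_prod_st = {
    "check_interval": "5",
    "retry_interval": "1",
    "max_check_attempts": "5",
}

# Normal service: no local settings, so everything comes from the template.
normal = effective_settings(lx_os_prod_st, {})

# Exception: one local setting overrides the template for this service only.
exception = effective_settings(lx_os_prod_st, {"check_interval": "1"})

print(normal["check_interval"], exception["check_interval"])
```

Keeping local settings empty by default means any local value found later is, by definition, a deliberate exception.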
30. Service Check Database
• Searchable Database
Searchable by Type (WMI,NRPE,etc)
Searchable by Group (OS Metrics, SQL,Oracle,Apache, etc.)
Searchable by OS (Windows ,Linux,etc.)
Searchable by Version (Windows 2012,CentOS 6.x,etc.)
• Explanation of Variables
• Copy and Paste Scripts/Checks
• Image Example: XI
• Textual Example: Core
37. Service Check Database
Current Projects
• WMI (check_wmi_plus v.1.59 = 70 checks)
goal: discover what is available
currently working on a script to check the availability of WMI classes
currently writing ini files for WMI classes
focus on 2012 servers
• Oracle Database
• SQL Database
40. Rapid Deployment: Core
The IP addresses are those of the router and switch in each store, so that MRTG can build its config.
IP_ADDR1=10.10.1.1 IP_ADDR=10.10.1.2 ./create_stores.sh \
files/st00260.csv /usr/local/nagios/etc/objects/stores/st00260.cfg \
/usr/local/nagios/etc/objects/stores/store_hostgroups.cfg
42. Rapid Deployment: Core
##### Store Templates Used to Build Configs
define host{
use kiosk
host_name HOST_NAME
alias HOST_NAME
address IP_ADDRESS
parents PARENT
}
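The placeholders HOST_NAME, IP_ADDRESS and PARENT are filled in once per host. The slides do not show create_stores.sh or the layout of the store CSV files, so the following Python sketch assumes rows of the form host_name,ip_address,parent and plain string substitution; the sample rows and the `build_hosts` helper are hypothetical.

```python
import csv
import io

HOST_TEMPLATE = """define host{
    use         kiosk
    host_name   HOST_NAME
    alias       HOST_NAME
    address     IP_ADDRESS
    parents     PARENT
}"""

# Hypothetical CSV content; the real column layout of the store files is not shown.
sample_csv = (
    "st00260-kiosk1,10.10.1.21,st00260-switch\n"
    "st00260-kiosk2,10.10.1.22,st00260-switch\n"
)


def build_hosts(csv_text, template=HOST_TEMPLATE):
    """Return one host definition per CSV row by replacing the placeholders."""
    blocks = []
    for host_name, ip_address, parent in csv.reader(io.StringIO(csv_text)):
        block = (template.replace("HOST_NAME", host_name)
                         .replace("IP_ADDRESS", ip_address)
                         .replace("PARENT", parent))
        blocks.append(block)
    return blocks


hosts = build_hosts(sample_csv)
print("\n\n".join(hosts))
```

The replacement is case-sensitive, so the lowercase directive names `host_name` and `parents` are left alone.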
43. Rapid Deployment: Core
define hostgroup{
hostgroup_name hg_aps
alias Store Access Points
members
}
define hostgroup{
hostgroup_name hg_pcs
alias Store PCs
members
}
define hostgroup{
hostgroup_name hg_kiosks
alias Store Kiosks
members
}
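The empty members lines are what the deployment script has to fill in. How the original script decides which hostgroup a device belongs to is not shown; the Python sketch below assumes each host arrives already paired with its hostgroup, and the host names and the `fill_members` helper are hypothetical.

```python
from collections import defaultdict

HOSTGROUP_TEMPLATE = """define hostgroup{{
    hostgroup_name  {name}
    alias           {alias}
    members         {members}
}}"""

# Hostgroup names and aliases as they appear on the slide.
ALIASES = {
    "hg_aps": "Store Access Points",
    "hg_pcs": "Store PCs",
    "hg_kiosks": "Store Kiosks",
}

# Hypothetical (host_name, hostgroup) pairs for one store.
store_hosts = [
    ("st00260-ap1", "hg_aps"),
    ("st00260-pc1", "hg_pcs"),
    ("st00260-kiosk1", "hg_kiosks"),
    ("st00260-kiosk2", "hg_kiosks"),
]


def fill_members(host_pairs):
    """Group hosts by hostgroup and render each hostgroup with its members."""
    members = defaultdict(list)
    for host_name, group in host_pairs:
        members[group].append(host_name)
    return [
        HOSTGROUP_TEMPLATE.format(name=name, alias=alias, members=",".join(members[name]))
        for name, alias in ALIASES.items()
    ]


blocks = fill_members(store_hosts)
print("\n\n".join(blocks))
```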
53. XI: Script
Format:
ip_address,hostname,alias,'hostgroup,hostgroup,hostgroup',host template,host parents
One Hostgroup
The hostgroups should end in "hg".
192.168.5.157,lx67prodmt,MySQL_Server,lx_prod_hg,lx_prod_ht,
Two Hostgroups (Note the use of ' to enclose the hostgroups)
192.168.5.157,lx67prodmt,MySQL_Server,'lx_prod_hg,lx_mysql_prod_hg',lx_prod_ht,
Two Hostgroups and Parents Setting
192.168.5.157,lx67prodmt,MySQL_Server,'lx_prod_hg,lx_mysql_prod_hg',lx_prod_ht,cisco_sm300
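The import format can be read with a standard CSV parser by treating the single quote as the quote character. The XI script itself is not shown on the slides, so this Python sketch only illustrates how such lines could be parsed; the field names and the `parse_import_line` helper are assumptions.

```python
import csv

FIELDS = ["ip_address", "hostname", "alias", "hostgroups", "host_template", "parents"]


def parse_import_line(line):
    """Parse one import line; hostgroup lists are enclosed in single quotes."""
    row = next(csv.reader([line], quotechar="'"))
    # Tolerate missing or extra trailing empty fields.
    row = (row + [""] * len(FIELDS))[:len(FIELDS)]
    record = dict(zip(FIELDS, row))
    record["hostgroups"] = [g for g in record["hostgroups"].split(",") if g]
    # The slide asks for hostgroup names ending in "hg".
    bad = [g for g in record["hostgroups"] if not g.endswith("hg")]
    if bad:
        raise ValueError(f"hostgroup names should end in 'hg': {bad}")
    return record


one = parse_import_line("192.168.5.157,lx67prodmt,MySQL_Server,lx_prod_hg,lx_prod_ht,,")
two = parse_import_line(
    "192.168.5.157,lx67prodmt,MySQL_Server,'lx_prod_hg,lx_mysql_prod_hg',lx_prod_ht,cisco_sm300"
)
print(one["hostgroups"], two["hostgroups"], two["parents"])
```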
54. XI: Script
Hostgroups
lx_prod_hg Linux Production Hostgroup
lx_dev_hg Linux Development Hostgroup
Host Templates
lx_prod_ht Linux Production Host Template
lx_dev_ht Linux Development Host Template
Service Templates
lx_os_prod_st Linux OS Metrics Production Service Template
lx_mysql_prod_st Linux MySQL Production Service Template
lx_apache_prod_st Linux Apache Production Service Template
lx_defense_prod_st Linux Defensive Metrics Production Service Template
lx_os_dev_st Linux OS Metrics Development Service Template
lx_mysql_dev_st Linux MySQL Development Service Template
lx_apache_dev_st Linux Apache Development Service Template
lx_defense_dev_st Linux Defensive Metrics Development Service Template
55. XI: Script
Services
lx_cpu_os Linux CPU
lx_mem_os Linux Memory
lx_root_part_os Linux / Partition
lx_home_part_os Linux /boot Partition
lx_processes_os Linux Processes
lx_ssh_os Linux SSH
lx_load_os Linux Load
lx_files_os Linux Files
lx_users_os Linux Users
lx_cron_os Linux Cron
lx_rsyslog_os Linux rsyslog
lx_connect_time_mysql MySQL Connect Time
lx_open_conn_mysql MySQL Open Connections
lx_thr_cache_hit_mysql MySQL Thread Cache Hit Ratio
lx_tab_cache_hit_mysql MySQL Table Cache Hit Rate
lx_slow_queries_mysql MySQL Slow Queries
lx_long_run_proc_mysql MySQL Long Running Processes
lx_uptime_mysql MySQL Uptime
56. XI: Script
Contact Groups
lx_prod_cg Linux Production Contact Group
lx_dev_cg Linux Development Contact Group
lx_mysql_prod_cg Linux MySQL Production Contact Group
lx_mysql_dev_cg Linux MySQL Development Contact Group
lx_apache_prod_cg Linux Apache Production Contact Group
lx_apache_dev_cg Linux Apache Development Contact Group
lx_defense_cg Linux Defense Contact Group
57. XI: Script
Output:
Running preflight check on configuration data...
Checking objects...
Checked 311 services.
...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the preflight check
RET: 0
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.
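A deployment script can gate the restart on the totals reported by the configuration check. The following Python sketch reads the "Total Warnings" and "Total Errors" lines from output like the above; the `preflight_ok` helper is hypothetical and is not part of the scripts presented here.

```python
import re


def preflight_ok(output):
    """Return True when the output reports zero warnings and zero errors."""
    warnings = re.search(r"Total Warnings:\s*(\d+)", output)
    errors = re.search(r"Total Errors:\s*(\d+)", output)
    if warnings is None or errors is None:
        return False  # totals missing: treat as a failed check
    return int(warnings.group(1)) == 0 and int(errors.group(1)) == 0


sample_output = """Checked 311 services.
Total Warnings: 0
Total Errors: 0
RET: 0
"""

print(preflight_ok(sample_output))
```

Treating missing totals as a failure is a conservative choice: a truncated or unexpected output should not trigger a restart.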
58. Script Downloads
Rapid Deployment Scripts:
Beginlinuxservers.com/nagiosconf
User: nagiosconference
Password: 54TBwh9
Only available during conference.