2. Hyves Infrastructure
3000+ Gentoo servers
190 function groups/types
3 datacenters
Database for server management
3. Using Puppet
Since: January 2007
Puppetmasters: 3
  Load-balanced, 1 for CA and development
Version: 2.6.1
MySQL backend for (thin_)storeconfigs
Nginx + 8 Mongrel instances per server
100+ modules
nodes.rb uses management database
Puppet run every morning on every server
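The slide only says that nodes.rb uses the management database. Assuming nodes.rb works as a Puppet external node classifier (a script that prints YAML with `classes` and `parameters` for a given node name), a minimal sketch could look like the following. The `MANAGEMENT_DB` hash, the `role::` class naming and the `datacenter` parameter are hypothetical stand-ins, not taken from the Hyves setup.

```ruby
require 'yaml'

# Hypothetical stand-in for the management database: hostname => record.
MANAGEMENT_DB = {
  'web001' => { 'function_group' => 'webserver', 'datacenter' => 'dc1' },
  'db001'  => { 'function_group' => 'database',  'datacenter' => 'dc2' }
}

# Build the YAML document an external node classifier prints for one node.
# Returns nil for an unknown node so the caller can exit non-zero.
def classify(hostname, db = MANAGEMENT_DB)
  record = db[hostname]
  return nil if record.nil?
  {
    'classes'    => ["role::#{record['function_group']}"],  # assumed class naming
    'parameters' => { 'datacenter' => record['datacenter'] }
  }.to_yaml
end

puts classify('web001')
```

A real classifier would take the hostname from its first command-line argument and query the database instead of a hash.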
4. Nagios
8 Nagios hosts in distributed setup
2500 hosts
1 Nagios master for web and alerting
Scripts to generate configuration
Management database for information
Templates for service checks
Died during large fallouts
5. Icinga
Switched to Icinga November 2010
Distributed Icinga setup doesn’t require centralized host
Very fast standalone Icinga-web interface
Uses database backend
REST API
Switching was easy due to similar configuration
6. Current monitoring setup
Monitoring hosts: 12 (4 per DC)
Services: over 83,000
Hosts: nearly 3,500
Average check interval: every 5 min
NOC monitoring host: 1
Overview checks using API
Commandline interface
7. Problems with monitoring
Adding new checks meant manually editing a lot of templates
Things that should be monitored aren’t
Won’t realize it until it’s too late
No monitoring makes it harder to find the problem
8. Using Puppet to configure Icinga
Puppet knows it all so why not use that information?
Exported resources from Naginator to define monitoring checks
Include the monitoring definitions in profiles
Running Puppet defines all necessary monitoring checks for that host
9. Example
modules/monitoring/manifests/init.pp:
class monitoring {
  service { "nrpe":
    ensure => running,
    enable => true
  }
  @@nagios_host { "$hostname":
    address => $ip
  }
  # Appending $hostname in the nagios_service definition
  # prevents duplicate definitions on monitoring hosts
  @@nagios_service { "NRPE $hostname":
    service_description => "NRPE",
    check_command => "check_nrpe_scripts",
  }
}
10. Example Nginx
Automatically create HTTP checks when including Nginx
modules/nginx/manifests/init.pp:
class nginx {
  service { "nginx":
    ensure => running,
    enable => true
  }
  @@nagios_service { "HTTP $hostname":
    service_description => "HTTP",
    check_command => "check_http",
    event_handler => "service_restart!nginx",
    contact_groups => "admins_email, admins_sms"
  }
}
13. Why not Tags?
Using “notes” to assign monitoring host
Tagging caused problems when setting require in Nagios_host and Nagios_service
Tagging meant redefining, it’s not inherited
Solution: stages (?)
15. Deploying monitoring
Deploy script starts Puppet run on all monitoring hosts
Threaded with small sleep in between start to prevent thundering herd on Puppet masters
Waits for all puppet runs to finish and reports whether they were successful or not
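The deploy script itself is not shown on the slides. As a rough sketch of the pattern described (one thread per monitoring host, a short sleep between starts, then wait for all runs and report), something like the following could work. The `deploy` method, the `runner` callable and the host names are hypothetical; a real runner would trigger the Puppet run on the host and return whether it succeeded.

```ruby
# Start one thread per monitoring host, staggering the starts so the
# Puppet masters are not hit by every host at the same moment.
# `runner` is a hypothetical callable that returns true on success.
def deploy(hosts, stagger_seconds, runner)
  threads = hosts.map do |host|
    thread = Thread.new(host) { |h| [h, runner.call(h)] }
    sleep(stagger_seconds)  # small pause before starting the next run
    thread
  end

  # Wait for every run to finish and collect host => success.
  results = {}
  threads.each do |t|
    host, ok = t.value      # Thread#value joins the thread
    results[host] = ok
  end
  results
end

# Stub runner for illustration only: pretend the run on mon03 fails.
stub = lambda { |host| host != 'mon03' }
deploy(%w[mon01 mon02 mon03], 0.01, stub).each do |host, ok|
  puts "#{host}: #{ok ? 'success' : 'FAILED'}"
end
```

Starting the threads first and joining afterwards keeps the total time close to the slowest single run plus the accumulated stagger delay.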
16. Downsides
Puppet run on Icinga hosts takes about 20 minutes (using separate config files for each host helps)
Modifying a servicecheck requires a puppet run on all hosts with that servicecheck (solution: use --noop)
Cleaning up old resources
18. What if something isn’t running Puppet?
Configcheck check
Compares management database with Icinga API
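The slides do not show how the Configcheck is implemented. At its core the comparison is a set difference between the hosts in the management database and the hosts Icinga knows about. A minimal sketch, assuming both sides have already been fetched as plain lists of hostnames (the `config_diff` name and the sample hosts are hypothetical):

```ruby
require 'set'

# Compare hostnames from the management database with hostnames from Icinga.
# Fetching the two lists (database query, Icinga API call) is site-specific
# and left out of this sketch.
def config_diff(management_hosts, icinga_hosts)
  managed   = Set.new(management_hosts)
  monitored = Set.new(icinga_hosts)
  {
    :unmonitored => (managed - monitored).to_a.sort,  # in the database, missing in Icinga
    :stale       => (monitored - managed).to_a.sort   # in Icinga, no longer in the database
  }
end

diff = config_diff(%w[web001 web002 db001], %w[web001 db001 old042])
puts "Not monitored: #{diff[:unmonitored].join(', ')}"
puts "Stale in Icinga: #{diff[:stale].join(', ')}"
```

A host that shows up as unmonitored is a likely candidate for a server that is not running Puppet.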
19. Other cool stuff
Generating daemon checks
modules/role/lib/facter/customfacters.rb:
Facter.add("hyves_daemons") do
  setcode do
    conf = "/<path_to_config>/daemons.conf"
    # Default value when the host has no daemons.conf
    daemons = ["None"]
    if File.exist?(conf)
      # Take the value of every "name:" line in daemons.conf
      daemons = File.readlines(conf).grep(/name/).map do |line|
        line.sub(/.*name:/, '').strip
      end
    end
    daemons.uniq
  end
end
21. Other cool stuff
Generating overview daemon checks
require 'net/http'

module Puppet::Parser::Functions
  newfunction(:get_daemons, :type => :rvalue, :doc => "
    This function returns an array of all current hyves_autodaemons, based on the Icinga API
  ") do |args|
    domain = "<domain_of_icinga_web>"
    url = "/icinga-web/web/api/service/filter[AND(SERVICE_NAME%7Clike%7C*Daemon)]/columns[SERVICE_NAME]/order[SERVICE_NAME;ASC]/authkey=<api_key>/json"
    # Query the Icinga-web REST API for all services whose name ends in "Daemon"
    response = Net::HTTP.get_response(domain, url)
    data = response.body
    results = PSON.parse(data)
    daemons = Array.new
    results.each { |result|
      daemon = result['SERVICE_NAME']
      # Strip the " Daemon" suffix so only the daemon name remains
      daemon.sub!(/ Daemon/, '')
      daemons << daemon
    }
    daemons.uniq
  end
end
22. Other cool stuff
Generating overview daemon checks
modules/icinga/manifests/noc.pp:
$__daemons = get_daemons()
templatefile { "/etc/icinga/puppetgenerated/other/daemons.cfg":
  template => template("icinga/daemons.cfg.erb")
}
hyvesdaemons.cfg.erb:
define host{
  use generic-host
  host_name daemons
  alias daemons
  address www.hyves.nl
}
<% __daemons.each do |daemon| -%>
define service{
  use DaemonOverview-check
  host_name daemons
  service_description <%= daemon %>
}
<% end -%>
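To see what the service part of the template expands to without a Puppet run, it can be rendered with Ruby's standard ERB library. This is only a preview sketch: the local variable is called `daemons` here instead of the Puppet-scoped `__daemons`, the `render_services` helper and the sample daemon names are hypothetical, and the `trim_mode:` keyword needs Ruby 2.6 or newer.

```ruby
require 'erb'

# Service section of the template, with a plain local variable `daemons`
# standing in for the Puppet variable used on the slide.
SERVICE_TEMPLATE = <<~'ERB_SOURCE'
  <% daemons.each do |daemon| -%>
  define service{
    use DaemonOverview-check
    host_name daemons
    service_description <%= daemon %>
  }
  <% end -%>
ERB_SOURCE

# Render one "define service" block per daemon name.
def render_services(daemons)
  # trim_mode '-' makes "-%>" swallow the newline after the tag
  ERB.new(SERVICE_TEMPLATE, trim_mode: '-').result(binding)
end

puts render_services(%w[Mailer Indexer])
```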
23. The End
Questions?
Remarks?
Ideas?
Jeffrey Lensen | System Engineer | jeffrey@hyves.nl