Welcome

Jeffrey Lensen
System Engineer

Hyves Infrastructure

3000+ Gentoo servers
190 functiongroups/types
3 datacenters
Database for server management

Using Puppet

Since: January 2007
Puppetmasters: 3 load-balanced, 1 for CA and development
Version: 2.6.1
MySQL backend for (thin_)storeconfigs
Nginx + 8 Mongrel instances per server
100+ modules
nodes.rb uses management database
Puppet run every morning on every server

Nagios

8 Nagios hosts in distributed setup
2500 hosts
1 Nagios master for web and alerting
Scripts to generate configuration
Management database for information
Templates for service checks

Died during large fallouts

Icinga

Switched to Icinga November 2010

Distributed Icinga setup doesn't require centralized host
Very fast standalone Icinga-web interface
Uses database backend
REST API
Switching was easy due to similar configuration

Current monitoring setup

Monitoring hosts: 12 (4 per DC)
Services: over 83,000
Hosts: nearly 3,500
Average check interval: every 5 min
NOC monitoring host: 1
Overview checks using API
Command-line interface

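To put these figures in perspective, here is a rough back-of-the-envelope calculation of the check throughput they imply. It assumes the checks are spread evenly over time and over the 12 monitoring hosts, which the slides do not state.

```ruby
# Figures taken from the slide; the even spread is an assumption.
services         = 83_000  # "over 83,000" services
interval_seconds = 5 * 60  # average check interval of 5 minutes
monitoring_hosts = 12      # 4 per datacenter, 3 datacenters

checks_per_second = services.to_f / interval_seconds
per_host          = checks_per_second / monitoring_hosts

puts format("total:    %.1f checks/s", checks_per_second)
puts format("per host: %.1f checks/s", per_host)
```
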
Problems with monitoring

Adding new checks meant manually editing a lot of templates

Things that should be monitored aren't
Won't realize it until it's too late
No monitoring makes it harder to find the problem

Using Puppet to configure Icinga

Puppet knows it all, so why not use that information?

Exported resources from Naginator to define monitoring checks

Include the monitoring definitions in profiles

Running Puppet defines all necessary monitoring checks for that host

Example

 modules/monitoring/manifests/init.pp:

 class monitoring {
    service { "nrpe":
       ensure => running,
       enable => true
    }
    @@nagios_host { "$hostname":
       address => $ip
    }
    # Appending $hostname in nagios_service definition to
    # prevent duplicate definitions on monitoring hosts
    @@nagios_service { "NRPE $hostname":
       service_description => "NRPE",
       check_command => "check_nrpe_scripts",
    }
 }

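The remark about appending $hostname can be illustrated with a small Ruby model. This is only a sketch: the host names are made up, and the `duplicates` helper is a hypothetical stand-in for the duplicate-definition error a monitoring host would hit when collecting resources that share a title.

```ruby
# Hypothetical hosts that all include the monitoring class.
hosts = %w[web1 web2 web3]

# Without the hostname, every host exports the same resource title.
plain_titles = hosts.map { |_host| "NRPE" }
# With the hostname appended, as on the slide, each title is unique.
suffixed_titles = hosts.map { |host| "NRPE #{host}" }

# Returns the titles that occur more than once.
def duplicates(titles)
  titles.group_by { |title| title }
        .select { |_title, group| group.size > 1 }
        .keys
end

puts "plain:    #{duplicates(plain_titles).inspect}"
puts "suffixed: #{duplicates(suffixed_titles).inspect}"
```
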
Example Nginx

Automatically create HTTP checks when including Nginx

modules/nginx/manifests/init.pp:

class nginx {
   service { "nginx":
      ensure => running,
      enable => true
   }

   @@nagios_service { "HTTP $hostname":
      service_description => "HTTP",
      check_command => "check_http",
      event_handler => "service_restart!nginx",
      contact_groups => "admins_email, admins_sms"
   }
}

Predefining and distributing

manifests/defines.pp:

$__notifications_enabled = $systemstatus ? {
    operational => "1",
    fail => "0"
}

Nagios_host {
  ensure => present,
  host_name => "${hostname}.${domain}",
  hostgroups => $role,
  use => "generic-host", # our standard template
  alias => $hostname,
  notifications_enabled => $__notifications_enabled,
  target => "/etc/icinga/puppetgenerated/hosts/$hostname.cfg",
  notes => $monitoringhost
}

Nagios_service {
  ensure => present,
  host_name => "${hostname}.${domain}",
  use => "generic-service", # our standard template
  notifications_enabled => $__notifications_enabled,
  target => "/etc/icinga/puppetgenerated/services/$hostname.cfg",
  notes => $monitoringhost
}

Retrieving exported resources




         modules/icinga/manifests/init.pp:

         class icingacollect {
            Nagios_host <<| notes == "$hostname" |>> {
               require => File["/etc/icinga/puppetgenerated/hosts"]
            }
            Nagios_service <<| notes == "$hostname" |>> {
               require => File["/etc/icinga/puppetgenerated/services"]
            }
         }
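
As a rough mental model of what the collectors above do, the following Ruby sketch filters a list of exported resources by their notes value. The `ExportedResource` struct, the sample rows and the `collect` helper are all hypothetical; Puppet itself performs this query against the storeconfigs database.

```ruby
# Hypothetical stand-in for rows in the storeconfigs database.
ExportedResource = Struct.new(:type, :title, :notes)

exported = [
  ExportedResource.new("nagios_host",    "web1",      "mon1"),
  ExportedResource.new("nagios_service", "NRPE web1", "mon1"),
  ExportedResource.new("nagios_host",    "db1",       "mon2"),
  ExportedResource.new("nagios_service", "NRPE db1",  "mon2")
]

# Rough equivalent of: Nagios_service <<| notes == "$hostname" |>>
def collect(exported, type, monitoring_host)
  exported.select { |r| r.type == type && r.notes == monitoring_host }
end

collect(exported, "nagios_service", "mon1").each { |r| puts r.title }
```

Each monitoring host therefore only realizes the hosts and services whose notes field names it, which is how the checks are divided over the monitoring hosts.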




Why not Tags?

Using "notes" to assign monitoring host

Tagging caused problems when setting require in Nagios_host and Nagios_service

Tagging meant redefining, it's not inherited

Solution: stages (?)

Fail-safes

     modules/icinga/manifests/init.pp:

     class icinga {
        include icingacollect

         exec { "verify new cfg":
           command => "/usr/bin/icinga -v /etc/icinga/verify-puppetgenerated.cfg",
           require => Class["icingacollect"]
         }

         exec { "mv cfgs":
           command => "rm -rf /etc/icinga/puppet/* ; mv /etc/icinga/puppetgenerated/* /etc/icinga/puppet/",
           require => Exec["verify new cfg"]
         }

         exec { "restart icinga":
           command => "/usr/bin/printf '[] RESTART_PROGRAM\n' > /var/icinga/rw/icinga.cmd",
           require => [
              Exec["mv cfgs"],
              Service["icinga"]
           ]
         }
     }
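
The verify, move, restart chain can be summarized as "only replace the live configuration when the new one validates". Below is a minimal Ruby sketch of that idea, not the author's implementation: `promote_config` and the `verifier` callable are hypothetical, with the callable standing in for the `icinga -v` run, and temporary directories replacing the real paths.

```ruby
require "fileutils"
require "tmpdir"

# Replace the live config with the staged one only if the verifier approves.
def promote_config(staging_dir, live_dir, verifier)
  return false unless verifier.call(staging_dir)

  FileUtils.rm_rf(Dir.glob(File.join(live_dir, "*")))
  FileUtils.mv(Dir.glob(File.join(staging_dir, "*")), live_dir)
  true
end

Dir.mktmpdir do |root|
  staging = File.join(root, "puppetgenerated")
  live    = File.join(root, "puppet")
  FileUtils.mkdir_p([staging, live])
  File.write(File.join(live, "old.cfg"), "define host{}\n")
  File.write(File.join(staging, "new.cfg"), "define host{}\n")

  # A failing verification leaves the live directory untouched.
  promote_config(staging, live, ->(_dir) { false })
  puts "after failed verify:  #{Dir.children(live).inspect}"

  # A passing verification swaps in the staged files.
  promote_config(staging, live, ->(_dir) { true })
  puts "after passing verify: #{Dir.children(live).inspect}"
end
```
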




Deploying monitoring

Deploy script starts Puppet run on all monitoring hosts

Threaded, with a small sleep between starts to prevent a thundering herd on the Puppet masters

Waits for all Puppet runs to finish and reports whether they were successful or not

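The deploy script itself is not shown in the slides. A minimal Ruby sketch of the described behaviour could look like the following; the `deploy` method, the `runner` callable and the host names are assumptions, and a real script would presumably trigger the Puppet run remotely instead of calling a lambda.

```ruby
# Start one run per host, staggered, then wait for all and collect the results.
# `runner` is a hypothetical callable that returns true on success.
def deploy(hosts, stagger_seconds, runner)
  threads = hosts.map do |host|
    thread = Thread.new(host) { |h| [h, runner.call(h)] }
    sleep(stagger_seconds) # small pause so the Puppet masters are not hit all at once
    thread
  end
  # Thread#value waits for the thread to finish and returns its result.
  threads.map(&:value).to_h
end

fake_runner = ->(host) { host != "mon3" } # pretend the run on mon3 fails
results = deploy(%w[mon1 mon2 mon3], 0.01, fake_runner)
results.each { |host, ok| puts "#{host}: #{ok ? 'success' : 'FAILED'}" }
```
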
Downsides

Puppet run on Icinga hosts takes about 20 minutes
(using separate config files for each host helps)

Modifying a servicecheck requires a puppet run on all hosts with that servicecheck
(solution: use --noop)

Cleaning up old resources

Cleaning up

   # Purge the exported Nagios resources of a host that is being removed
   fqdn="${host_to_be_removed}.${domain}"
   puppet apply \
      --certname "$fqdn" \
      --node_name facter \
      --thin_storeconfigs \
      $dbsettings \
      --execute 'resources { ["nagios_service","nagios_host"]: purge => true }'

What if something isn't running Puppet?

Configcheck check
Compares management database with Icinga API

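The comparison itself boils down to a set difference. The sketch below uses two hard-coded host lists as placeholders; in the setup described here, one list would come from the management database and the other from the Icinga API. The `config_drift` helper is hypothetical.

```ruby
# Compare the hosts that should be monitored with the hosts Icinga knows about.
def config_drift(expected_hosts, monitored_hosts)
  {
    missing_from_icinga: (expected_hosts - monitored_hosts).sort,
    unknown_to_database: (monitored_hosts - expected_hosts).sort
  }
end

drift = config_drift(%w[web1 web2 db1], %w[web1 db1 old9])
puts "not monitored:   #{drift[:missing_from_icinga].join(', ')}"
puts "not in database: #{drift[:unknown_to_database].join(', ')}"
```
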
Other cool stuff

Generating daemon checks

modules/role/lib/facter/customfacters.rb:

Facter.add("hyves_daemons") do
  setcode do
    # Default value for hosts without a daemons.conf
    daemons = ["None"]
    if File.exist?("/<path_to_config>/daemons.conf")
      daemons = []
      # Every line containing "name" describes one daemon
      daemonconf = %x{grep name /<path_to_config>/daemons.conf}
      daemonconf.each_line do |daemon|
        # Strip everything up to and including " name:"
        daemons.push(daemon.sub(/.* name:/, '').chomp)
      end
    end
    daemons.uniq
  end
end

Other cool stuff

Generating daemon checks

              modules/daemons/manifests/init.pp:

              class daemons {
                 define add_daemon_check {
                    @@nagios_service { "$name Daemon $hostname":
                       use => "Daemon-check",
                       service_description => "$name Daemon",
                       check_command => "check_daemon!$name"
                    }
                 }

                   add_daemon_check { $hyves_daemons: }
              }




Other cool stuff

Generating overview daemon checks

require 'net/http'

module Puppet::Parser::Functions
  newfunction(:get_daemons, :type => :rvalue, :docs => "
     This function returns an array of all current hyves_autodaemons, based on the Icinga API
  ") do |args|

     domain = "<domain_of_icinga_web>"
     url = "/icinga-web/web/api/service/filter[AND(SERVICE_NAME%7Clike%7C*Daemon)]/columns[SERVICE_NAME]/" +
           "order[SERVICE_NAME;ASC]/authkey=<api_key>/json"
     response = Net::HTTP.get_response(domain, url)
     data = response.body
     results = PSON.parse(data)
     daemons = Array.new
     results.each { |result|
        daemon = result['SERVICE_NAME']
        daemon.sub!(/ Daemon/, '')
        daemons << daemon
     }

    daemons.uniq
  end
end
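
To see what the function returns, the parsing step can be tried in isolation. The sketch below uses Ruby's standard JSON library instead of PSON and a made-up response body. The array-of-hashes shape follows from how the function iterates over the result; the daemon names and the `daemon_names` helper are invented for illustration.

```ruby
require "json"

# Hypothetical API response: one row per service, several hosts per daemon.
sample_body = '[{"SERVICE_NAME":"chat Daemon"},' \
              '{"SERVICE_NAME":"mail Daemon"},' \
              '{"SERVICE_NAME":"chat Daemon"}]'

# Same steps as the parser function: take the name, strip " Daemon", deduplicate.
def daemon_names(body)
  JSON.parse(body)
      .map { |row| row["SERVICE_NAME"].sub(/ Daemon/, "") }
      .uniq
end

puts daemon_names(sample_body).inspect
```
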




Other cool stuff

Generating overview daemon checks

                   modules/icinga/manifests/noc.pp:

                   $__daemons = get_daemons()
                   templatefile { "/etc/icinga/puppetgenerated/other/daemons.cfg":
                     template => template("icinga/daemons.cfg.erb")
                   }

                   hyvesdaemons.cfg.erb:

                   define host{
                     use             generic-host
                     host_name       daemons
                     alias           daemons
                     address         www.hyves.nl
                   }

                   <% __daemons.each do |daemon| -%>
                   define service{
                     use                  DaemonOverview-check
                     host_name            daemons
                     service_description   <%= daemon %>
                   }
                   <% end -%>
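
The service loop of this template can be rendered outside Puppet with Ruby's standard ERB library, which is a quick way to preview the generated configuration. This is only a sketch: the daemon list is made up, and the variable is a plain local named `daemons` instead of the Puppet variable used above.

```ruby
require "erb"

# Same loop as in the template, with the daemon list passed in as a local.
template = <<~TEMPLATE
  <% daemons.each do |daemon| -%>
  define service{
    use                  DaemonOverview-check
    host_name            daemons
    service_description  <%= daemon %>
  }
  <% end -%>
TEMPLATE

daemons = %w[chat mail]
# trim_mode "-" makes the "-%>" tags swallow their trailing newline.
puts ERB.new(template, trim_mode: "-").result(binding)
```
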



The End

Questions?
Remarks?
Ideas?

Jeffrey Lensen | System Engineer | jeffrey@hyves.nl

Distributed monitoring at Hyves - Puppet

  • 1. Welcome
       Jeffrey Lensen, System Engineer
  • 2. Hyves Infrastructure
       3000+ Gentoo servers
       190 functiongroups/types
       3 datacenters
       Database for server management
  • 3. Using Puppet
       Since: January 2007
       Puppetmasters: 3 load-balanced, 1 for CA and development
       Version: 2.6.1
       MySQL backend for (thin_)storeconfigs
       Nginx + 8 Mongrel instances per server
       100+ modules
       nodes.rb uses management database
       Puppet run every morning on every server
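
The slide does not show nodes.rb itself. Assuming it works as a Puppet external node classifier (a script that receives a node name and prints YAML with `classes` and `parameters`), a minimal sketch might look like the following. The hash standing in for the management database, the host name, and the choice of the role as the class name are all hypothetical; the parameter names mirror the `$role`, `$systemstatus` and `$monitoringhost` variables used in the manifests in this deck.

```ruby
require 'yaml'

# Hypothetical stand-in for the management database; a real nodes.rb
# would query the database instead of reading a hard-coded hash.
MANAGEMENT_DB = {
  "web1.example.com" => {
    "role"           => "webserver",
    "systemstatus"   => "operational",
    "monitoringhost" => "mon1"
  }
}.freeze

# Build the YAML an external node classifier hands back to Puppet:
# a "classes" list plus "parameters" that become top-scope variables.
# Returns nil for hosts the database does not know.
def classify(fqdn, db = MANAGEMENT_DB)
  record = db[fqdn]
  return nil if record.nil?

  {
    "classes"    => [record["role"]],
    "parameters" => {
      "role"           => record["role"],
      "systemstatus"   => record["systemstatus"],
      "monitoringhost" => record["monitoringhost"]
    }
  }.to_yaml
end

puts classify("web1.example.com")
```

Keeping the lookup in one function makes it easy to swap the hash for a real database query without touching the output format.
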
  • 4. Nagios
       8 Nagios hosts in distributed setup
       2500 hosts
       1 Nagios master for web and alerting
       Scripts to generate configuration
       Management database for information
       Templates for service checks
       Died during large fallouts
  • 5. Icinga
       Switched to Icinga November 2010
       Distributed Icinga setup doesn’t require centralized host
       Very fast standalone Icinga-web interface
       Uses database backend
       REST API
       Switching was easy due to similar configuration
  • 6. Current monitoring setup
       Monitoring hosts: 12 (4 per DC)
       Services: over 83,000
       Hosts: nearly 3,500
       Average check interval: every 5 min
       NOC monitoring host: 1
       Overview checks using API
       Command-line interface
  • 7. Problems with monitoring
       Adding new checks meant manually editing a lot of templates
       Things that should be monitored aren’t
       Won’t realize it until it’s too late
       No monitoring makes it harder to find the problem
  • 8. Using Puppet to configure Icinga
       Puppet knows it all, so why not use that information?
       Exported resources from Naginator to define monitoring checks
       Include the monitoring definitions in profiles
       Running Puppet defines all necessary monitoring checks for that host
  • 9. Example
       modules/monitoring/manifests/init.pp:
           class monitoring {
             service { "nrpe": ensure => running, enable => true }
             @@nagios_host { "$hostname":
               address => $ip
             }
             @@nagios_service { "NRPE $hostname":
               service_description => "NRPE",
               check_command       => "check_nrpe_scripts",
             }
           }
       Appending $hostname in the nagios_service definition prevents duplicate definitions on monitoring hosts
  • 10. Example Nginx
       Automatically create HTTP checks when including Nginx
       modules/nginx/manifests/init.pp:
           class nginx {
             service { "nginx": ensure => running, enable => true }
             @@nagios_service { "HTTP $hostname":
               service_description => "HTTP",
               check_command       => "check_http",
               event_handler       => "service_restart!nginx",
               contact_groups      => "admins_email, admins_sms"
             }
           }
  • 11. Predefining and distributing
       manifests/defines.pp:
           $__notifications_enabled = $systemstatus ? {
             operational => "1",
             fail        => "0"
           }
           Nagios_host {
             ensure                => present,
             host_name             => "$hostname.$domain",
             hostgroups            => $role,
             use                   => "generic-host", #our standard template
             alias                 => $hostname,
             notifications_enabled => $__notifications_enabled,
             target                => "/etc/icinga/puppetgenerated/hosts/$hostname.cfg",
             notes                 => $monitoringhost
           }
           Nagios_service {
             ensure                => present,
             host_name             => "$hostname.$domain",
             use                   => "generic-service", #our standard template
             notifications_enabled => $__notifications_enabled,
             target                => "/etc/icinga/puppetgenerated/services/$hostname.cfg",
             notes                 => $monitoringhost
           }
  • 12. Retrieving exported resources
       modules/icinga/manifests/init.pp:
           class icingacollect {
             Nagios_host <<| notes == "$hostname" |>> {
               require => File["/etc/icinga/puppetgenerated/hosts"]
             }
             Nagios_service <<| notes == "$hostname" |>> {
               require => File["/etc/icinga/puppetgenerated/services"]
             }
           }
  • 13. Why not Tags?
       Using “notes” to assign monitoring host
       Tagging caused problems when setting require in Nagios_host and Nagios_service
       Tagging meant redefining, it’s not inherited
       Solution: stages (?)
  • 14. Fail-safes
       modules/icinga/manifests/init.pp:
           class icinga {
             include icingacollect
             exec { "verify new cfg":
               command => "/usr/bin/icinga -v /etc/icinga/verify-puppetgenerated.cfg",
               require => Class["icingacollect"]
             }
             exec { "mv cfgs":
               command => "rm -rf /etc/icinga/puppet/* ; mv /etc/icinga/puppetgenerated/* /etc/icinga/puppet/",
               require => Exec["verify new cfg"]
             }
             exec { "restart icinga":
               command => "/usr/bin/printf '[] RESTART_PROGRAM\n' > /var/icinga/rw/icinga.cmd",
               require => [ Exec["mv cfgs"], Service["icinga"] ]
             }
           }
  • 15. Deploying monitoring
       Deploy script starts Puppet run on all monitoring hosts
       Threaded, with a small sleep between starts to prevent a thundering herd on the Puppet masters
       Waits for all Puppet runs to finish and reports whether they were successful or not
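
The deploy script itself is not shown in the deck. As a rough sketch of the pattern described on this slide (one thread per monitoring host, a short pause between starts, then a wait and a per-host success report), something along these lines could work. The `deploy` helper, the `stagger` parameter, the host names, and the runner block are hypothetical; a real runner would trigger the Puppet run on the host and return whether it succeeded.

```ruby
# Start one thread per monitoring host, pausing briefly between starts so
# the Puppet masters are not hit by every run at once. Then wait for all
# threads and return a hash of host => success flag.
def deploy(hosts, stagger: 0.5, &run)
  threads = hosts.map do |host|
    thread = Thread.new do
      ok = begin
        run.call(host) ? true : false
      rescue StandardError
        false # a runner that raises counts as a failed run
      end
      [host, ok]
    end
    sleep(stagger) # small delay before starting the next run
    thread
  end
  threads.map(&:value).to_h # Thread#value joins and returns the block result
end

# Hypothetical runner: "mon2" simulates a failed Puppet run.
results = deploy(%w[mon1 mon2 mon3], stagger: 0.01) { |host| host != "mon2" }
results.each { |host, ok| puts "#{host}: #{ok ? 'ok' : 'FAILED'}" }
```

Sleeping in the starting loop rather than inside each thread keeps the start times evenly spaced regardless of how long individual runs take.
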
  • 16. Downsides
       Puppet run on Icinga hosts takes about 20 minutes (using separate config files for each host helps)
       Modifying a servicecheck requires a Puppet run on all hosts with that servicecheck (solution: use --noop)
       Cleaning up old resources
  • 17. Cleaning up
           fqdn="$host_to_be_removed.$domain"
           puppet apply --certname $fqdn --node_name facter --thin_storeconfigs $dbsettings \
             --execute 'resources { ["nagios_service","nagios_host"]: purge => true }'
  • 18. What if something isn’t running Puppet?
       Configcheck check
       Compares management database with Icinga API
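
The Configcheck check is only described, not shown. Assuming both sources can be reduced to a list of host names, the comparison comes down to two set differences. The function name, the result keys, and the sample host names below are hypothetical; a real check would fetch one list from the management database and the other from the Icinga API.

```ruby
require 'set'

# Compare the hosts the management database says should exist with the
# hosts Icinga actually knows about.
#   :missing => in the database but not monitored (probably not running Puppet)
#   :stale   => still monitored but no longer in the database
def config_check(expected_hosts, monitored_hosts)
  expected  = expected_hosts.to_set
  monitored = monitored_hosts.to_set
  {
    missing: (expected - monitored).sort,
    stale:   (monitored - expected).sort
  }
end

report = config_check(%w[web1 web2 db1], %w[web1 db1 old9])
puts "not monitored: #{report[:missing].join(', ')}"
puts "stale in Icinga: #{report[:stale].join(', ')}"
```
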
  • 19. Other cool stuff
       Generating daemon checks
       modules/role/lib/facter/customfacters.rb:
           Facter.add("hyves_daemons") do
             daemons = ["None"]
             daemonarray = []
             if File::exists?( "/<path_to_config>/daemons.conf" )
               daemons = []
               daemonconf = %x{grep name /<path_to_config>/daemons.conf}
               daemonconf.each_line do |daemon|
                 daemon.sub!(/.*\* name:/, '')
                 daemonarray.push(daemon.chomp)
               end
             end
             setcode do
               daemonarray.uniq
             end
           end
  • 20. Other cool stuff
       Generating daemon checks
       modules/daemons/manifests/init.pp:
           class daemons {
             define add_daemon_check {
               @@nagios_service { "$name Daemon $hostname":
                 use                 => "Daemon-check",
                 service_description => "$name Daemon",
                 check_command       => "check_daemon!$name"
               }
             }
             add_daemon_check { $hyves_daemons: }
           }
  • 21. Other cool stuff
       Generating overview daemon checks
           require 'net/http'
           module Puppet::Parser::Functions
             newfunction(:get_daemons, :type => :rvalue, :doc => "
               This function returns an array of all current hyves_autodaemons, based on the Icinga API
             ") do |args|
               domain = "<domain_of_icinga_web>"
               url = "/icinga-web/web/api/service/filter[AND(SERVICE_NAME%7Clike%7C*Daemon)]/columns[SERVICE_NAME]/order[SERVICE_NAME;ASC]/authkey=<api_key>/json"
               response = Net::HTTP.get_response(domain, url)
               data = response.body
               results = PSON.parse(data)
               daemons = Array.new
               results.each { |result|
                 daemon = result['SERVICE_NAME']
                 daemon.sub!(/ Daemon/, '')
                 daemons << daemon
               }
               daemons.uniq
             end
           end
  • 22. Other cool stuff
       Generating overview daemon checks
       modules/icinga/manifests/noc.pp:
           $__daemons = get_daemons()
           templatefile { "/etc/icinga/puppetgenerated/other/daemons.cfg":
             template => template("icinga/daemons.cfg.erb")
           }
       hyvesdaemons.cfg.erb:
           define host{
             use                  generic-host
             host_name            daemons
             alias                daemons
             address              www.hyves.nl
           }
           <% __daemons.each do |daemon| -%>
           define service{
             use                  DaemonOverview-check
             host_name            daemons
             service_description  <%= daemon %>
           }
           <% end -%>
  • 23. The End
       Questions? Remarks? Ideas?
       Jeffrey Lensen | System Engineer | jeffrey@hyves.nl