What to do when Nagios notifications
don't meet your needs?
You Push It
Background
Career Start
Intel - ASCI Red Supercomputer
• 1st TeraFlops Supercomputer
• 102 Cabinets - Drive & Compute clusters
• 4,536 Nodes
• 9,216 Processors (Pentium Pros)
• 9,216 Cores
• 1600 Square Feet
Currently
NCAR - Yellowstone Computer
• 2012: 13th with 1.5 PetaFlops, Now 50th
• 94 Cabinets - 74 Compute & 10 Drive clusters
• 4,542 Nodes
• 9,036 Processors (Intel Xeon E5-2670)
• 72,288 Cores
• 2,000 Square Feet
Nagios Configuration
Primary Instance
• Hosts - 1289
• Services - 3235
Total Instances
• Hosts - 1410
• Services - 3867
Test Instance
• Hosts - 20,007
• Services - 40,045
• Passive Results from scripts
Primary Instance
• 4 Check_MK Monitored Servers
• 5 Remote Servers sending Passive Results
• 4 Sites being Monitored
Normal Load < 1 with 5 instances running.
Load with Test running < 4
Using OMD 1.2 (Nagios 3.5, Check_MK 1.2.4p5, Thruk 1.84-6, PNP4Nagios 0.6.24)
Nagios Notification Configuration
Host / Service
• notification_period
– 24x7
– workhours
• contact_groups
Contact
• service_notification_period
– 24x7
– workhours
• host_notification_period
– 24x7
– workhours
• service_notification_options
– w,u,c,r,f
• host_notification_options
– d,u,r
Standard Work Week
Simple distinction between work and home.
Non-Standard Rotating Work Week
Complex and Every Week is Different.
Since we have 24x7 coverage,
why did we want notifications?
We are not always in our Operations Center at Night
• Doing nightly Visual Inspections
• Replacing hardware in the Supercomputer
• Working with facilities
• Talking with Security
• Eating a meal in our Kitchen
• Watching fireworks with facilities
• ...
Our initial Failure
No Sound from iPad Web or Apps
What We Needed
• Interface to Nagios Data
• Something to Parse for Unacknowledged Alerts
• Something to send out Notifications
• Program to give us our alerts on our Mobile Devices
Interface to Nagios Data
Check_MK Livestatus
• Nagios Broker Module
• Written by Mathias Kettner
• Direct Connection to Nagios through a UNIX Socket
• No Database to administer
• No Configuration needed
• Single line needs to be added to nagios.cfg
• Access it from the shell with unixcat
• Uses Livestatus Query Language
• http://mathias-kettner.com/checkmk_livestatus.html
Example:
root@linux# echo 'GET hosts' | unixcat /var/lib/nagios/rw/live
acknowledged;action_url;address;alias;check_command;check_period;checks_enabled;contacts;in_check_period;in_notification_period;is_flapping;last_check;last_state_change;name;notes;notes_url;notification_period;scheduled_downtime_depth;state;total_services
0;/nagios/pnp/index.php?host=$HOSTNAME$;127.0.0.1;Acht;check-mk-ping;;1;check_mk,hh;1;1;0;1256194120;1255301430;Acht;;;24X7;0;0;7
0;/nagios/pnp/index.php?host=$HOSTNAME$;127.0.0.1;DREI;check-mk-ping;;1;check_mk,hh;1;1;0;1256194120;1255301431;DREI;;;24X7;0;0;1
0;/nagios/pnp/index.php?host=$HOSTNAME$;127.0.0.1;Drei;check-mk-ping;;1;check_mk,hh;1;1;0;1256194120;1255301435;Drei;;;24X7;0;0;4
Something to Parse - Livestatus
LQL Queries
• “GET” and name of Table
• Arbitrary number of header lines consisting of a keyword, a colon and arguments.
• Empty line or ‘End of Transmission’
Tables
hosts services hostgroups
contacts commands servicegroups
log timeperiods contactgroups
status downtimes hostsbygroup
columns statehist comments
servicesbygroup servicesbyhostgroup
Columns
Columns: <list of column names to return in order>
Filters
Filter: <column name> <operator> <value>
Operators: =, ~, =~, ~~, <, >, <=, >=, !=, !~, !=~, !~~
Values: number, text
Combining filters
Or: <number of preceding filters to combine>
And: <number of preceding filters to combine>
Negate:
Others - Counting, Sums, Max, Min, Std Dev, and more
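The query layout above (a GET line, then one header per line) can be sketched as a small shell helper. This is only an illustration: `build_lql` is a hypothetical function name, not part of Livestatus, and the socket path in the comment is the one from the earlier example and will differ per installation.

```shell
#!/bin/bash
# build_lql (hypothetical helper): print an LQL query made of a GET line,
# a Columns header, and one Filter header per remaining argument.
build_lql() {
  local table=$1 columns=$2
  shift 2
  printf 'GET %s\n' "$table"
  printf 'Columns: %s\n' "$columns"
  local f
  for f in "$@"; do
    printf 'Filter: %s\n' "$f"
  done
}

# Unacknowledged host problems, printed so the query text can be inspected.
query=$(build_lql hosts "name state plugin_output" "state > 0" "acknowledged = 0")
echo "$query"

# To run it against Livestatus, pipe the text into unixcat, for example:
#   echo "$query" | unixcat /var/lib/nagios/rw/live
```

Keeping the query text in a function makes it easy to review the filters before anything is sent to the socket.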
Send out Notifications
Pushbullet
• Free
• Several APIs
– Android Extensions
– iPhone
– HTTP API
• https://docs.pushbullet.com
We're interested in the HTTP API; we are not
writing a custom mobile app.
HTTP API Calls
• Objects
– /v2/pushes
– /v2/devices
– /v2/contacts
– /v2/users/me
• Accounts
– /oauth2
And more API calls which we don’t use.
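A note push to /v2/pushes might look like the sketch below. `push_note` and the `PUSHBULLET_DRY_RUN` variable are assumptions made for this example, and the token is a placeholder. It passes the access token with curl's `--user` option, the same style the talk's script uses; check the current Pushbullet documentation for the preferred way to authenticate.

```shell
#!/bin/bash
# push_note (hypothetical wrapper): send a "note" push with a title and body.
# When PUSHBULLET_DRY_RUN is set, print the curl command instead of running it.
push_note() {
  local token=$1 title=$2 body=$3
  local -a cmd=(curl --silent --user "$token:"
    https://api.pushbullet.com/v2/pushes
    --data-urlencode type=note
    --data-urlencode "title=$title"
    --data-urlencode "body=$body")
  if [ -n "${PUSHBULLET_DRY_RUN:-}" ]; then
    printf '%s\n' "${cmd[@]}"   # one argument per line
  else
    "${cmd[@]}" > /dev/null
  fi
}

# Dry run with a placeholder token and a made-up alert; nothing is sent.
PUSHBULLET_DRY_RUN=1
push_note "EXAMPLE_TOKEN" "web01" "CRITICAL - host unreachable"
```

`--data-urlencode` is used here so that characters such as `&` or `=` in plugin output do not corrupt the form data.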
Deliver to our Mobile Devices
Our Solution
nagios_push.sh
#!/bin/bash
# Get the person's access code for Pushbullet
read -r AccessCode < "/home/$USER/PushBulletAccessCode"
# Query Nagios for host alerts and send them to Pushbullet.
# Spaces are turned into underscores so the for loop splits on lines, not words.
for i in $(/opt/omd/versions/1.00/bin/unixcat < /usr/local/sbin/PushBullet_query_hosts \
    /omd/sites/noc/tmp/run/live | tr ' ' '_' | cut -f1,2 -d';'); do
  curl -u "$AccessCode:" https://api.pushbullet.com/v2/pushes \
    -d type=note -d title="${i%;*}" -d body="${i#*;}" > /dev/null 2>&1
done
# Query Nagios for service alerts and send them to Pushbullet
for i in $(/opt/omd/versions/1.00/bin/unixcat < /usr/local/sbin/PushBullet_query_services \
    /omd/sites/noc/tmp/run/live | tr ' ' '_' | cut -f1,2 -d';'); do
  curl -u "$AccessCode:" https://api.pushbullet.com/v2/pushes \
    -d type=note -d title="${i%;*}" -d body="${i#*;}" > /dev/null 2>&1
done
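The script above replaces spaces with underscores so that the for loop does not split alerts into words. One possible alternative, shown here as a sketch rather than the author's method, reads the semicolon-separated output line by line. `format_alerts` is a hypothetical helper and the sample hosts are made up.

```shell
#!/bin/bash
# format_alerts (hypothetical helper): read "name;plugin_output;state" lines
# on stdin and print one "title<TAB>body" pair per alert, spaces intact.
format_alerts() {
  local name output state
  while IFS=';' read -r name output state; do
    [ -n "$name" ] || continue   # skip blank lines
    printf '%s\t%s\n' "$name" "$output"
  done
}

# Sample input standing in for the unixcat output.
printf '%s\n' \
  'web01;CRITICAL - Host Unreachable;1' \
  'db02;PING CRITICAL - Packet loss = 100%;1' |
  format_alerts
```

In a real script the printf line would be replaced by the curl call, with the name as the title and the plugin output as the body.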
PushBullet Command Files
/usr/local/sbin/PushBullet_query_hosts
GET hosts
Columns: name plugin_output state
Filter: state > 0
Filter: acknowledged = 0
Filter: host_scheduled_downtime_depth = 0
/usr/local/sbin/PushBullet_query_services
GET services
Columns: name plugin_output state
Filter: state > 0
Filter: acknowledged = 0
Filter: scheduled_downtime_depth = 0
Our Support Scripts
npush_on
#!/bin/bash
# Make sure it is not run as root
if [ "$UID" -eq 0 ]
then
  echo "Not to be run as root."
  exit 1
fi
if crontab -l | grep -q nagios_push.sh
then
  # Uncomment the crontab entry
  crontab -l | sed -e 's|^#\(\*/4 \* \* \* \* /usr/local/sbin/nagios_push\.sh\)|\1|' | crontab -
else
  # Append the entry to the crontab
  (crontab -l; echo "*/4 * * * * /usr/local/sbin/nagios_push.sh") | crontab -
fi
# Let the user know when npush will be turned off
hour=$(date +%H)
if [ "$hour" -lt 18 ] && [ "$hour" -ge 6 ]; then
  /usr/bin/at -f /usr/local/bin/npush_off 7pm
  echo "Turning off npush at 7 PM"
else
  /usr/bin/at -f /usr/local/bin/npush_off 7am
  echo "Turning off npush at 7 AM"
fi
npush_off
#!/bin/bash
# Comment out the crontab entry
crontab -l |
  sed -e 's|^\(\*/4 \* \* \* \* /usr/local/sbin/nagios_push\.sh\)|#\1|' |
  crontab -
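The on/off toggle can also be written without regular expressions, which avoids escaping the `*` and `/` characters in the crontab entry. The sketch below is an assumption-laden illustration: `comment_entry` and `uncomment_entry` are hypothetical names, and both work as filters on stdin so they can be tried without touching a real crontab.

```shell
#!/bin/bash
# The crontab entry used by npush_on and npush_off.
ENTRY='*/4 * * * * /usr/local/sbin/nagios_push.sh'

# comment_entry (hypothetical): prefix the entry with '#', pass other lines through.
comment_entry() {
  local line
  while IFS= read -r line; do
    if [ "$line" = "$ENTRY" ]; then
      printf '#%s\n' "$line"
    else
      printf '%s\n' "$line"
    fi
  done
}

# uncomment_entry (hypothetical): restore the entry if it is commented out.
uncomment_entry() {
  local line
  while IFS= read -r line; do
    if [ "$line" = "#$ENTRY" ]; then
      printf '%s\n' "$ENTRY"
    else
      printf '%s\n' "$line"
    fi
  done
}

# Round trip on a sample crontab (the MAILTO line is made up).
printf '%s\n' 'MAILTO=ops' "$ENTRY" | comment_entry | uncomment_entry
```

Against a real crontab the filters would sit in a pipeline such as `crontab -l | comment_entry | crontab -`.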
Future Upgrades
• Read Google Calendar for our schedule, no more remembering to turn it on.
• Send email alerts to PushBullet. (Without false alerts)
• Remove the Crontab line, instead of commenting it out.
• Anything else we can think of.
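Removing the crontab line instead of commenting it out could be sketched as follows. This is one possible approach, not something shown in the talk: `remove_entry` is a hypothetical helper that drops both the active and the commented form of the entry and passes every other line through.

```shell
#!/bin/bash
# The crontab entry added by npush_on.
ENTRY='*/4 * * * * /usr/local/sbin/nagios_push.sh'

# remove_entry (hypothetical): filter stdin, dropping exact matches of the
# entry and of its commented-out form. -F treats the patterns as fixed
# strings, -x matches whole lines only, -v inverts the match.
remove_entry() {
  grep -vxF -e "$ENTRY" -e "#$ENTRY" || true   # grep exits 1 when nothing is left
}

# Sample crontab with a made-up MAILTO line; only that line survives.
printf '%s\n' 'MAILTO=ops' "$ENTRY" | remove_entry
```

With a real crontab the pipeline would be `crontab -l | remove_entry | crontab -`.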
Questions

Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
