Effective Monitoring for 
Demanding Operations 
Environments 
Rodrigue Chakode 
Nagios World Conference, Saint-Paul, MN, US 
2013-10-01
Background 
● Service: generic term for an IT function (e.g. the mysqld service) 
● Business Service/Process: a service that provides added value to 
business applications or to end users (e.g. a hosting service) 
● Check: a probe that detects the status of an IT service (e.g. a 
check for the mysqld service) 
● Abbreviations 
– BS: Business Service 
– BP: Business Process 
– BSM: Business Service Management 
– OSM: Open Source Monitoring 
– OSMS : Open Source Monitoring System/Software
Basic Monitoring Scheme 
Flat Display, no notion 
of business impact !
“Too many alarms kill alarm” 
S. Bortzmeyer
Today's IT infrastructures facts 
● Huge number of checks to handle 
– E.g. 100 hosts, 8 checks/host => 800 checks 
● False alerts are the bane of administrators 
– Not a matter of being a lazy admin 
No way for operators to be effective with a flat 
display!
Challenges for effective monitoring 
● How does a failure actually impact your business?
Is there a disruption of services? 
RAID 0 
(striping) 
RAID 1 
(mirroring)
“prioritize and orchestrate work based on business 
needs” http://www.bmc.com/solutions/bsm/
Go beyond individual checks 
● Think business services 
– A failure doesn't necessarily mean disruption of 
business applications or end-user services 
● Benefits of BSM 
– Reduce downtime by up to 75% 
– Deliver services up to 30% more efficiently 
– Credit: http://www.bmc.com/solutions/bsm/
Think relational services 
● A business service may depend on : 
– one or many IT services, and/or on 
– other business services 
– E.g. Streaming ← Web Server ← Databases ← 
Network ← Operating System ← Hardware 
Devices...
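
A minimal sketch of the dependency chain above, assuming each service simply lists what it directly depends on. The service names come from the example; the `DEPENDS_ON` map and the `impacted_by` helper are hypothetical and only show how a failure at one level identifies the business services that may be affected.

```python
# Hypothetical dependency map built from the example chain:
# each service lists the services it directly depends on.
DEPENDS_ON = {
    "Streaming": ["Web Server"],
    "Web Server": ["Databases"],
    "Databases": ["Network"],
    "Network": ["Operating System"],
    "Operating System": ["Hardware Devices"],
    "Hardware Devices": [],
}


def impacted_by(failed, depends_on):
    """Return every service that depends, directly or transitively, on `failed`."""
    impacted = set()
    changed = True
    while changed:  # repeat until no new service is added
        changed = False
        for service, deps in depends_on.items():
            if service in impacted:
                continue
            if any(dep == failed or dep in impacted for dep in deps):
                impacted.add(service)
                changed = True
    return impacted


if __name__ == "__main__":
    print(sorted(impacted_by("Network", DEPENDS_ON)))
```

Whether a listed service is actually disrupted still depends on the severity rules applied along the chain.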
Service hierarchy and mapping
A service map ISN'T 
a network map
Apply flexible incident management 
● Only select checks that impact your business 
services 
● Apply advanced severity calculation 
– Set how the severity of a node is computed from 
the severities of its children 
● And advanced status propagation rules 
– Set how the severity of a node is propagated to its 
parent
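
A minimal sketch of these two kinds of rules, assuming a four-level severity scale. The `Severity` levels, the function names and the rule names are illustrative assumptions, not the API of any tool mentioned in this talk.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumed severity scale, ordered from best to worst.
    NORMAL = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def aggregate_worst(children):
    """Calculation rule: the node takes the worst severity of its children."""
    return Severity(max(children, default=Severity.NORMAL))


def aggregate_average(children):
    """Calculation rule: the node takes the average severity, rounded up."""
    if not children:
        return Severity.NORMAL
    total = sum(children)
    return Severity((total + len(children) - 1) // len(children))


def propagate(severity, rule="unchanged"):
    """Propagation rule: adjust a node's severity before its parent sees it."""
    if rule == "decrease":
        return Severity(max(severity - 1, Severity.NORMAL))
    if rule == "increase":
        return Severity(min(severity + 1, Severity.CRITICAL))
    return Severity(severity)


if __name__ == "__main__":
    disks = [Severity.NORMAL, Severity.CRITICAL]  # one of two disks has failed
    print("RAID 0 (striping): ", aggregate_worst(disks).name)
    print("RAID 1 (mirroring):", aggregate_average(disks).name)
```

With one failed disk out of two, the worst-case rule marks a striped array as critical, while the average rule only degrades a mirrored array, which matches the idea that not every failure is a service disruption.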
Use cases 
● RAID 0 ● RAID 1 
● Redundant databases ● Merchant-site
Specialize your Operations Dashboards 
● Business service-centric/competency-centric 
● Deal with large/demanding environments 
– Just collect what is useful for each dashboard 
● Get insight in one shot
“takes the IT you already have, and adds to it 
the visibility and control of a unified platform” 
http://www.bmc.com/
Existing options 
● Basic features 
– Nagios BP Add-on, Shinken Business Rules 
– No service map, basic aggregation rules 
– Handling a huge number of services can be tricky
RealOpInsight 
● Powerful Dashboard Toolkit for BSM 
– Generic and versatile add-on supporting many OSM 
tools 
● Qt-based GUI application 
– Powerful and friendly interfaces 
– Cross platform (Linux, Windows, Mac OS X) 
● http://realopinsight.com 
“small and efficient and gets the job done” 
lukaswhite, SourceForge.net
Some Features 
● Effective Operations Management 
– Prioritize incidents based on business impact 
● Advanced customizable event processing rules 
– avg, high impact, decrease, increase... 
● Distributed monitoring made easy 
– Versatile, supports up to 10 monitoring backends simultaneously 
● Free, Open Source and Cross-platform 
– Windows, Linux, OS X 
● More comprehensive messages 
– e.g. “the CPU load on server <IP/hostname> is more than <threshold> 
percent” 
● System Tray Notifications
Tree View, Map and Events in one 
Console 
Service Tree 
● Tooltips 
● Focus 
● Service-related 
message 
filtering... 
Service Mapping 
● Tooltips, Zooming, Dragging 
and Scrolling, Focus, Service-related 
message filtering... 
Message Console 
● Trouble view filtering, Large 
font mode
Advanced Incident Management 
● Severity 
aggregation 
● Severity increasing 
● Severity decreasing 
● ...
Simple and Efficient Design 
● Service Views as XML files 
● Native WYSIWYG Editor 
● Dynamic Operations 
Console 
● Simple Integration
Distributed Monitoring/Unified Dashboard 
● Loosely-coupled scalable architecture 
– Status data retrieved through RPC APIs
Ngrt4nd-based Integration - How To 
● Specific daemon on Nagios server 
– See documentation 
● Relies on status.dat 
● ZeroMQ-based RPC APIs 
– Authenticated data retrieval 
● Not recommended 
– Not scalable, delayed status data
Livestatus-based Integration - How To 
● Xinetd TCP-based RPC over a native UNIX 
socket 
– Xinetd socket over the Livestatus NEB socket 
– /etc/xinetd.d/livestatus 
● Restart Xinetd 
– /etc/init.d/xinetd restart 
● Recommended 
– NEB, scalable, up-to-date data
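
Once the Livestatus socket is exposed over TCP, any client can query it with the Livestatus Query Language. The sketch below is one way to do that from Python; the host name and port 6557 in the usage part are assumptions to adapt to your own xinetd setup, and the helper names are hypothetical.

```python
import json
import socket


def build_query(table, columns):
    """Build a Livestatus query asking for the given columns as JSON."""
    return (
        f"GET {table}\n"
        f"Columns: {' '.join(columns)}\n"
        "OutputFormat: json\n"
        "\n"  # a blank line terminates the query
    )


def parse_rows(raw, columns):
    """Turn the JSON list-of-lists answer into a list of dicts."""
    return [dict(zip(columns, row)) for row in json.loads(raw)]


def fetch(host, port, table, columns):
    """Send one query over TCP and return the parsed rows."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_query(table, columns).encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the query is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_rows(b"".join(chunks).decode("utf-8"), columns)


if __name__ == "__main__":
    # Assumed address of the xinetd-exposed Livestatus socket.
    cols = ["host_name", "description", "state"]
    for row in fetch("nagios.example.com", 6557, "services", cols):
        print(row)
```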
Source Settings 
Ngrt4nd 
– Monitor Web URL (optional) 
– Auth String 
– Server address 
– Listening port (1983 by default) 
– “Use Livestatus” must be disabled 
Livestatus 
– Monitor Web URL (optional) 
– Server address 
– Listening port 
– “Use Livestatus” must be enabled
Getting started in 3 steps 
● Run the Editor 
… and edit your service view configuration 
● Run the Configuration Manager 
… and set the access to the remote API 
● Run the Operations Console 
… and load the configuration file 
● Then fall in love!
Integration with Nagios 
Service in Nagios 
Service selection in RealOpInsight 
[SourceId:]host_name[/service_description] 
Set sources and API access 
ngrt4nd/Livestatus
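
A minimal sketch of how a service identifier with an optional source prefix and an optional service description could be split into its parts. The function name and the `Source0` default are hypothetical; they are not taken from RealOpInsight itself.

```python
def parse_service_id(text, default_source="Source0"):
    """Split '[SourceId:]host_name[/service_description]' into its parts.

    `default_source` is an assumed fallback used when no source prefix is given.
    Returns a (source, host, service) tuple; service is None for host-only IDs.
    """
    # Split off the service description first so that a colon inside the
    # description is not mistaken for the source separator.
    head, slash, service = text.partition("/")
    source, colon, host = head.partition(":")
    if not colon:
        source, host = default_source, head
    return source, host, (service if slash else None)


if __name__ == "__main__":
    for item in ("Source1:db01/mysqld", "db01/mysqld", "db01"):
        print(item, "->", parse_service_id(item))
```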
History: Experience Feedback 1/2 
● 2008 : the Idea 
● May 2010 : 1st lines of code 
● March 2011 (1st release, 1.0) 
– <30 downloads a month 
● May - August 2012 (version 2.0) 
– New architecture, GPLv3 License 
– SourceForge.net, Nagios Exchange 
– Windows Installer 
– 200 downloads a month
History: Experience Feedback 2/2 
● December 2012 (v2.1) 
– Continuous packaging for openSUSE, Fedora and Ubuntu 
● March 2013 (v2.2) 
– 600 downloads a month 
● May 2013 (v2.3) 
– Support for Livestatus API 
● July - September 2013 
– Nagios Affiliate 
– v2.4, adding support for distributed environments 
● Today 
– 7k+ downloads from 120+ countries
And the story continues..., Thanks 
● Web Edition (2014) 
@realopinsight