Alexander Naydenko - Nagios to Zabbix Migration | ZabConf2016


  1. How we learned to stop worrying and love Zabbix: Nagios->Zabbix migration
  2. Some background info on our monitoring ecosystem
  3. How we evolved
      • 5 companies, < 100 hardware items: various tools (FPinger, scripts)
      • 30 companies, 10 engineers, <= 1000 hardware items (Linux/Windows/Solaris/*BSD/Cisco/…/…): Nagios (NSCA/NRPE/NagiosGraph/NagVis) and MOM
      • 100+ companies, 20+ engineers, >= 5000 hardware items (Linux/Windows/Solaris/*BSD/Cisco/VoIP/Security): Centreon
  4. Why Centreon?
      • ACL
      • Distributed monitoring
      • Nagios backend and plugins
      • Not so ugly
      • Web configuration
  5. Things to monitor
      • *nix and Windows (50-100+ sensors per host)
      • VoIP infrastructure (H/W, S/W, trunks and calls)
      • Databases and applications
      • Network equipment (interfaces, connectivity, etc.)
      • Supplementary systems (cameras, temperature sensors, UPSes, etc.)
  6. How it worked
      • Monitoring automation
      • Server infrastructure
  7. Problems
      • Management
      • Not enough information and features
      • Performance problems
      • NSCA/NRDP limitations
      • Need to use a lot of active checks
      • Script performance
      • Basic agents (NRPE/NCPA)
      • WMI check performance
      • Nagios.cfg limitations
      • Broker management issues (SSH + scripts)
      • Not enough SLA
      • Too many different services to integrate together (brokers, processors, scripts, etc.)
  8. Monitoring management problems
  9. Plain Nagios architecture
  10. Extended Nagios architecture (Nagios XI)
  11. My case (Centreon)
  12. + CLAPI + NagVis + Puppet + home-made scripts + hacks + many more
  13. Monitoring management problems in detail
      • Synchronization issues
      • Performance data shipping issues
      • Any configuration change => config rebuild + engine reload
      • Data is distributed across different files and databases
      • Interaction is based on sockets, files and scripts
  14. Automation problems in action
  15. Agents and checks
  16. Nagios doesn't decide whether a check result is OK or not. The script does.
  17. Nagios doesn't decide whether a check result is OK or not. The script does.
      • Nothing is built in
      • Only scripts
      • No proper authentication
      • No decent agents
      • History is tracked in per-check files
  18. Performance
      • Interpreted languages (Perl, Python, bash, whatever...)
      • Exit codes
  19. NOT ENOUGH SLA
      • CRITICAL, WARNING, OK, FLAPPING
  20. UI
  21. UI
  22. We decided to give Zabbix a try
  23. We decided to give Zabbix a try
  24. Less than a day to set up
  25. Less than a day to set up a distributed installation
  26. Less than a week to set up most of the sensors
  27. It took less than two months to deploy it on 90% of the monitored infrastructures
      • 1218 servers
      • 519 network devices
      • Over 100k items
  28. Migration process
      • Deploy: Puppet + SQL scripts
      • Checks: Google and GitHub
      • Built-in wmi.get!
      • Built-in LLD!
      • Built-in inventory!
      • TLS authentication!
      • JSON API!
      • A lot more
  29. How it works now
      • Monitoring automation
      • Server infrastructure
  30. Comparison
  31. What we like
      • No more hassle adding new hosts
      • Much more information
      • Finally decent reports
      • Better performance
      • Three times fewer resources required
  32. Zabbix is friendly :)
  33. Questions?
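
Slides 16 to 18 point out that under Nagios the check script, not the server, decides whether a result is OK, and that it reports the verdict through its exit code. The following is a minimal sketch of that convention in Python. The disk-usage check, the `evaluate` helper and the 80/90 thresholds are illustrative assumptions, not something shown in the talk; the exit codes 0 to 3 follow the standard Nagios plugin convention.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn=80.0, crit=90.0):
    """Map a measured value to an (exit_code, status_line) pair.

    The thresholds live in the script: this is the "script decides" model.
    """
    if used_pct >= crit:
        code = CRITICAL
    elif used_pct >= warn:
        code = WARNING
    else:
        code = OK
    # Text before '|' is for humans; text after it is performance data.
    line = (f"DISK {STATE_NAMES[code]} - {used_pct:.1f}% used"
            f" | used_pct={used_pct:.1f}%;{warn:g};{crit:g}")
    return code, line


def main(path="/"):
    """Measure disk usage for `path`, print one status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, line = evaluate(100.0 * usage.used / usage.total)
    print(line)
    return code

# Run as a plugin, the script would finish with: import sys; sys.exit(main())
```

Every such check starts an interpreter, measures, compares and exits, which is the per-check overhead that slide 18 attributes to interpreted languages and exit codes.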

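
Slide 28 lists the JSON API among the built-in features, and slide 31 mentions that adding new hosts is no longer a hassle. As a rough illustration of how the two can relate, the sketch below builds a JSON-RPC 2.0 request body for the Zabbix `host.create` method. The host name, IP address, group ID, template ID and token are hypothetical placeholders, and the sketch assumes the token is passed in the `auth` field of the request body, as in Zabbix releases of that period. The talk does not show the authors' own tooling, so this is not their implementation.

```python
import json


def host_create_request(host, ip, group_id, template_id, auth_token, request_id=1):
    """Build a JSON-RPC 2.0 body for the Zabbix `host.create` method."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host,
            "interfaces": [{
                "type": 1,       # 1 = Zabbix agent interface
                "main": 1,       # default interface of this type
                "useip": 1,      # connect by IP rather than DNS name
                "ip": ip,
                "dns": "",
                "port": "10050",
            }],
            "groups": [{"groupid": group_id}],
            "templates": [{"templateid": template_id}],
        },
        "auth": auth_token,      # assumption: token carried in the body
        "id": request_id,
    }


# All argument values below are placeholders for illustration only.
body = host_create_request("web01.example.com", "192.0.2.10", "2", "10001", "<token>")
payload = json.dumps(body, indent=2)
print(payload)
```

The resulting payload would be sent with an HTTP POST to the frontend's `api_jsonrpc.php` endpoint; the sketch stops short of sending it so that it runs without a Zabbix server.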