3. What is your name?
Ops/DevOps
What is your quest?
A healthy and functioning server
environment
What is… your favourite time to
be woken up by an alert?
Erh.. Never.. Erh office hou
aaaaaaaaaah!
4. Example Checklist for monitoring vs alerting
1. What am I actually monitoring: a service, or a detail of
part of a service (hardware, connectivity, etc.)?
If the answer isn’t a service, then it might not be fit
for alerting
2. Will I get fired for not responding immediately to this?
(aka is it business critical)
If the answer to that is yes, maybe it’s fit for
alerting
3. Is what I want to check possibly covered by
another alert or monitoring check? E.g.
connectivity is already answered by a check that
requests / in the webroot, and that same check also
answers whether the webserver is properly configured
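
The three questions above can be sketched as a small decision helper. This is only an illustration: the names `Check` and `classify`, the three outcomes, and the rule that a check already covered elsewhere gets skipped are assumptions, not part of the checklist itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Check:
    """Answers to the three checklist questions for one proposed check (hypothetical type)."""
    name: str
    is_service: bool                  # question 1: a whole service, not a detail of one
    business_critical: bool           # question 2: must someone respond right away?
    covered_by: Optional[str] = None  # question 3: name of a check that already answers this


def classify(check: Check) -> str:
    """Return 'skip', 'monitor' or 'alert' for a proposed check."""
    # Assumption: a check that another check already covers adds nothing new.
    if check.covered_by is not None:
        return "skip"
    # Only a business-critical service is worth waking someone up for.
    if check.is_service and check.business_critical:
        return "alert"
    # Everything else is still worth recording, just not paging on.
    return "monitor"
```

For example, `classify(Check("webroot", True, True))` gives `"alert"`, while a free-disk-space check that is neither a service nor business critical ends up as `"monitor"`.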
5. And for the love of all that is compiling, keep all the monitoring and alerting in _ONE_
system!
(if you need/want to use more than one, at least aggregate the end result to _ONE_
system)
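
A rough sketch of what "aggregate the end result" could look like, assuming each monitoring system can be wrapped in a function that returns (check name, OK?) pairs. The function `aggregate` and the source names used below are made up for illustration; real systems would each need their own adapter.

```python
from typing import Callable, Dict, List, Tuple

Result = Tuple[str, bool]  # (check name, is it OK?)


def aggregate(sources: Dict[str, Callable[[], List[Result]]]) -> List[str]:
    """Collect results from every source and return all failing checks as one sorted list."""
    failing = []
    for source_name, fetch in sources.items():
        for check_name, ok in fetch():
            if not ok:
                # Prefix with the source so the single view still shows where it came from.
                failing.append(f"{source_name}/{check_name}")
    return sorted(failing)
```

With two hypothetical sources, `system_a` reporting `[("web", True), ("db", False)]` and `system_b` reporting `[("queue", False)]`, the one list to look at is `["system_a/db", "system_b/queue"]`.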
6. Arguing about how to monitor/alert, or what to do
it with, distracts from the fact that what you care
about is not that the monitoring is done, but what
information it gives you.
AKA
Even someone pressing F5 in their browser
to verify that the site is up is more
informative than: Server has 1337 MB RAM
free, / is 42% free and there is 1
(management ;) ) zombie process
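
The "pressing F5" check can be automated in a few lines. A minimal sketch using only the Python standard library; the function name `site_is_up`, the 5 second default timeout, and treating only 2xx answers as "up" are assumptions, so adjust them to what "up" means for your site.

```python
import http.client
import urllib.error
import urllib.request


def site_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on url ends in a 2xx answer within the timeout."""
    try:
        # urlopen follows redirects and raises HTTPError for 4xx/5xx answers.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, refused connection, timeout, error status or broken reply.
        return False
```

Calling `site_is_up("https://example.com/")` answers the one question the RAM, disk and zombie numbers do not: can a visitor actually get the page?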