- Did you ever want to automate server maintenance?
But you couldn’t, because you know bad things will happen if the server isn’t in the right state, and there is no way to find out its status?
- Do you hate logging into your ticket system just to see what’s broken?
- Do you hate being unable to quickly verify if your fixes worked?
- Did you ever wish you could simply check “is the server I’m working on OK?”
And do you think all this should be possible without interrupting your actual work? After all, if this weren’t a critical server that needs your full concentration, there would be an imp doing it instead of you. Yes?
You know this nagging feeling that says:
This should really be easier.
How about doing all that and some more from now on?
Add this to your scripts to fetch the most current server status:
$ ./cmkstatus || echo "cannot do maintenance, server is unhappy"
cannot do maintenance, server is unhappy
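The same exit-code check can guard a whole maintenance step. What follows is only a minimal sketch, assuming (as the one-liner above implies) that cmkstatus exits non-zero when the server is unhappy. The `run_if_healthy` function and the `CMKSTATUS` variable are hypothetical names made up for this illustration; they are not part of cmkstatus.

```shell
#!/bin/sh
# Sketch: run a maintenance command only when the status check succeeds.
# CMKSTATUS is a hypothetical override; it defaults to the ./cmkstatus call used above.
CMKSTATUS="${CMKSTATUS:-./cmkstatus}"

# run_if_healthy CMD [ARGS...]
# Runs CMD only if the status check exits 0.
# Otherwise prints a message to stderr and returns 1.
run_if_healthy() {
    if "$CMKSTATUS" >/dev/null 2>&1; then
        "$@"
    else
        echo "cannot do maintenance, server is unhappy" >&2
        return 1
    fi
}
```

In a script you would then write something like `run_if_healthy ./patch-server.sh`, where `patch-server.sh` stands in for whatever maintenance job you want to protect.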
Use “total” to get the total counts for the server status, then match them against the number your ticket system has.
$ ./cmkstatus total
View all the current issues!
$ ./cmkstatus show
Database Error UNKNOWN
Database mysql UNKNOWN
Database status UNKNOWN
LOG /var/log/mail.warn WARN
LOG /var/log/messages WARN
MySQL Status Slow_queries CRITICAL
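If you want a quick tally from that listing, the lines can be grouped by their state. This is a rough sketch, assuming the state is always the last whitespace-separated field, as in the output above; `count_by_state` is a hypothetical helper, not a cmkstatus subcommand.

```shell
#!/bin/sh
# count_by_state: read `cmkstatus show`-style lines on stdin and print
# one "STATE COUNT" line per state, sorted by state name.
# Assumption: the state is the last field on each line.
count_by_state() {
    awk 'NF { count[$NF]++ } END { for (s in count) print s, count[s] }' | sort
}
```

Called as `./cmkstatus show | count_by_state`, the listing above would come out as `CRITICAL 1`, `UNKNOWN 3` and `WARN 2`, one per line.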
More to come: how about a cmkstatus recheck, and a cmkstatus maintenance that creates a downtime from now until 10 minutes later, plus 20 minutes flex?
Also, how about running this against special service groups that show just a server’s Unix-level status? That way there are no alerts for an application that you stopped, but there is an alert for the /var filesystem that will burst if you patch the server now.
cmkstatus is dedicated to my old offshore team; they’ll recognize part of this. 🙂
check_mk allowed me to go a lot further with less effort. I remember it used to take up to 10 minutes to schedule a downtime on OVO; now the same thing takes a few milliseconds 🙂