*sigh* Today I gave my first talk to a larger audience, and you bet that made a difference. Without any routine it felt quite horrible: I couldn’t reliably remember the slide order (not that it would’ve helped, since OOo messed them up anyway) and I even went over my time. Well anyway, I’ll put up a braindump of the questions that came up and answer them as a little FAQ.
Q: How to influence the interface check service names (aka items)
A: There are 3 options, which are defined globally. For that reason I suggest you put them in a global config file and not, e.g., next to the switches themselves.
- if_inventory_uses_name (this is the default)
These options take effect at inventory time, since that’s when a check item is initially created. That means you need to fully re-inventory a switch if you change them.
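As a rough sketch of what "put them in a global config file" could look like: the snippet below sets one of the options as a plain Python assignment, the way Check_MK config files are usually written. The file name (main.mk) and the boolean form are my assumptions here, so check them against your version before copying.

```python
# Sketch of a global setting, e.g. in main.mk (assumed location).
# Assumption: the option is a simple boolean switch.
# It only affects items created at inventory time, so already
# inventorized switches keep their old item names until re-inventory.
if_inventory_uses_description = True
```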
Q: Well, but what if I’ve got different switches and they need different inventory_uses options?
A: This will bite you – the last one will probably take precedence. BUT in fact, if you switch to if_inventory_uses_description and no description is set, this field usually defaults to a name like “GigabitEthernet 1/0/1” anyway. So you might be happy regardless.
But the problem has come up a few times with e.g. “real Cisco” and “Cisco Linksys” switches next to each other, so maybe it will be sorted out.
Q: Parenting – how about L2 devices
A: No CDP or similar is supported, but I suspect there’s a major difference between L2 parents and L3 parents, with the L3 routed version being more random. I didn’t think of EoMPLS and such when the question was asked, so in fact this might be more of an issue than I thought. I was mostly thinking along the lines of L2 switches that aren’t visible, but well known. These you can just define in a second file that, unlike Parents.mk, isn’t built dynamically.
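To illustrate the "second file" idea, here is a minimal sketch of a hand-maintained parents file. All host names are made up, and the `(parent, [children])` tuple form is how I remember the `parents` list working, so treat it as an assumption and compare it with what your generated file contains.

```python
# Hypothetical hand-maintained file that the parent scan never overwrites.
# Reuse the existing list if the config loader already defined one,
# otherwise start empty so this sketch also runs on its own.
parents = globals().get("parents", [])

parents += [
    # (parent host, [child hosts]) -- all names below are invented examples
    ("sw-l2-rack01", ["srv-web01", "srv-web02"]),
    ("sw-l2-rack02", ["srv-db01"]),
]
```

The point of keeping this separate is simply that a rescan can rewrite the generated file without touching the well-known L2 switches you entered by hand.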
Personal note: Just use --scan-parents some more and we’ll see.
- it works very well, flawless for some users
- it covers 95% of the manual work
- it might just do what you need
- the additions you might need won’t show up unless you test it, find out if you really need them and then send-patch or have us work on it :>
Q: Parenting – is there anything similar to the old-style Nagios host map in the Multisite GUI?
A: Nope, and tbh I only use the host map to visually verify the scan result. I want to know that the core will correctly identify blocking outages or “unreachable” states, and beyond that I don’t care. I don’t see a “monitoring admin” need for a better view of that data. From an operations perspective, hell, of course this would be awesome – but:
For visual representation, I think NagVis is so much better that there’s no point messing around with replacements. Besides, how about NagVis’ automaps? Also, if you’re interested in more magic on that side of things (e.g. painting driven by a DB or DNS TXT records), just contact Lars. :>
Q: Monitoring Standards – how about SMI-S for vendor neutral SAN device management
A: I haven’t really seen much adoption myself (unfortunately) and really feel that this standard won’t come into play for monitoring. Even if some vendor tools (HP) seem to use it via WBEM, this doesn’t sound clean or anything. It would surely be possible (given unlimited time and resources 🙂) – but feasible? Not so much.
Q: Monitoring Storage: I’ve got an MSA1000 and it seems very odd to monitor / we’ve got no idea how to monitor it (there were multiple users of that old thingy)
A: I looked through the Nagios portal – it seems that the environment monitoring is done via SNMP (or WBEM, see above), and the storage side of things is done in-band via SCSI/FC. The good side is that it’s using the normal HP SmartArray interface and can be monitored using hpacucli (not hpasmcli as some posts suggest).
So please, don’t waste your time looking for LUNs via SNMP, use hpacucli. If you need any modifications for that or want more H/W monitoring, then contact us, but most importantly, go with the most native tool, so start out with hpasmcli (not HP SIM lol)
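For the curious, here’s a minimal sketch of how hpacucli output could be turned into a Nagios-style result. The command is the usual `hpacucli ctrl all show status`; the sample text, the function names and the "every line ending in Status should say OK" rule are my own assumptions for illustration, not captured from a real MSA1000 and not how any existing check does it.

```python
import subprocess

# Command that asks hpacucli for the status of all controllers.
HPACUCLI_CMD = ["hpacucli", "ctrl", "all", "show", "status"]

# Illustrative text only, NOT captured from a real MSA1000.
SAMPLE_OUTPUT = """\
Smart Array P400 in Slot 1
   Controller Status: OK
   Cache Status: OK
   Battery Status: Failed
"""


def read_status():
    """Run hpacucli and return its stdout (needs hpacucli installed)."""
    result = subprocess.run(HPACUCLI_CMD, capture_output=True,
                            text=True, check=True)
    return result.stdout


def parse_status(output):
    """Collect (name, state) pairs from lines like 'Cache Status: OK'."""
    pairs = []
    for line in output.splitlines():
        name, sep, state = line.partition(":")
        if sep and name.strip().endswith("Status"):
            pairs.append((name.strip(), state.strip()))
    return pairs


def nagios_result(pairs):
    """Map the parsed pairs to a (return code, text) tuple."""
    if not pairs:
        return 3, "UNKNOWN - no status lines found"
    bad = ["%s: %s" % pair for pair in pairs if pair[1] != "OK"]
    if bad:
        return 2, "CRITICAL - " + ", ".join(bad)
    return 0, "OK - %d status lines, all OK" % len(pairs)


if __name__ == "__main__":
    # Uses the sample text; swap in read_status() on a real host.
    code, text = nagios_result(parse_status(SAMPLE_OUTPUT))
    print(text)
```

On a real box you’d feed `read_status()` into `parse_status()` instead of the sample, and most likely want to look at the logical and physical drives as well.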
A: I know of 3 reasonable (as in the best money can buy) books on SNMP. These are:
- http://www.amazon.de/SNMPv2-RMON-Practical-Network-Management/dp/0201634791/ (warning, there are two editions of this one)
Don’t bother with any German-language ones, and if you can only get one, then take the first one (practical SNMPv3).
Q: What about that NagVis map and MTA?
I would have each symbol contain one NagVis icon that represents a service group or summary state for the service.
If someone thinks that turning this into a NagVis map wouldn’t speed up error analysis, especially with the services spread over dedicated hosts for spam filtering etc., then, well, the only difference between us is that he hasn’t had enough bad experiences yet :>