I’ve done a cleanup of the xen check for nagios / check_mk. The old one did not correctly handle VMs that were down and would confuse your main check_mk active check.
You can find the current version at: https://bitbucket.org/darkfader/nagios/src/8c95fbe779f2
This is now sorted out and working very well, even including “unstarted” VMs. I did most of the testing using libvirtd, though, and have now disabled it to investigate run time etc. The local agent plugin could still use a lot of work too. My script skeleton takes only 0.059s to run on a very slow host, but every call to “xm” takes about 0.55 seconds.
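If you want to compare the script's own overhead with the cost of each external call on your own host, something like the following rough sketch should do. It is not part of the check itself; the function name `mean_runtime` and the default of 5 runs are just my picks for illustration.

```python
import subprocess
import sys
import time


def mean_runtime(argv, runs=5):
    """Return the mean wall-clock seconds per call of argv over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        # Output is discarded; only the elapsed time is of interest here.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    return (time.perf_counter() - start) / runs


# Baseline: cost of starting a trivial process on this host.
baseline = mean_runtime([sys.executable, "-c", "pass"], runs=3)
print(f"baseline: {baseline:.3f}s per call")
```

On a Xen host you would call `mean_runtime(["xm", "list"])` and compare the result against the baseline. Note that it raises `FileNotFoundError` if the command is not installed.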
It seems to be less a Python issue than plain slowness of the Xenstore, which was well documented in the following list post to xen-devel. It seems Daniel tried twice to wake people up, but without success. I’ll try to verify the Xenstore performance issue and try the ramdisk hack, too.
That’s what it looks like on a 1.5GHz box:
[root@davexh0001 ~]# time for i in `seq 1 1000` ; do virsh list > /dev/null ; done
real 29m36.072s
user 0m8.509s
sys 0m15.545s
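To turn those numbers into a per-call figure, here is a small sketch that converts the `time` fields into seconds and divides by the 1000 iterations. The helper name `time_to_seconds` is made up for this example; the input values are copied from the output above.

```python
import re


def time_to_seconds(value):
    """Convert a bash `time` field such as '29m36.072s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


CALLS = 1000  # iterations of the `virsh list` loop above

real = time_to_seconds("29m36.072s")
cpu = time_to_seconds("0m8.509s") + time_to_seconds("0m15.545s")

per_call = real / CALLS   # wall-clock seconds per `virsh list`
cpu_share = cpu / real    # fraction of wall time spent on the CPU

print(f"{per_call:.3f}s per call, CPU share {cpu_share:.1%}")
```

That works out to roughly 1.8 seconds of wall-clock time per `virsh list`, with user and sys time together making up only a small fraction of it. That would fit the idea that the calls are mostly waiting on something rather than computing, though it doesn't prove where the time goes.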
On the “making things better” track, I’ve also written to the xen-devel list with a lot of questions which will hopefully help me implement the next few features.