Linux LVM mirroring comes at a price


You can find a nice article about LVM mirroring here: http://www.joshbryan.com/blog/2008/01/02/lvm2-mirrors-vs-md-raid-1

A reader had already tried to warn people, but I think it went unheard:

LVM is not safe in a power failure; it does not respect write barriers and pass them down to the lower drives.

Hence, it is often faster than MD by default, but to be safe you would have to turn off your drive’s write caches, which ends up making it slower than if you used write barriers.

First of all, he’s right. More on that below. Also, I find it kinda funny how he goes into turning off write caches. I was under the impression that NO ONE is crazy enough to have write caches enabled in their servers, unless they’re battery-backed and the local disk is only used for swap anyway. I mean, that was the one guy who at least knew about the barrier issue, and he thinks it’s safe to run with his cache turned on.

All the pretty little Linux penguins look soooo much faster – as long as we just disable all those safeguards that people built into Unix over the last 20 years 🙂
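
If you actually want to know where your drives stand, hdparm is the usual tool to query and flip the write cache. Here’s a rough little Python wrapper around it (device names are just examples, and of course it needs root):

#!/usr/bin/env python3
"""Rough sketch: query (and optionally disable) the write cache on a disk
via hdparm. Device names below are examples; adjust for your own hosts."""

import subprocess
import sys

def write_cache_enabled(device: str) -> bool:
    """Return True if hdparm reports the drive's write cache as on."""
    out = subprocess.run(
        ["hdparm", "-W", device],
        capture_output=True, text=True, check=True,
    ).stdout
    # hdparm prints a line like " write-caching =  1 (on)"
    return "(on)" in out

def disable_write_cache(device: str) -> None:
    """Turn the drive write cache off (hdparm -W0)."""
    subprocess.run(["hdparm", "-W0", device], check=True)

if __name__ == "__main__":
    for dev in sys.argv[1:] or ["/dev/sda"]:
        if write_cache_enabled(dev):
            print(f"{dev}: write cache is ON - disabling")
            disable_write_cache(dev)
        else:
            print(f"{dev}: write cache already off")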

Anyway, back to LVM mirrors!

We just learned: all devicemapper-based IO layers in Linux can/will lose barriers.
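
A quick way to see whether a filesystem has already given up on barriers is to grep the kernel log; ext3/JBD prints a “… disabling barriers” warning when a barrier write fails on a device, e.g. on a device-mapper target that drops them (the exact wording depends on your kernel and filesystem version). Something like this little sketch will do:

#!/usr/bin/env python3
"""Quick-and-dirty sketch: scan the kernel log for filesystems that gave up
on write barriers. The exact message text varies by kernel and filesystem."""

import subprocess

def barrier_failures() -> list[str]:
    """Return kernel log lines that mention barriers being disabled."""
    dmesg = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    return [line for line in dmesg.splitlines() if "disabling barriers" in line]

if __name__ == "__main__":
    hits = barrier_failures()
    if hits:
        print("Barriers were disabled on at least one device:")
        for line in hits:
            print(" ", line)
    else:
        print("No 'disabling barriers' messages found (check mount options too).")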

Furthermore, LVM2 has its own set of issues, and it’s important to choose wisely. I think these are the most notable items that can give you lots of trouble in a mirror scenario:

  • no sophisticated mirror write consistency (and worse for people who are using --corelog; see the sketch after this list)
  • only trivial mirror policies
  • no good per LE-PE sync status handling
  • (no PV keys either? – PV keys are used to hash LE-PE mappings independent of PVID)
  • limited number of mirrors (this can turn into a problem if you wanna move data with added redundancy during the migration)
  • no safe physical volume status handling
  • too many userspace components that will work fine as long as everything is ok but can die on you if something is broken
  • no reliable behaviour on quorum loss (the VG should not activate, optionally the server should panic upon quorum loss, but at LEAST vgchange -a y should be able to re-establish the disks once they’re back). I sometimes wonder if LVM2 even knows what a quorum is?!
  • On standard distros nothing hooks into the lvm2 udev event handlers, so there are no reliable monitors for your status. Besides, the lvm2 monitors suck and seem to still be in a proof-of-concept state…
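
For the --corelog point above: if you do use lvm2 mirrors, at least keep the mirror log on disk so a crash doesn’t force a full resync. A minimal sketch (VG name, LV name and size are made-up examples):

#!/usr/bin/env python3
"""Sketch only: create an LVM2 mirror with an on-disk mirror log instead of
--corelog. VG name, LV name and size are hypothetical examples."""

import subprocess

VG = "vg_data"        # hypothetical volume group
LV = "lv_mirrored"    # hypothetical logical volume
SIZE = "10G"

def create_mirror_with_disk_log() -> None:
    # -m 1             -> one mirror copy (two legs in total)
    # --mirrorlog disk -> keep the mirror log on disk; --corelog (= --mirrorlog core)
    #                     keeps it in memory only and forces a full resync after a crash
    subprocess.run(
        ["lvcreate", "-m", "1", "--mirrorlog", "disk",
         "-L", SIZE, "-n", LV, VG],
        check=True,
    )

if __name__ == "__main__":
    create_mirror_with_disk_log()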

Since barriers are simply dropped in the devicemapper (not in LVM itself, btw), you should choose wisely whether to use lvm2 mirrors for critical data mirroring.

Summary:

  • LVM mirror may look faster, but it comes at a price
  • Things tend to be slower if they do something the proper way.

Of course, if you’re using LVM on top of MD you *also* lose barriers.
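
Just to illustrate the stack I mean: MD does the mirroring at the bottom and LVM sits on top of the MD device for management. Device, VG and LV names are made up; obviously don’t run this against disks you care about.

#!/usr/bin/env python3
"""Illustrative sketch of the LVM-on-top-of-MD stack: an MD RAID1 is created
first and then used as the single PV of a volume group. Names are examples."""

import subprocess

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def build_lvm_on_md() -> None:
    # 1. Mirror at the MD layer (classic md raid1)
    run("mdadm", "--create", "/dev/md0", "--level=1",
        "--raid-devices=2", "/dev/sdb1", "/dev/sdc1")
    # 2. Put LVM on top of the MD device for flexible management
    run("pvcreate", "/dev/md0")
    run("vgcreate", "vg_on_md", "/dev/md0")
    run("lvcreate", "-L", "10G", "-n", "lv_data", "vg_on_md")

if __name__ == "__main__":
    build_lvm_on_md()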

Usually we can all live pretty well with either of those setups, but we should be aware that there are problems and that we opted for manageability/performance over integrity.

Personally, I see the management advantages of LVM as high enough to accept the risk of FS corruption. I think the chance of losing data is much higher when I manually mess around with fdisk or parted and MD every time I add a disk, etc.
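
For comparison, this is the “add a disk” routine that makes LVM so comfortable for me: no fdisk/parted juggling, just extend the VG and grow the LV. Names and the new device path below are hypothetical.

#!/usr/bin/env python3
"""Sketch of the usual 'add a disk' workflow with LVM. VG/LV names and the
new device path are made-up examples."""

import subprocess

VG = "vg_data"
LV = "lv_data"
NEW_PV = "/dev/sdd"   # the freshly added disk

def grow_with_new_disk() -> None:
    subprocess.run(["pvcreate", NEW_PV], check=True)
    subprocess.run(["vgextend", VG, NEW_PV], check=True)
    # Grow the LV by all the free extents the new disk brought in
    subprocess.run(["lvextend", "-l", "+100%FREE", f"/dev/{VG}/{LV}"], check=True)
    # (The filesystem on top still needs its own resize, e.g. resize2fs.)

if __name__ == "__main__":
    grow_with_new_disk()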

If it were very critical data, you could either replicate in the storage array (without LVM and multipath??????) or scrape up the money for a Veritas FS/Volume Manager license (unless you’re a Xen user like me… 😦 )

either way…:

SET UP THE MONITORING.
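
Since the built-in lvm2 monitoring is, well, see above, here is a minimal cron-able stand-in that just parses lvs output and complains when a mirror isn’t 100% in sync. It’s a sketch under the assumption that you only care about mirrored LVs, not a replacement for real monitoring (dmeventd and friends):

#!/usr/bin/env python3
"""Minimal monitoring sketch: warn when an lvm2 mirror's copy percentage
drops below 100. Field names follow the lvs report columns."""

import subprocess
import sys

def check_mirrors() -> int:
    out = subprocess.run(
        ["lvs", "--noheadings", "--separator", "|",
         "-o", "vg_name,lv_name,lv_attr,copy_percent"],
        capture_output=True, text=True, check=True,
    ).stdout
    problems = 0
    for line in out.splitlines():
        line = line.strip()
        if not line:
            continue
        vg, lv, attr, copy_pct = [f.strip() for f in line.split("|")]
        if not attr or attr[0] not in ("m", "M"):
            continue          # only look at mirrored LVs
        if not copy_pct or float(copy_pct) < 100.0:
            print(f"WARNING: {vg}/{lv} mirror is only {copy_pct or '?'}% in sync")
            problems += 1
    return problems

if __name__ == "__main__":
    sys.exit(1 if check_mirrors() else 0)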

 

A little update here:

According to the LVM article on wikipedia.com, kernels from 2.6.31 onwards do handle barriers correctly even with LVM. On the downside, that article only covers Linux LVM and imho has a lot of factual errors, so I’m not sure I’ll just go and be a believer now.
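
If you want a dumb sanity check for that claim on your own boxes, comparing the running kernel version against 2.6.31 is trivial. This only looks at the version string, not at the actual barrier behaviour:

#!/usr/bin/env python3
"""Trivial sketch: check whether the running kernel is at least 2.6.31, the
version said to pass barriers through device-mapper. Version check only."""

import platform

def kernel_at_least(major: int, minor: int, patch: int) -> bool:
    release = platform.release()              # e.g. "2.6.32-5-amd64"
    numeric = release.split("-")[0]
    parts = [int(p) for p in numeric.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts) >= (major, minor, patch)

if __name__ == "__main__":
    if kernel_at_least(2, 6, 31):
        print("Kernel >= 2.6.31: barriers should survive device-mapper.")
    else:
        print("Kernel < 2.6.31: assume barriers get dropped by device-mapper.")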
