No-copy extracting Xen VM tarballs to LVM


SUSE Studio delivers Xen VM images, which is really nice. They contain a sparse disk image and a (mostly incomplete) VM config file. Since I’m updating them pretty often, I needed a hack that avoids any unneeded copies and doesn’t require scratch space either.

Goal: save copy time and improve quality of life instead of copying and waiting…

First, let’s have a look at the contents and then check out how to extract them directly…

(Oh. Great. Shitbuntu won’t let me paste here)

 

Well, great.

In my case the disk image is called:

SLES_11_SP3_JeOS_Rudder_client.x86_64-0.0.6.raw

It’s located in a folder named:

SLES_11_SP3_JeOS_Rudder_client-0.0.6/

 

So, what we can do is this:

First, set up some variables so we can shrink the command later on…

version=0.0.6
appliance=SLES_11_SP3_JeOS_Rudder_client
url=https://susestudio.com/...6_64-${version}.xen.tar.gz
folder=${appliance}-${version}
vmimage=${appliance}.x86_64-${version}.raw
lv=/dev/vgssdraid5/lvrudderc1

Then, tie it together to store our VM data.

wget -O- $url | tar -O -xzf - ${folder}/${vmimage} | dd of=$lv bs=1024k

Storing to a file at the same time:

wget -O- $url | tee /dev/shm/myfile.tar.gz | tar -O -xzf - ${folder}/${vmimage} |\
dd of=$lv bs=1024k

 

Wget fetches the file and writes it to STDOUT; tar reads STDIN, extracts only the image file and writes the extracted data to STDOUT, which dd then buffers and writes to the LV.

 

If, like me, you’ll reuse the image for multiple VMs, you can also write it to /dev/shm and, if RAM allows, gunzip it there as well. The gzip extraction is actually what limits the performance, and even tar itself seems to be a little slow. I only get around 150MB/s out of this.
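
Here’s a minimal sketch of that reuse variant, assuming the variables from above, a /dev/shm big enough for the decompressed tarball, and a second LV path that is purely made up:

# decompress the tarball into RAM once...
wget -O- $url | gunzip > /dev/shm/${appliance}.tar

# ...then extract it straight onto each LV (the second LV is just a placeholder)
for lv in /dev/vgssdraid5/lvrudderc1 /dev/vgssdraid5/lvrudderc2; do
    tar -O -xf /dev/shm/${appliance}.tar ${folder}/${vmimage} | dd of=$lv bs=1024k
done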

I do remember that the sparse image gets flattened out while storing to LVM, but I’m not sure if / how that influences the performance.
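
One untested idea regarding the sparse blocks: GNU dd has a conv=sparse flag that seeks over all-zero input blocks instead of writing them. On a block device that only makes sense if the LV is known to read back as zeros already, otherwise old data would shine through in the holes:

# only safe if the LV already contains zeros where the image is sparse
wget -O- $url | tar -O -xzf - ${folder}/${vmimage} | dd of=$lv bs=1024k conv=sparse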

 

(Of course none of this would be necessary if the OSS community hadn’t tried to ignore / block / destroy standards like OVF as much as they could. Instead OVF is complex, useless and unsupported. Here we are.)


LVM Mirroring #2


Hmm, people still look at my ages-old post about LVM all the time.

So, just a note from end-2013:

The mirror consistency stuff is not your worst nightmare anymore.

Barriers work these days, and I think it’s more important to concentrate on ext4 settings like “block_validity”. The chance of losing data due to an LVM mirror issue is much lower than the chance of unnoticed data loss in ext4 🙂
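
For reference, a hedged example of turning that on (device and mountpoint are placeholders):

# enable ext4's internal block/extent sanity checking on an existing mount
mount -o remount,block_validity /dev/vg0/lvdata /data
# or persistently via /etc/fstab:
# /dev/vg0/lvdata  /data  ext4  defaults,block_validity  0  2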

My LVM pain points, as of today, would be:

lvm.conf is a huge patchwork of added features; there should be an LVM maintainer who oversees the structure as features are added.

Instead it’s like a castle with a lot of wooden gangways (mirrorlog devices) and stairs (thin provisioning) bolted to the outside, and no windows (read up on the “fsck” utility for thin pools, and TRY what happens if a pool runs full and recover from it).

Some features require planning ahead, and the way it works now does not support that.

Reporting is still as bad as it used to be.

I’d be happy for someone to show me how they split out a snapshot + PV to a backup host, bring it back AND get a fast resync.

(Note: the PV UUID wouldn’t change in this scenario. So if it doesn’t work, that hints at design flaws.)

Those are the pieces I worry about. And really, the way the project is adding features without specs, documentation and (imho) oversight makes it look like some caricature of a volume manager.

How I feel about that:

Just look, I have added the feature the others were talking about.

And look at THIS: I now have an arm on my ass so I can scratch between my shoulders, too!

Example: LVM2 did away with the per-LV header that classic LVM had, so you don’t have a resource area to debug with, and there’s no support for BBR or per-LV mirror write consistency via the LV header. But instead they added an -optional- feature that wipes the start of an LV. So, if you lose your config and rebuild an LV manually on the exact same sectors, but with a newer LVM config, it’ll wipe out the first meg of the LV.

A volume manager that, after the initial design, introduces a kind of LV format change and makes it WIPE DISK BLOCKS. I don’t care how smart you think you are: whoever came up with this should get the old Mitnick jail sentence: forbidden to use a computer.
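
If I read the lvcreate man page right, that initial wipe can at least be skipped with -Z n when you rebuild an LV by hand over data you want to keep (name and size below are made up):

# recreate the LV without the default zeroing of its start
lvcreate -Z n -n lvrestored -L 20G vgssdraid5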

The bad layering of PV/LV/VG I also still care about.

Storage management in the sense I’m used to is something I still don’t touch with LVM.

On the other hand I’m itching daily to actually make prod use of those exact modern features 🙂

But basically I just use it to carve out volumes, and instead of pvmove I might use more modern, powerful tools like blocksync/lvmsync and work with new VGs.

Also, just to be clear:

I’m not saying “don’t use LVM” – I have it on almost every system and hate the ones without it. I’m just saying it’s not delivering the quality needed for consistently successful datacenter usage. If you set up your laptop with a 1TB root volume and no LVM and then have some disk-related issue, well, I’ll probably laugh and get you some Schnaps.

That being said, I wanted to write about more modern software that is actually fun to look at, next post 🙂

Linux LVM mirroring comes at a price


You can find a nice article about clvm mirroring here: http://www.joshbryan.com/blog/2008/01/02/lvm2-mirrors-vs-md-raid-1

A reader had already tried to warn people, but I think it went unheard:

LVM is not safe in a power failure, it does not respect write barriers and pass those down to the lower drives.

hence, it is often faster than MD by default, but to be safe you would have to turn off your drive’s write caches, which ends up making it slower than if you used write barriers.

First of all, he’s right. More on that below. Also, I find it kinda funny how he goes into turning off write caches. I was under the impression that NO ONE is crazy enough to have write caches enabled in their servers, unless they’re battery-backed and the local disk is only used for swap anyway. I mean, that was the one guy who at least knew about the barrier issue, and he thinks it’s safe to run with his cache turned on.
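
For the record, a hedged example of actually turning a drive’s cache off (the device name is a placeholder, and on real RAID controllers the tooling is different):

hdparm -W 0 /dev/sdX    # disable the drive's volatile write cache
hdparm -W /dev/sdX      # check what the current setting is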

All the pretty little linux penguins look soooo much faster – as long as we just disable all those safeguards that people built into unix over the last 20 years 🙂

Anyway, back to LVM mirrors!

We just learned: All devicemapper based IO layers in Linux can/will lose barriers.

Furthermore, LVM2 has its own set of issues, and it’s important to choose wisely – I think these are the most notable items that can give you lots of trouble in a mirror scenario:

  • no sophisticated mirror write consistency (and worse, people who are using --corelog)
  • only trivial mirror policies
  • no good per LE-PE sync status handling
  • (no PV keys either? – PV keys are used to hash LE-PE mappings independent of PVID)
  • limited number of mirrors (this can turn into a problem if you wanna move data with added redundancy during the migration)
  • no safe physical volume status handling
  • too many userspace components that will work fine as long as everything is ok but can die on you if something is broken
  • no reliable behaviour on quorum loss (the VG should not activate, optionally the server should panic upon quorum loss, but at LEAST vgchange -a y should be able to re-establish the disks once they’re back; see the sketch after this list). I sometimes wonder if LVM2 even has a notion of a quorum?!!
  • On standard distros nothing hooks into the lvm2 udev event handlers, so there are no reliable monitors for your status. Besides, the lvm2 monitors themselves still seem to be in a proof-of-concept state…
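
To illustrate that quorum point, here’s a hedged sketch of what re-establishing a VG with a missing PV actually seems to involve today (VG/PV/LV names are placeholders, and the exact options depend on your lvm2 version):

vgchange -ay --partial vg0           # activate despite missing PVs (degraded)
# ...later, once the missing disk shows up again:
vgextend --restoremissing vg0 /dev/sdb1
lvconvert --repair vg0/lvmirrored    # rebuild the broken mirror legs
# or give up on the PV for good:
vgreduce --removemissing vg0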

Since barriers are simply dropped in the devicemapper (not in LVM itself, btw), you should choose wisely whether to use lvm2 mirrors for critical data mirroring.

Summary:

  • LVM mirror may look faster, but it comes at a price
  • Things tend to be slower if they do something the proper way.

Of course, if you’re using LVM on top of MD you *also* lose barriers.

Usually we can all live pretty well with either of those settings, but we should be aware that there are problems and that we opted for manageability / performance over integrity.

Personally, I see the management advantages of LVM as high enough to accept the risk of FS corruption. I think the chance of losing data is much higher when I manually mess around with fdisk or parted and MD every time I add a disk, etc.

If it’s very critical data, you can either replicate in the storage array (without LVM and multipath??????) or scratch up the money for a Veritas FS/Volume Manager license (unless you’re a Xen user like me… 😦 )

Either way…:

SET UP THE MONITORING.
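
As a starting point, here’s a minimal hedged sketch of what such a check could look like from cron (the vg_attr field position is as I read the man page; adjust names and wire the output into your alerting):

#!/bin/bash
# mirrors that are not fully in sync
lvs --noheadings -o vg_name,lv_name,copy_percent 2>/dev/null \
  | awk 'NF==3 && $3+0 < 100 {print "mirror out of sync: " $1 "/" $2 " (" $3 "%)"}'
# volume groups with missing PVs (4th vg_attr character is "p" for partial)
vgs --noheadings -o vg_name,vg_attr \
  | awk 'substr($2,4,1)=="p" {print "VG with missing PV: " $1}'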

 

A little update here:

According to the LVM article on Wikipedia, kernels from 2.6.31 onwards do handle barriers correctly even with LVM. On the downside, that article only covers Linux LVM and imho has a lot of factual errors, so I’m not sure I’ll just go and become a believer now.