Part four: Storage migration, disaster recovery and friends


This article was also published a little too early… 🙂

 

A colleague (actually, my team lead) and I set out to build a new, FreeBSD-based storage domU.

 

The steps we took:

Updated RAID firmware

We re-flashed my M5015 RAID controller to more current, non-IBM firmware, primarily hoping this would enable the SSDs' write cache. That didn't work. The flash itself was a little easier than expected since I had already done parts of the procedure.

Your most important command for this is “AdpAllInfo”.
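For reference, this is roughly how it looks with MegaCli; the exact binary name and path vary by package (MegaCli64, /opt/MegaRAID/…), and the firmware image name below is only a placeholder:

# dump everything the adapter knows about itself (firmware level, cache policies, ...)
MegaCli -AdpAllInfo -aALL

# flash a new image onto adapter 0 (the .rom filename is a placeholder)
MegaCli -AdpFwFlash -f mr2108fw.rom -a0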

 

Created RAID LUNs

We then created a whole bunch of RAID10 LUNs over 4 of the SSDs (a rough example of the command follows the list):

  • 32GB for the storage domU OS
  • 512MB for testing a controller-ram buffered SLOG
  • 16GB ZIL
  • 16GB L2ARC
  • 600-odd GB “rest”
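If you want to follow along, creating such a LUN looks roughly like this in MegaCli; the enclosure:slot IDs and the -sz size are assumptions, so adapt them to your own drives:

# RAID10 = two mirrored spans of two SSDs each; -sz carves out a LUN of that many MB,
# additional -sz options carve further LUNs out of the same array
MegaCli -CfgSpanAdd -r10 -Array0[252:0,252:1] -Array1[252:2,252:3] -sz32768 -a0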

Configure PCI passthrough in Xen

There were a few hiccups: the kernel command line just wouldn't take effect, and neither modprobe.d nor /etc/modules did the job on their own.

This is what we actually changed…

First, we obtained the right PCI ID using lspci (apk add pciutils)

daveh0003:~# lspci | grep -i lsi

01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 03)

in /etc/modules:

xen_pciback

in /etc/modprobe.d/blacklist we added:

blacklist megaraid_sas

in /etc/modprobe.d/xen-pciback.conf:

options xen-pciback hide=(0000:01:00.0)

in /etc/update-extlinux.conf:

default_kernel_opts="modprobe.blacklist=megaraid_sas quiet"

We had also tried:

#default_kernel_opts="xen-pciback.hide='(01:00.0)' quiet"

(BTW, not escaping the parentheses can cause the busybox/OpenRC init to crash!!)
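One detail worth spelling out for anyone following along on Alpine: /etc/update-extlinux.conf only feeds the generator, so the change still has to be written into the actual bootloader config before a reboot will pick it up:

# regenerate /boot/extlinux.conf from /etc/update-extlinux.conf
update-extlinux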

And, last but not least, I gave up in annoyance and put the following into /etc/rc.local:

# detach the controller from whatever driver dom0 currently has bound to it
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind

# hand the device to xen-pciback so it can be passed through
echo 0000:01:00.0 > /sys/bus/pci/drivers/pciback/new_slot
echo 0000:01:00.0 > /sys/bus/pci/drivers/pciback/bind

(And even this doesn't work without me calling it manually. It will take many more hours to get this to a state where it just works. If you ever wonder where the price of VMware is justified… every-fucking-where.)
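For what it's worth, once the rebind has happened you can check whether Xen actually considers the device assignable, and the domU then gets it via a pci= line in its config; syntax as I remember it, double-check against your Xen version:

# list the devices xen-pciback currently holds
xl pci-assignable-list

# or hand the device over on the fly instead of the sysfs echo dance
xl pci-assignable-add 0000:01:00.0

# and in the storage domU's config file:
# pci = [ '01:00.0' ]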

FreeBSD storage domU

The storage domU is a pretty default install of FreeBSD 10 to a 32GB LUN on the RAID.

During the install DHCP did not work ($colleague had also run into this issue), so we just used a static IP… While the VM is called “freesd3”, I also added a CNAME “stor” for easier access.

The zpools are (a rough creation sketch follows the list):

  • zroot (the VM itself)
  • zdata (SSD-only)
  • zdata2 (Disk fronted by SSD SLOG and L2ARC)
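Roughly, creating those pools looks like this; the mfid device names here are placeholders, and getting them right is exactly the tricky part, as you'll see in a moment:

# zdata: pool on the big SSD-only LUN (device names are placeholders!)
zpool create zdata mfid4

# zdata2: the large disk LUN, fronted by the dedicated SLOG and L2ARC LUNs
zpool create zdata2 mfid5 log mfid2 cache mfid3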

I turned on ZFS compression on most of those, per dataset name, e.g.:

zfs set compression=lz4 zroot/var

VMs can later access this using iSCSI or as a Xen block device (we’ll get to that later!)

Now, for the actual problem: during installation, the device IDs had shifted. On FreeBSD this is highly uncommon to see, and you *really* want to believe it's a Linux-only issue. Well, not true.

Install

We selected “mfid0”, which should have been the 32GB OS LUN…

This is what MegaCli shows:

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Size                : 32.0 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 1 (Target Id: 1)
Size                : 3.182 TB
Sector Size         : 512
State               : Optimal
Strip Size          : 64 KB
Number Of Drives per span:2
Virtual Drive: 2 (Target Id: 2)
Size                : 512.0 MB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 3 (Target Id: 3)
Size                : 16.0 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 4 (Target Id: 4)
Size                : 64.0 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 5 (Target Id: 5)
Size                : 630.695 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2

Note that the logical drive IDs and Target IDs match up just fine!

 

 

The OS side:

Please compare to what FreeBSD’s mfi driver assigns…

mfid0: 32768MB (67108864 sectors) RAID volume (no label) is optimal
mfid1: 512MB (1048576 sectors) RAID volume (no label) is optimal
mfid2: 16384MB (33554432 sectors) RAID volume (no label) is optimal
mfid3: 65536MB (134217728 sectors) RAID volume (no label) is optimal
mfid4: 645832MB (1322663936 sectors) RAID volume (no label) is optimal
mfid5: 3337472MB (6835142656 sectors) RAID volume 'raid10data' is optimal
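In hindsight, the mapping is easy to cross-check from inside FreeBSD before touching anything, e.g. with mfiutil(8) from base, which shows the controller's own view of the volumes:

# list the logical volumes as the controller reports them (size, RAID level, state)
mfiutil show volumes

# and the physical drives behind them
mfiutil show drives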

At install time it was cute enough to *drumroll* assign the 3.x TB LUN as mfid0. So we installed FreeBSD 10 onto the LUN that stores my VMs.

That, of course, killed the LVM headers and a few gigabytes of data.

 

My next post will skip over reinstalling to the right LUN (identified from a live CD system) and instead describe how I went about getting the data back.

 
