happy about VM compression on ZFS

Over the course of the weekend I’ve switched one of my VM hosts to the newer recommended layout for NodeWeaver – this uses ZFS, with compression enabled.

First, let me admit some things you might find amusing:

  • I found I had forgotten to add back one SSD after my disk+l2arc experiment
  • I found I had one of the two nodes plugged into its 1ge ports, instead of using the 10ge ones.

The switchover looked like this

  1. pick a storage volume to convert, write down the actual block device and the mountpoint
  2. tell LizardFS i’m going to disable it (prefix it with a * and kill -1 the chunkserver)
  3. Wait a bit
  4. tell LizardFS to forget about it (prefix with a # and kill -1 the chunkserver)
  5. umount
  6. ssh into the setup menu
  7. select ‘local storage’ and pick the now unused disk, assign it to be a ZFS volume
  8. quit the menu after successful setup of the disk
  9. kill -1 the chunkserver to enable it
  10. it’ll be visible in the dashboard again, and you’ll also see it’s a ZFS mount.

Compression was automatically enabled (lz4).

I’ve so far only looked at the re-replication speed and disk usage.

Running on 1ge I only got around 117MB/s (one of the nodes is on LACP and the switch can’t do dst+tcp port hashing so you end up in one channel.

Running on 10ge I saw replication network traffic to go up to 370MB/s.

Disk IO was lower since the compression already kicked, and the savings have been vast.

[root@node02 ~]# zfs get all | grep -w compressratio
srv0  compressratio         1.38x                  -
srv1  compressratio         1.51x                  -
srv2  compressratio         1.76x                  -
srv3  compressratio         1.53x                  -
srv4  compressratio         1.57x                  -
srv5  compressratio         1.48x                  -

I’m pretty sure ZFS also re-sparsified all sparse files, the net usage on some of the storages went down from 670GB to around 150GB.


I’ll probably share screenshots once the other node is also fully converted and rebalancing has happened.

Another thing I told a few people was that I ripped out the 6TB HDDs once I found that $customer’s OpenStack cloud performed slightly better than my home setup.

Consider that solved. 😉

vfile:~$ dd if=/dev/zero of=blah bs=1024k count=1024 conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.47383 s, 434 MB/s

(this is a network-replicated write…

make sure to enable the noop disk scheduler before you do that.

Alternatively, if there’s multiple applications on the server (i.e. a container hosting VM), use the deadline disk scheduler with the nomerges option set. That matters. Seriously 🙂


Happy hacking and goodbye

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s