First look at the UP-board

I’ve finally got two UP Boards. After they arrived I also ordered another “Mean Well” dedicated 12V rail-mount PSU and some 2.1mm power cables.

The boards are nice little things with a lot of CPU power. Quad Atom with some cache, eMMC and enough RAM!

Photos, dmesg, hwinfo etc can be found here:

The basics:

My models have 2GB of ram which is shared with the onboard graphics.

They have a big front and back cooling plate; for hardcore usage there’s also an active fan in their shop.

Connectors: USB 2.0, 3.0, 3.0 OTG. The latter is a MacBook Air “style” Type-C flat connector. There’s also power (via a 2.1mm plug), HDMI and some other stuff I didn’t understand.

There’s one connector that has an industrial-style plug. This port exposes 2x USB and a serial port with BIOS forwarding. You should give in and just buy the matching cable, there’s no way you’ll easily find the plug on your own.

You’ll need this cable unless you only plan on desktop use. It doesn’t come with an FTDI serial adapter, so also make sure to get one of those.

The MMC reads at up to 141MB/s (pretty nice) and writes (fdatasync) up to 61MB/s (also pretty OK). TRIM does work.
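For reference, a write test of this kind can be sketched as follows. It assumes GNU dd; the function name, the 256MB default and the example path are placeholders, not the exact command used for the numbers above.

```shell
#!/bin/sh
# Sketch: sequential write test that includes the final flush in the timing.
# write_test FILE [SIZE_MB]   (function name and default size are assumptions)
write_test() {
    file="$1"
    size_mb="${2:-256}"
    # conv=fdatasync makes dd call fdatasync() once before it exits,
    # so the reported rate is not just page-cache speed.
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fdatasync
}

# Example (commented out; writes 256MB to the given scratch file):
# write_test /root/ddtest.bin 256
```

dd prints the throughput on stderr when it finishes. Remove the scratch file afterwards.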

The LAN interface is just a Realtek, connected via PCIe (2.5GT/s, x1).

BIOS stuff

On boot you’re greeted by a normal EFI shell, which reminded me of my late HP-UX days, except here there is no SAN boot scan.

Pressing F7 gives you a boot menu which always also allows going to BIOS Setup, which is a normal phoenix-style menu. Very small and simple – that’s nice.

Serial forwarding is supported, I didn’t try netbooting yet.

OS (ubilinux)

I installed their “default” distro. This is done by flashing the ISO to a stick (or putting it on a CD). Take care to use a USB 2.0 connector if it’s a USB3 stick, or it won’t be detected(!)

The grub menu was really slow, while the BIOS had been quick.

Limiting the video ram to 64MB + UHD screen brought me a system that stopped working once X was up. I didn’t investigate that, instead I booted to single user mode and told systemd to make that a default (systemctl

Ubilinux is a Debian Jessie (sigh) but with some parts scrapped from Ubuntu (sigh).

It works and has all the stuff to e.g. access the GPIO connectors.

lm_sensors detected the coretemp CPU sensors, nothing else.

AES-NI was autoloaded.

The only thing I couldn’t make work yet was the hardware watchdog, which is an issue split between systemd, packaging and probably something else.

This one gets a 9/10 which is rare 🙂

Your network is probably owned – What to do next?

I’ll try to summarize my thoughts after the pretty shocking 31C3 talk.

The talk was this one: Reconstructing Narratives.

This trip to 31C3 was meant to be a normal educational excursion but it is now just depressing. The holes the NSA & friends rip into the networks we are looking after are so deep it’s hard to describe.

Our democratic governments using the data gathered for KILL LISTS of people, even assigning a “kill value” as in how many people are legit to kill if it helps the matter. This is something I can’t yet fit into my head. The political and technical aspects are covered on

Note that the info there will be extended in 3 weeks since there will be another drop of info regarding malware aspects.

Personally, I’m not feeling well just over what I heard there and I’m grateful they didn’t come around to the malware list.

Now I’ll go ahead on the tech side and talk about what you should consider. We NEED to clean up our networks.

This is not a check list. It is a list to start from.

Your admin workstation:

  • Buy a new one. Install Qubes as per
  • If your box runs it nicely, submit it to their HCL.
  • I talked to Joanna before this shaking talk, and I’ll write about my “interview” at a later time.
  • Use the Tor VM or another box with Tails for your FW downloads
  • I wish coreboot was actually usable, if you can help on that end, please do it.

Point of Administration MATTERS

  • IPSEC VPN with preshared keys: Not safe
  • IPSEC VPN: Should be safe?
  • PPTP VPN: (Obviously) Not safe
  • SSH: VERY VERY questionable
  • ISDN Callback: Sorry, that was only safe before IP was the standard. And maybe not then

So basically, if your servers aren’t in the cloud but in your basement, THAT IS A GOOD THING.

Really sorry but it has to be said.


  • wipe your ssh host keys, regenerate them
  • Don’t use less than 4k keys.
  • include the routers and other networking equipment.
  • Drop ALL your admin keys
  • Regenerate them monthly
  • Be prepared to re-key once we find out what SSH ECDSA-style option is actually safe

SSH adjustments are now described very well at the following github url:
stribika – Secure Secure Shell


change passwords!

This sounds funny and old, but since any connection you have ever made might get decrypted at a later time, you should consider all of them compromised.
I think it would also be a good thing[tm] to have different passwords on the first line of jump hosts than on the rest of the systems.

Yes, keys seem safer. But I’ve been talking about passwords, which includes issues like keystroke timing attacks on password-based logins to systems further down the line.
Of course this also applies to public keys; e.g. don’t overly enjoy agent forwarding. I’d rather not allow my “jump host login” key on the inner ring of systems.

Password management:

It seems the tool from Bruce Schneier is rather safe, I’d move away from the “common” choices like KeePassX.

Info / Download:


Make BIOS reflashing a POLICY.

Random number generators:

Expect that you will need to switch them. Personally I THINK you should immediately drop the comforts of haveged.


It was recommended more than one time.

Start using it more and more, putting more stuff in it than you’d have done till today.

Switches and routers:

Your network is NOT your friend.

  • IP ACLs are really a good thing to consider and piss off intruders
  • A good tool to set ACLs globally on your hardware is Google’s capirca. Shorewall etc. is more on the “nice for a host” level. We have come a long way with host based firewalls, but…
  • Think harder about how to secure your whole network. And how to go about replacing parts of it.

We can’t be sure which of our LAN active components are safe, your WAN probably IS NOT.


We really need to have PFS (perfect forward secrecy) more widespread.

Talk it over with your clients, how much ongoing damage is acceptable for helping the helpless XP users.

Guest WIFI

Do NOT run a flat home network.

Additions welcome, comment if you know something to *advance* things.

Xen Powermanagement

Hi all,

this is a very hot week and the sun is coming down on my flat hard. Yet, I’m not outside having fun: Work has invaded this Sunday.

I ran into a problem: I need to run some more loaded VMs but it’s going to be hotter than usual. I don’t wanna turn into a piece of barbeque. The only thing I could do is to turn my Xen host’s powersaving features to the max.

Of course I had to write a new article on power management in the more current Xen versions from that… 🙂

Find it here: Xen Power management – for current Xen.

When I saved it I found that I also have an older one (which I wasn’t aware of anymore) that covers the Xen 3.4 era.

Xen full powersaving mode – for Xen 3.x




Did you know those settings only take a mouse click in VMWare?

Not enough desks to sufficiently bang your head on.

I’m not convinced. I have some LSI cards in several of my boxes and they
very consistently drop disks that are running SMART tests in the background.
I have yet to find a firmware or driver version that corrects this.


This is the kind of people I wish I never had to deal with.

(If not obvious: If your disks drop out while running SMART tests, look at the disks. There are millions of disks that handle this without issue. If yours drop out, they’re having a problem. Even if you think it doesn’t matter or even if it’s only showing with the controller. It doesn’t matter. Stuff it up.

I’m utterly sick of dealing with “admins” like this.)

Part four: Storage migration, disaster recovery and friends

This article also was published a little too early….. 🙂


A colleague (actually, my team lead) and I set out to build a new, FreeBSD based storage domU.


The steps we did:

Updated Raid Firmware

Re-flashing my M5015 Raid controller to more current, non-IBM firmware. We primarily hoped this would enable the SSDs’ write cache. That didn’t work. The flashing itself was a little easier than expected since I had already done parts of the procedure.

Your most important command for this is “Adpallinfo”


Created Raid Luns

We then created a large bunch of Raid10 luns over 4 of the SSDs.

  • 32GB for the storage domU OS
  • 512MB for testing a controller-ram buffered SLOG
  • 16GB ZIL
  • 16GB L2ARC
  • 600odd GB “rest”

Configure PCI passthrough in Xen

There were a few hiccups: the kernel command line option just wouldn’t take effect, nor did using modprobe.d and /etc/modules do the job on their own.

This is what we actually changed…

First, we obtained the right PCI ID using lspci (apk add pciutils)

daveh0003:~# lspci | grep -i lsi

01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 03)

in /etc/modules:


in /etc/modprobe.d/blacklist added:

blacklist megaraid_sas

in /etc/modprobe.d/xen-pciback.conf

options xen-pciback hide=(0000:01:00.0)

in /etc/update-extlinux.conf

default_kernel_opts="modprobe.blacklist=megaraid_sas quiet" – we had also tried

#default_kernel_opts="xen-pciback.hide='(01:00.0)' quiet"

(btw, not escaping the parentheses can cause busybox/openrc init to crash!!)

and, last but not least, I gave up in annoyance and put some stuff in /etc/rc.local

echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind

echo 0000:01:00.0 > /sys/bus/pci/drivers/pciback/new_slot

echo 0000:01:00.0 > /sys/bus/pci/drivers/pciback/bind

(and even this isn’t working without me manually calling it. It will take many more hours to get this to a state where it just works. If you ever wonder where the price of VMWare is justified… every-fucking-where)
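A slightly more defensive variant of the three rc.local lines above, as a sketch only. It uses the same sysfs files, but reports which step fails and skips the unbind if the device already belongs to pciback. The function name and the optional second argument are my own additions; the second argument exists only so the logic can be tried against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Sketch: hand one PCI device over to pciback, reporting which step fails.
# assign_to_pciback BDF [SYSFS_ROOT]
# SYSFS_ROOT defaults to /sys (hypothetical parameter, for dry runs only).
assign_to_pciback() {
    bdf="$1"
    sysfs="${2:-/sys}"
    dev="$sysfs/bus/pci/devices/$bdf"
    pciback="$sysfs/bus/pci/drivers/pciback"

    [ -d "$dev" ] || { echo "no such PCI device: $bdf" >&2; return 1; }
    [ -d "$pciback" ] || { echo "pciback driver not present" >&2; return 1; }

    # Nothing to do if pciback already owns the device.
    current="$(basename "$(readlink "$dev/driver" 2>/dev/null)" 2>/dev/null)"
    [ "$current" = "pciback" ] && return 0

    # Detach whatever driver currently holds the device, if any.
    if [ -d "$dev/driver" ]; then
        echo "$bdf" > "$dev/driver/unbind" || { echo "unbind failed" >&2; return 1; }
    fi

    # Same two writes as in rc.local: register the slot, then bind it.
    echo "$bdf" > "$pciback/new_slot" || { echo "new_slot failed" >&2; return 1; }
    echo "$bdf" > "$pciback/bind" || { echo "bind failed" >&2; return 1; }
}

# Example (commented out):
# assign_to_pciback 0000:01:00.0
```

This does not explain why the boot-time configuration is ignored, it only makes the manual call easier to debug.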

FreeBSD storage domU

The storage domU is a pretty default install of FreeBSD10 to a 32GB LUN on the raid.

During install DHCP did not work ($colleague had also run into this issue) and so we just used a static IP… While the VM is called “freesd3” I also added a CNAME called “stor” for easier access.

The zpools are:

  • zroot (the VM itself)
  • zdata (SSD-only)
  • zdata2 (Disk fronted by SSD SLOG and L2ARC)

I turned on ZFS compression on most of those using the dataset names, e.g.:

zfs set compression=lz4 zroot/var

VMs can later access this using iSCSI or as a Xen block device (we’ll get to that later!)

Now, for the actual problem. During installation, the device IDs had shifted. On FreeBSD this is highly uncommon to see and you would *really* consider that a Linux-only issue. Well, not true.


We selected “mfid0”, which should have been the 32GB OS Lun…

This is what MegaCli shows:

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Size                : 32.0 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 1 (Target Id: 1)
Size                : 3.182 TB
Sector Size         : 512
State               : Optimal
Strip Size          : 64 KB
Number Of Drives per span:2
Virtual Drive: 2 (Target Id: 2)
Size                : 512.0 MB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 3 (Target Id: 3)
Size                : 16.0 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 4 (Target Id: 4)
Size                : 64.0 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2
Virtual Drive: 5 (Target Id: 5)
Size                : 630.695 GB
Sector Size         : 512
State               : Optimal
Strip Size          : 128 KB
Number Of Drives per span:2

Note that the logical drive ID and Target:Lun match up just fine!



The OS side:

Please compare to what FreeBSD’s mfi driver assigns…

mfid0: 32768MB (67108864 sectors) RAID volume (no label) is optimal
mfid1: 512MB (1048576 sectors) RAID volume (no label) is optimal
mfid2: 16384MB (33554432 sectors) RAID volume (no label) is optimal
mfid3: 65536MB (134217728 sectors) RAID volume (no label) is optimal
mfid4: 645832MB (1322663936 sectors) RAID volume (no label) is optimal
mfid5: 3337472MB (6835142656 sectors) RAID volume 'raid10data' is optimal

At install time it was cute enough to *drums* assign the 3.x TB LUN as mfid0. So we installed FreeBSD 10 on the LUN that stores my VMs.

That, of course, killed the LVM headers and a few gigabytes of data.
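One way to guard against this is to pick the install target by size rather than by name. A minimal sketch, assuming the mfi driver messages look like the lines quoted above; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: map a LUN size to its mfid device name using mfi driver messages.
# find_mfid_by_size_mb SIZE_MB   (reads dmesg-style lines on stdin)
find_mfid_by_size_mb() {
    awk -v want="${1}MB" '
        # Lines look like: "mfid0: 32768MB (67108864 sectors) RAID volume ..."
        $1 ~ /^mfid[0-9]+:$/ && $2 == want {
            sub(/:$/, "", $1)   # strip the trailing colon from the name
            print $1
        }
    '
}

# Example (commented out): which device is the 32GB OS LUN right now?
# dmesg | find_mfid_by_size_mb 32768
```

If two LUNs have the same size, both names are printed, so the sizes have to be unique for this to help.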


My next post will skip over reinstalling to the right lun (identified from live cd system) and instead describe how I went about getting the data back.


Part three: Storage migration, disaster recovery and friends

All posts:

What I had not expected was how hard it would be to decide on an actual solution.

 Picking a Hypervisor

For a lab I would need:

  • nested virt
  • high performance
  • low overhead to the same due to power etc.
  • easy cloning of vms and labs
  • flexible networking
  • easy scripting
  • wide storage options and easy migration
  • thin provisioning of some kind


If you know all the products and their drawbacks, it turns into a constant back-and-forth between the different hypervisors and ecosystems.



VMWare always sneaked back due to feature reliability and performance consistency and then got kicked back out for the lack of many features like API and storage migration w/o a full vCenter install.

I knew it would deliver a good (600-900MB/s-ish) performance under any circumstance, whereas e.g. Xen can be all over the place from 150 to 1900MB/s…

Another downside was that in VMWare my SolarFlare 5122 will definitely never expose the 256 VNICs. And I’d like to have em.

Installing MegaCli in ESXi is also a bit annoying.

On the pro side there’s the Cisco Nexus1000V and many other similar *gasp* appliances.

And, the perfect emulation. No “half” disk drivers. No cheapass BIOS.

In the end, I like to have my stuff licensed and to use the full power of a VMWare setup I’d need to go with vCenter + Enterprise Lic. No fun.



Just LOL.

While XenServer has great features for VM cloning it’s just not my cup of tea. Too much very bad Python code. Too many Windows-user kludges. Broken networking all over.

Any expectation of storage flexibility would be in vain, needing backporting and recompiling software to the dom0 kernel using their SDK. Definitely not an easy solution if you wanna be able to flip between iSCSI, Infiniband, md and whatever else *looks* interesting. This should be a lab after all, and I don’t see any chance running something like the Storwise VSA in this. Nested ESXi for that, and that’s not on the roadmap for XenServer. If anything still is.

It would probably work best for SolarFlare. I’ll admit that.



This is what will run in many VMs, but I don’t wanna break layering, so my underlying hypervisor and solution should not be the same as in the VMs. I am not yet sure if it’s the right decision.

This would be the prime KVM choice since they already deliver a well-tuned configuration.

What worries me is that, while MooseFS’ FUSE client scales well enough on a single hypervisor node, it would end up with a lot of additional context switching / thrashing if I use it on the main node and in the clients. There might be smarter ways around this, e.g. by having a fat global pool in the “layer1” hypervisor and using that from the above layers, too. More probably it’d turn into a large disaster 🙂



Pointless, no hypervisor, one single kernel instance can’t successfully pretend being a bunch of OSDs and clients 🙂


Plain Xen:

This is what I already have and went with, especially to make use of tmem and run the Ceph labs as paravirt domUs. This way I know nothing will get in the way performance wise.

There’s one thing you should know though, comparing Xen vs. ESXi or a licensed VMWare:

Xen’s powermanagement is brokenbrokenbroken:

  • Newer deep-idle CPU states are all unsupported
  • The utility to manage CPU power management is broken as well. Since 4.3 nothing works any more.
  • Even if you free + shutdown a core from dom0 it’ll not be put to sleep

You can definitely tell from the power intake and fan speed that Xen, even when idle, consumes more power than an idle Linux kernel would. Spinning up a PV domU has no impact, while spinning up an HVM one brings a noticeable increase in fan whoosh.

ESXi is far better integrated so I am expecting like 100 Euro (personal unfounded opinion) per year of additional energy wasted over VMWare.

My choice for Xen is mostly

  • the bleeding edge features like tmem
  • the really crazy stuff like vTPM and whatever of the cool features ain’t broken at any given time.
  • leverage any storage trick I want and have available in a (thanks to Alpine Linux) very recent Linux kernel
  • put in place ZFS, maybe in a dedicated driver domain
  • also be able to use MooseFS and last, but most interesting
  • all the things that never work on other hypervisors – CPU hotplug, dynamic ram changes…
  • storage domUs!!!!!


I think in a lab running 20-30 loaded VMs it will be crucial to optimize the memory subsystem.

Same goes for having the least possible CPU overhead, under load this will help.

Last, concurrently being able to use different storage techs means I can choose different levels of availability and speed – albeit not _having to_ since there’s a large SSD monster underneath it.

I’m also quite sure the disks will switch from Raid10 to Raid5. They just won’t see any random IO any more.

The “Raid5 is not OK” Disclaimer

Oh, and yes. Just to mention it. I’m aware I’m running green drives behind a controller. I know about Raid5 rebuild times (actually, they’re much lower on HW raid. About 30% of software raid) and the thing is…

If I see disk dropouts (yet to be seen), I’ll replace the dumb thing ASAP. It makes me cringe to read about people considering this a raid controller issue. If the damn disk can’t read a block for so long that the controller drops it out… Then I’m glad I have that controller and it did the right thing.

Such (block errored) disks are nice as media in secondary NAS storage or as doorstops, but not for a raid. Maybe I just hit extremely lucky in having no media errors at all off them? Definitely not what you’d see in a dedicated server at a mass hoster.

I’ve also patched my Check_MK Smart plugin to track the smart stats from the raid PDisks, so anything SMART notices I’ll immediately be aware of. Why the green disks in the first place? Well – power and noise benefits are huge. If I had some more space I’d consider a Raid6 of 8 of them, but not until I move to a bigger place.


Coming up next:

A colleague offered me some company when setting up a final storage layout.

We built a dedicated storage domU with PCI passthrough’ed MegaRaid controller and ZFS. The install had a little issue…

This is what the next posts will be about, one describing how to build a storage domU.

Also, what pitfalls to expect, and then a focus on losing data (sigh) and getting it back.

I’ll close with some lessons learned. 🙂

Part two: Storage migration, disaster recovery and friends

All posts:

 Go and find me a new host. Keep some money for foods.

So, in March and April I set out to build a *home* server that could handle a Ceph lab, and would behave mostly like real hardware. That equates to disks being slow, SSDs being fast, and RAM being, well, actual RAM. Writing to two disks should ideally also not immediately turn into an IO blender because they reside on one (uncached) spindle.

I think overall I spent 30 hours on eBay and in shops to find good hardware for a cheap price.


This is what I gathered:

  • Xeon 2680V2 CPU (some ES model) with 8 instead of 10 cores but same 25MB of cache. It’s also overclockable, should I ever not resist that
  • Supermicro X9SRL-F mainboard. There are better models with SAS and i350 NICs but I wanted to be a little more price-conservative there
  • 8x8GB DDR3 Ram which I recycled from other servers
  • 5x Hitachi SSD400M SSDs – serious business, enterprise SAS SSDs.
  • The old LSI 9260 controller
  • The old WD green disks

The other SSD option had been Samsung SM843T but their seller didn’t want to give out a receipt. I’m really happy I opted for “legit” and ended up with a better deal just a week later:

The Hitachis are like the big brother of the Intel DC S3700 SSD we all love. I had been looking for those on the cheap for like half a year and then hit lucky. At 400GB capacity each it meant I could make good use of VM cloning etc. and generally never look back to moving VMs from one pool to another for space.


I had (and still have) a lot of trouble with the power supply. Those Intel CPUs take very low power on idle, even at the first stage of the boot. So the PSU, while on the Intel HCL, would actually turn off after half a second when you had very few components installed. A hell of a bug to understand since you normally remove components to trace issues.

Why did I do that? Oh, because the Supermicro IPMI gave errors on some memory module. Which was OK but not fully supported. Supermicro is just too cheap to have good IPMI code.


Some benchmarking using 4(!) SSDs was done, and the results were incredible.

Using my LSI tuning script I was able to hit sustained 1.8GB/s writes and sustained 2.2GB/s reads.

After some more thinking I decided to check out Raid5 which (thanks to the controller using parity to calculate every 4th? block) still gave 1.8GB/s read and 1.2GB/s write.

Completely crazy performance.

To get the full Raid5 speed I had to turn on Adaptive Read Ahead. Otherwise it was around 500MB/s, aka a single SSD’s read speed.

One problem that stuck around was that the controller would / will not enable the SSDs’ write cache, no matter what you tell it!

This is a huge issue considering each of those SSDs has 512MB(ytes) of well-protected cache.

The SSD is on LSI’s HCL for this very controller so this is a bit of a bugger. I’ll get back to this in a later post since by now I *have* found something fishy in the controller’s output that might be the cause.

Nonetheless: Especially in a raid5 scenario this will have a lot of impact on write latency and IOPS.

Oh, generally: this SSD model and latency? not a large concern 🙂


Part one: Storage migration, disaster recovery and friends

This is the first post of a series describing recent changes I did, some data loss, recovering from it and evaluating damage.
All posts:


Starting point.

I am building a new Xen Host for my home lab. It was supposed to handle one or two full Ceph labs at high load. The old machine just couldn’t do that.


What I had was a Core2 Q6600 quadcore CPU on an Intel S3210 board (IPMI, yay). It had 8GB of Ram, an IBM M5015 Raid Controller and Dual Nics. For storage I had a Raid10 over 4x2TB WD Green drives fronted by a Raid0 Flashcache device built from two Samsung 830’s. Due to the old chipset the SSDs were limited somewhere around 730MB/s read/write speed.

The main problems came from the lack of CPU features (nested paging etc.) needed for advanced or bleeding-edge Xen features.

  • Memory overcommit using XenPaging only works if you have a more recent CPU than mine. (Of course this defeats the point since a more recent Xeon can handle enough RAM in the first place. But still)
  • The second thing was that PVH mode for FreeBSD needed a more recent CPU and last,
  • Nested Virt with Xen is getting somewhere which would be interesting for running ESXi or many Cloudweavers instances w/o performance impact

So, I couldn’t have many nice things!

Also I knew the consumer SSDs had too much latency for a highspeed cache.

For Ceph there was the added requirement of handling the Ceph Journals (SSD) for multiple OSDs and not exposing bottlenecks and IO variances from using the same SSD a dozen times.


I’m unhappy to replace the server when so far it never really went over 2-3% average CPU – but since I want to do A LOT more with Ceph and Cloudweavers it was time to take a step forward. I spent some time calculating how far the step could be and found that I would have to settle somewhere around ~1600 Euro for everything.

Zyxel NSA 325 supported WiFi adapters

Digging around in the sources I found (boot time… ) hotplug handling for two WiFi adapters.

So if you’d like to have WiFi with your NSA325, look for the two models mentioned here:

##### Check if ZyXEL NWD-211AN (0586/3418) is plugged
grep "Vendor=0586 ProdID=3418" /proc/bus/usb/devices > /dev/null 2>&1

##### Check if ZyXEL NWD-270N (0586/341A) is plugged
grep "Vendor=0586 ProdID=341a" /proc/bus/usb/devices
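To check from a shell whether some other stick carries one of these IDs, the same test can be wrapped in a small function. This is a sketch, not code from the firmware: the function name and the optional file argument are my own, and on kernels without the old usbfs the devices file lives elsewhere (typically /sys/kernel/debug/usb/devices).

```shell
#!/bin/sh
# Sketch: same check as the firmware snippet, as a reusable function.
# usb_device_present VENDOR PRODID [DEVICES_FILE]
usb_device_present() {
    vendor="$1"
    prodid="$2"
    devices="${3:-/proc/bus/usb/devices}"
    # -i so that 341A and 341a both match; -q because only the exit code matters.
    grep -qi "Vendor=$vendor ProdID=$prodid" "$devices" 2>/dev/null
}

# Example (commented out):
# usb_device_present 0586 3418 && echo "NWD-211AN is plugged in"
```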

Zyxel NSA 365 Packaging

Small update – a few hours later (so that’s where the evening has gone)

Important links:

You must install “FFP”, an extended package manager

I had to do it mostly the manual way, using zypkg -i on the ffp package.

Setting up build env and packaging:

I’ve managed to add NFS (official package) and then the build env.

Bacula-SD built after I also added a MySQL package for linking into “bscan”.

I created a package but it still lacks a “start” script – it’s named start, but it seems the standard bacula-ctl-sd will do the job.

Sadly, I should still make a better package for this, then also add a Check_MK agent package.


The performance of the NAS is as good as reported, I’m running an rsync over NFS (async,intr,soft,wsize=32768) with a constant 40-50MB/s. As a backup target this will definitely suffice.



The fan is not as loud as reported, so I postponed buying a Papst 612 FL fan for it.

I paid 109 Euro since I bought it in a local shop, online prices range as low as 72 Euros.

The build quality is uh… let me put it like this: in accordance with the price.

You unpack it, try to find how to open the door and then… front fell off.


One more small update since it’s so incredible:

Running FTP on a file on a USB3 stick attached to the front USB port.
So, I’m downloading from the NSA325 to my fileserver VM, which has a too tiny /dev/shm to fit the whole file.
First, I was quite happy seeing 88MB/s throughput. Then I figured “let’s set it to do 1MB readahead via /sys”.
Look at this:

ncftp /SanDisk-Extreme-00011 > get CentOS-6.2.img.bz2
CentOS-6.2.img.bz2: ETA: 0:04 104.32/577.56 MB 109.75 MB/s Local write failed after 211842400 bytes had been received: No space left on device.

109MB/s – so 1MB/s per Euro spent 😉
One could consider turning on Jumbo frames, but at that speed, who would I be to not be content?
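The 1MB readahead mentioned above is a per-device knob in sysfs. A minimal sketch of setting it; the device name sda is a placeholder (I don't list the real one here), and the function name and the optional third argument are assumptions added so it can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: set the block-layer readahead for one device.
# set_readahead_kb DEVICE KB [SYSFS_ROOT]
set_readahead_kb() {
    dev="$1"
    kb="$2"
    sysfs="${3:-/sys}"
    knob="$sysfs/block/$dev/queue/read_ahead_kb"
    [ -e "$knob" ] || { echo "no read_ahead_kb for $dev" >&2; return 1; }
    # The value is in kilobytes, so 1024 means 1MB of readahead.
    echo "$kb" > "$knob"
}

# Example (commented out; sda is a placeholder device name):
# set_readahead_kb sda 1024
```

The setting does not survive a reboot or a re-plug of the stick.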