Some upgrades are special


 

No Xen kernel

When upgrading from an older Alpine Linux, never forget that Xen itself has moved into the xen-*-hypervisor package. I didn’t catch that during the update, so I ended up with no hypervisor on the system.

Xen 4.6.0 crashes on boot

My experience: no, you don’t need to compile a debug Xen kernel and toolstack. No, you don’t need a serial console. No, you don’t need to attach it.

You need Google, to search for whatever random regression you hit.

In this case, if you set dom0_mem, it will crash instead of putting the memory in the unused pool: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810070

4.6.1 fixes that, but it isn’t in Alpine Linux stable so far.

So what I did was enable autoballoon in /etc/xen/xl.conf. That’s one of the worst things you can do, ever. It slows down VM startup, has NO benefit at all, and as far as I know it also increases memory fragmentation to the max. Which is lovely, especially considering that Xen doesn’t detect this NUMA machine as one, thanks to IBM’s chipset “magic”.

CPU affinity settings got botched

I had used a combination of the vcpu pinning / scheduling settings to make sure power management works while still dedicating 2 cores to dom0 for good performance. Normally, with dom0 VCPU pinning, you have a problem:

dom0 is being a good citizen, only using the first two cores. But all other VMs may also use those, breaking some of the benefits…

So what you’d do was use settings like this:

 

memory = 8192
maxmem = 16384
name   = "xen49.xenvms.de"
vcpus  = 4
cpus = [ "^0,^1" ]

That tells Xen this VM can have 4 virtual CPUs, but they’ll never be scheduled on the first two cores (the dom0 ones).

Yeah, except in Xen 4.6 no VM can boot like that.
The good news is that the plain string (non-list) syntax works:

cpus = "^0,^1,2-7"
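To make it easier to see which cores such a string leaves over, here is a small shell sketch that expands a CPU list into explicit core numbers. It is a simplified model and not Xen’s actual parser: it assumes all plain numbers and ranges are collected first and every ^N exclusion is removed afterwards. The function name expand_cpus is made up for this illustration.

```shell
# Hypothetical helper (not part of xl): expand a CPU list such as
# "^0,^1,2-7" into explicit CPU numbers.
# Simplified model: collect all included CPUs, then drop every ^N exclusion.
expand_cpus() {
    spec=$1
    include=""
    exclude=""
    # Split the list on commas and sort each token into include/exclude.
    for tok in $(printf '%s\n' "$spec" | tr ',' ' '); do
        case $tok in
            \^*)                      # exclusion, e.g. ^0
                exclude="$exclude ${tok#^}" ;;
            *-*)                      # range, e.g. 2-7
                n=${tok%-*}
                last=${tok#*-}
                while [ "$n" -le "$last" ]; do
                    include="$include $n"
                    n=$((n + 1))
                done ;;
            *)                        # single CPU, e.g. 5
                include="$include $tok" ;;
        esac
    done
    # Keep only the included CPUs that are not excluded.
    result=""
    for cpu in $include; do
        case " $exclude " in
            *" $cpu "*) ;;            # excluded, skip it
            *) result="$result $cpu" ;;
        esac
    done
    printf '%s\n' "${result# }"
}

expand_cpus "^0,^1,2-7"   # prints: 2 3 4 5 6 7
expand_cpus "0-3,5,^1"    # prints: 0 2 3 5
```

Spelling out the range 2-7 leaves no doubt about which cores the VM may use, at the price of having to know the number of cores in the box.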

 

IBM RSA2

IBM’s certificates for this old clunker are expired. Solution to access?

Use Windows.

Oh, and if the server is still in the BIOS after you configured the RSA module, it’ll NEVER display anything, even if you reset the ASM. You just get a white screen until you reset the server once more. The recommended fix is to reinstall the firmware, which also gets you that reboot.

 

Alpine Linux

My network config also stopped working. That’s the one part I’d like to change in Alpine – not using that challenged “interfaces” file from Debian, but instead something that is more friendly to working with VLANs, Bridges and Tunnels.

If you’re used to it day-to-day it might “look” just fine but that’s because you don’t go out and compare to something else.

Bringing up a bridge interface was broken, because one sysctl knob was no longer supported. So it tried to turn off ebtables there, that didn’t work and so, hey, what should we do? Why not just stop bringing up any interfaces that are configured and completely mess up the IP stack?

I mean, what else would make sense to do than the WORST possible outcome?

If this were a cluster with a lost quorum I’d even agree. But you can bet the Linux kids will run that into a split brain with pride.

 

I’ll paste the actual config once I’ve found out how to take WordPress out of the idiot mode and edit HTML. Since, of course, pasting to a <PRE> section is utterly fucked up.

 

I removed my bonding config from this to be able to debug more easily. The key points were to remove the echo calls and to make the pre-up / post-up parts more reliable.

auto lo
iface lo inet loopback

auto br0
iface br0 inet static
    pre-up brctl addbr br0
    pre-up ip link set dev eth0 up
    post-up brctl addif br0 eth0
    address your_local_ip
    netmask subnet_mask
    broadcast subnet_bcast
    gateway your_gw
    hostname waxh0012
    post-down brctl delif br0 eth0
    post-down brctl delbr br0
           
                    
# 1 Gbit link to the outside
auto eth0
iface eth0 inet manual
    up ip link set $IFACE up
    down ip link set $IFACE down

Xen VMs not booting

This is an annoying thing that happens with Alpine only. Some VMs get stuck at the grub menu and just require you to press Enter. Something in how grub parses the extlinux.conf doesn’t work, most likely the “timeout” or “default” lines.
And of course the idiotic (hd0) default vs (hd0,0) from the grub-compat wrapper.
I think you can’t possibly shout too loud at any of the involved people since this all goes to the “why care?” class of issues.
(“Oh just use PV-Grub” … except that has another bunch of issues…)

Normally I don’t want to bother reporting all the broken stuff I run into anymore. It’s just gotten too much; I would basically spend 2/3 of my day on that. But since this concerns a super-recent release of Alpine & Xen (and even some Debian), I figured I’d save people some of the PITA I spent my last hours on.
When I’m able to, I dump the fixes into my Confluence at this URL:
Adminspace – Fixlets

I also try really hard to not rant there 🙂

Update:
Natanael Copa reached out to me and let me know that the newer bridge packages on Alpine include a lot more helper scripts. With those, the icky settings from the config would not have been needed any more.
Another thing one can do is:

post-up my-command || echo "didnt work"

You should totally not … need to do that, but it helps.
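The reason it helps: a hook line that fails made the whole interface bring-up stop, as described above. With the || part, the line as a whole reports success even when the command itself fails. A minimal shell sketch, where false is only a stand-in for a failing my-command:

```shell
# 'false' stands in for a hook command that fails (hypothetical example).
false || echo "didnt work"
status=$?

# The failure was reported by echo, but the line as a whole succeeded.
echo "exit status seen by the caller: $status"   # prints: exit status seen by the caller: 0
```

The downside is that the failure is now only visible as a message, so a broken hook can go unnoticed if nobody reads the output.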
