As kids we would wonder, but we didn’t know


As kids we would wonder why no one treated us seriously, like grown-ups, like someone capable of reasoning.

But it’s not about that.

It’s about letting someone keep a glimpse of trust, looking back on a few years when they didn’t yet see wars starting and weren’t able to do anything to stop an escalation.

Families getting wiped out as a necessity (to someone, after all) on a mountain roadside because they happened to witness a spy’s assassination.

Prisons that have been given up to the inmates and just patrolled from the outside.

Watching your favourite artists die.

Hearing that a friend committed suicide.

Actually, people getting so deeply hopeless that they willingly crash their own airplane, wiping out whole school classes.

Undercover investigators who were the actual enablers of the Madrid subway bombing. Knowing how they’ll also forever be lost in their guilt, not making anything better.

Seeing how Obama’s goodbye gets drowned in humanity wondering if Trump had women pissing on <whatever> for money or not.

Then asking yourself why that even would matter considering T’s definitely a BAD PERSON so who cares what kind of sex he’s into, why can one’s private details take attention from the actual fact that he’s absolutely not GOOD?

Watching a favorite place be torn down for steel and glass offices.

Understanding what a burnt down museum means.

Life’s inevitable bits: being confronted with them only works if you’ve had a long, peaceful period in your life.

And that, that’s what you really just shouldn’t see or rather understand too early.

After all, there’s an age where we all tried to get toothpaste back into the tube, just because we wouldn’t believe it simply doesn’t work that way.

 

Sorry for this seemingly moody post; it’s really been cooking since that 2012 murder case. Today

 

 

On the pro side, there’s movies, Wong Kar-Wai and so many more. There’s art, and the good news is that we can always add more art, and work more towards the world not being a shithole for the generations after us.

But, seriously, you won’t be able to do much good if you look at the burning mess right from the start.

Some upgrades are special


 

No Xen kernel

Yeah, never forget when upgrading from an older Alpine Linux that the hypervisor itself moved into the xen-*-hypervisor package. I didn’t catch that on update, and so I had no hypervisor left on the system.
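If you got caught like me, getting it back is quick. A hedged example (the exact subpackage name may differ per Alpine release; apk search xen will tell you):

apk add xen xen-hypervisor     # pull the hypervisor image back in
ls /boot | grep -i xen         # check a xen image is present before rebooting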

Xen 4.6.0 crashes on boot

My experience: No, you don’t need to compile a debug Xen kernel + toolstack. No, you don’t need a serial console. No, you don’t need to attach it.

What you need is Google, to search for whatever random regression you hit.

In this case, if you set dom0_mem, it will crash instead of putting the memory in the unused pool: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810070

4.6.1 fixes that but isn’t in Alpine Linux stable so far.

So what I did was enable autoballoon in /etc/xen/xl.conf. That’s one of the worst things you can do, ever. It slows down VM startup, has NO benefit at all, and as far as I know also maximizes memory fragmentation. Which is lovely, especially considering Xen doesn’t detect this NUMA machine as one, thanks to IBM’s chipset “magic”.
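For reference, the workaround boils down to a single line; a hedged sketch per the xl.conf man page (and something to remove again once 4.6.1 lands):

# /etc/xen/xl.conf
# let xl balloon dom0 down on VM start instead of relying on dom0_mem
autoballoon="on"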

CPU affinity settings got botched

I had used a combination of the vcpu pinning / scheduling settings to make sure power management works, all while dedicating two cores to dom0 for good performance. Normally with dom0 VCPU pinning you have a problem:

dom0 is being a good citizen, only using the first two cores. But all other VMs may also use those, breaking some of the benefits…

So what you’d do is have settings like this:

 

memory = 8192
maxmem = 16384
name   = "xen49.xenvms.de"
vcpus  = 4
cpus = [ "^0,^1" ]

That tells Xen this VM can have 4 virtual CPUs, but they’ll never be scheduled on the first two cores (the dom0 ones).

Yeah, except in Xen 4.6 no VM can boot like that.
The good news is that a non-array syntax works:

cpus = "^0,^1,2-7"

 

IBM RSA2

IBM’s certificates for this old clunker are expired. Solution to access?

Use Windows.

Oh, and if the server is still sitting in the BIOS from configuring the RSA module, it’ll NEVER display anything, even if you reset the ASM. You need to reset the server once more, otherwise you get a white screen. The recommended fix is to reinstall the firmware, which also gets you that reboot.

 

Alpine Linux

My network config also stopped working. That’s the one part I’d like to change in Alpine – not using that challenged “interfaces” file format from Debian, but instead something that is friendlier for working with VLANs, bridges and tunnels.

If you’re used to it day-to-day it might “look” just fine but that’s because you don’t go out and compare to something else.

Bringing up a bridge interface was broken, because one sysctl knob was no longer supported. So it tried to turn off ebtables there, that didn’t work, and so, hey, what should we do? Why not just stop bringing up any of the configured interfaces and completely mess up the IP stack?

I mean, what else would make sense to do than the WORST possible outcome?

If this were a cluster with a lost quorum I’d even agree. But you can bet the Linux kids will run that into a split brain with pride.
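If you hit something like this, calling ifup by hand with the verbose flag at least shows which pre-up / post-up command bailed out. A hedged example:

ifup -v br0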

 

I’ll paste the actual config once I find out how to take WordPress out of idiot mode and edit HTML. Since, of course, pasting into a <PRE> section is utterly fucked up.

 

I removed my bonding config from this to be able to debug more easily. But the key points were to remove the echo calls and make the pre-up / post-up parts more reliable.

auto lo
iface lo inet loopback

auto br0
iface br0 inet static
    pre-up brctl addbr br0
    pre-up ip link set dev eth0 up
    post-up brctl addif br0 eth0
    address your_local_ip
    netmask subnet_mask
    broadcast subnet_bcast
    gateway your_gw
    hostname waxh0012
    post-down brctl delif br0 eth0
    post-down brctl delbr br0
           
                    
# 1 gbit to the outside
auto eth0             
iface eth0 inet manual      
    up ip link set $IFACE up    
    down ip link set $IFACE down

Xen VMs not booting

This is some annoying thing with Alpine only. Some VMs just require you to press Enter because they’re stuck at the grub menu. It’s something with grub parsing the extlinux.conf that doesn’t work, most likely the “timeout” or “default” lines.
And of course the idiotic (hd0) default vs (hd0,0) from the grub-compat wrapper.
I think you can’t possibly shout too loud at any of the involved people since this all goes to the “why care?” class of issues.
(“Oh just use PV-Grub” … except that has another bunch of issues…)
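For comparison, this is roughly what an Alpine guest’s extlinux.conf looks like; the kernel flavor names are hedged (they differ between releases), but the DEFAULT / PROMPT / TIMEOUT lines at the top are the ones to stare at:

# /boot/extlinux/extlinux.conf (inside the guest)
DEFAULT virt
PROMPT 0
TIMEOUT 10
LABEL virt
  KERNEL vmlinuz-virt
  INITRD initramfs-virt
  APPEND root=/dev/xvda1 modules=ext4 console=hvc0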

Normally I don’t want to bother anymore with reporting all the broken stuff I run into. It’s just gotten too much; basically I would spend 2/3 of my day on that. But since this concerns a super-recent release of Alpine & Xen (and even some Debian) I figured I’d save people some of the PITA I spent my last hours on.
When able, I dump these into my Confluence at this URL:
Adminspace – Fixlets

I also try really hard to not rant there 🙂

Update:
Nathanael Copa reached out to me and let me know that the newer bridge packages on Alpine include a lot more helper scripts. That way the icky settings from the config would not have been needed anymore.
Another thing one can do is

post-up my-command || echo "didnt work"

you should totally not … need to do that, but it helps.

Your network is probably owned – What to do next?


I’ll try to summarize my thoughts after the pretty shocking 31C3 talk.

The talk was this one: Reconstructing .Narratives.

This trip to 31C3 was meant to be a normal educational excursion but it is now just depressing. The holes the NSA & friends rip into the networks we are looking after are so deep it’s hard to describe.

Our democratic governments are using the gathered data for KILL LISTS of people, even assigning a “kill value”, as in how many people are legitimate to kill if it helps the matter. This is something I can’t yet fit into my head. The political and technical aspects are covered on Spiegel.de.

Note that the info there will be extended in 3 weeks since there will be another drop of info regarding malware aspects.

Personally, I’m not feeling well just over what I heard there and I’m grateful they didn’t come around to the malware list.

Now I’ll go ahead on the tech side and talk about what you should consider; we NEED to clean up our networks.

This is not a check list. It is a list to start from.

Your admin workstation:

  • Buy a new one. Install Qubes as per https://qubes-os.org/
  • If your box runs it nicely, submit it to their HCL.
  • I talked to Joanna before this shaking talk, and I’ll write about my “interview” at a later time.
  • Use the TOR VM or another box with Tails for your FW downloads.
  • I wish coreboot were actually usable; if you can help on that end, please do.

Point of Administration MATTERS

  • IPSEC VPN with preshared keys: Not safe
  • IPSEC VPN: Should be safe?
  • PPTP VPN: (Obviously) Not safe
  • SSH: VERY VERY questionable
  • ISDN Callback: Sorry, that was only safe before IP was the standard. And maybe not then

So basically, if your servers aren’t in the cloud but in your basement, THAT IS A GOOD THING.

Really sorry but it has to be said.

Re-keying:

  • Wipe your SSH host keys and regenerate them (see the sketch below)
  • Don’t use keys shorter than 4k bits.
  • Include the routers and other networking equipment.
  • Drop ALL your admin keys.
  • Regenerate them monthly.
  • Be prepared to re-key once we find out which SSH ECDSA-style option is actually safe.
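A minimal sketch for the host-key part, assuming stock OpenSSH paths:

cd /etc/ssh
rm ssh_host_*key ssh_host_*key.pub
ssh-keygen -t rsa -b 4096 -N "" -f ssh_host_rsa_key
ssh-keygen -t ed25519 -N "" -f ssh_host_ed25519_key
# restart sshd afterwards and roll the new fingerprints out to your known_hosts files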

SSH adjustments are now described very well at the following GitHub URL:
stribika – Secure Secure Shell

Passwords:

Change passwords!

This sounds funny and old, but since any connection you have ever made might get decrypted at a later time, you should consider them all compromised.
I think it would also be a good thing[tm] to have different passwords on the first line of jump hosts than on the rest of the systems.

Yes, keys seem safer. But I’ve been talking about passwords, which includes issues like keystroke timing attacks on password-based logins to systems further down the line.
Some of this of course applies to public keys too; i.e. don’t overly enjoy agent forwarding. I’d rather not allow my “jump host login” key on the inner ring of systems.
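A hedged sketch of what that can look like in ~/.ssh/config (host names and key file are placeholders; ssh -W needs a reasonably recent OpenSSH):

Host inner-box
    ProxyCommand ssh -W %h:%p jumphost
    ForwardAgent no
    IdentityFile ~/.ssh/id_inner    # a separate key, not the jump-host one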

Password management:

It seems the Password Safe tool from Bruce Schneier is rather safe; I’d move away from the “common” choices like KeePassX.

Info / Download: https://www.schneier.com/passsafe.html

Firmware:

Make BIOS reflashing a POLICY.

Random number generators:

Expect that you will need to switch them; personally I THINK you should immediately drop the comforts of haveged.

GnuPG

It was recommended more than once.

Start using it more and more, putting more stuff into it than you would have until today.

Switches and routers:

Your network is NOT your friend.

  • IP ACLs are really a good thing to consider, and they piss off intruders.
  • A good tool to set ACLs globally on your hardware is Google’s capirca. Find it at https://code.google.com/p/capirca/. Shorewall etc. is more on the “nice for a host” level. We have come a long way with host-based firewalls, but…
  • Think harder about how to secure your whole network. And how to go about replacing parts of it.

We can’t be sure which of our active LAN components are safe; your WAN probably IS NOT.

Clients

We really need to have PFS (perfect forward secrecy) more widespread.

Talk it over with your clients: how much ongoing damage is acceptable to keep helping the helpless XP users?

Guest WIFI

Do NOT run a flat home network.

Additions welcome, comment if you know something to *advance* things.

Bacula version clash between 5 and 7


This is the second time I’ve run into the error “Authorization key rejected by Storage daemon.”

It makes backups and restores impossible. Most traces / explanations on the internet will point at FD hostname or SD hostname or key mismatch issues.

That is of course always possible, but if you had it working until a week ago when you updated – please don’t let them discourage you. This error will also occur for any version 7 client connecting to a version 5 server. I’ve had it on my Macbook after running “port upgrade outdated” and just now on my FreeBSD desktop during a migration restore.

The jobs will abort after the client is asked to send/receive files.

Debug output of the storage daemon shows that this is in fact a client error!
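If you want to see the same thing on your setup, the storage daemon can be run in the foreground with a debug level. A hedged example (the config path is the usual default, adjust to yours):

bacula-sd -f -d 100 -c /etc/bacula/bacula-sd.conf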

The red herring, a Bacula error message saying

Authorization key rejected by Storage daemon

is completely wrong.

They just abstracted / objectified their logging a little too much. The SD received the error “client didn’t want me” and has to pass it on. Not helpful. Sorry guys 🙂

As a warning / example, here have a look at the log:

JobName: RestoreFiles
Bootstrap: /var/lib/bacula/mydir-dir.restore.1.bsr
Where:
Replace: always
FileSet: Full Set
Backup Client: Egal
Restore Client: Egal
Storage: PrimaryFileStorage-int
When: 2014-09-14 12:40:15
Catalog: MyCatalog
Priority: 10
Plugin Options: *None*
OK to run? (yes/mod/no): yes
Job queued. JobId=17300
*mess
14-Sep 12:40 waxu0604-dir JobId 17300: Start Restore Job RestoreFiles.
14-Sep 12:40 waxu0604-dir JobId 17300: Using Device "PrimaryFileDevice"
14-Sep 12:39 Egal JobId 17300: Fatal error: Authorization key rejected by Storage daemon.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_As[...]
*status client=Egal
Connecting to Client Egal at 192.168.xxx:9102

Egal Version: 5.2.12 (12 September 2012)  amd64-portbld-freebsd10.0
Daemon started 14-Sep-14 12:43. Jobs: run=0 running=0.
 Heap: heap=0 smbytes=21,539 max_bytes=21,686 bufs=50 max_bufs=51
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 
Running Jobs:
Director connected at: 14-Sep-14 12:43
No Jobs running.
====

As you saw the restore aborts while a status client is doing just fine.
The same client is now running its restore without ANY issue after doing no more than downgrading the client to version 5.

*status client=Egal
Connecting to Client Egal at 192.168.xxx.xxx:9102

Egal Version: 5.2.12 (12 September 2012)  amd64-portbld-freebsd10.0
Daemon started 14-Sep-14 12:43. Jobs: run=0 running=0.
 Heap: heap=0 smbytes=167,811 max_bytes=167,958 bufs=96 max_bufs=97
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 
Running Jobs:
JobId 17301 Job RestoreFiles.2014-09-14_12.49.00_41 is running.
      Restore Job started: 14-Sep-14 12:48
    Files=2,199 Bytes=1,567,843,695 Bytes/sec=10,812,715 Errors=0
    Files Examined=2,199
    Processing file: /home/floh/Downloads/SLES_11_SP3_JeOS_Rudder_[...]

All fine, soon my data will be back in place.

(Don’t be shocked by the low restore speed; my “server” is running the SDs off a large MooseFS share built out of $100 NAS boxes.
I used to have the SDs directly on the NAS and got better speeds that way, but I like distributed storage better than speed.)

Not enough desks to sufficiently bang your head on.


I’m not convinced. I have some LSI cards in several of my boxes and they
very consistently drop disks that are running SMART tests in the background.
I have yet to find a firmware or driver version that corrects this.

 

These are the kind of people I wish I never had to deal with.

(If not obvious: If your disks drop out while running SMART tests, look at the disks. There are millions of disks that handle this without issue. If yours drop out, they’re having a problem. Even if you think it doesn’t matter or even if it’s only showing with the controller. It doesn’t matter. Stuff it up.

I’m utterly sick of dealing with “admins” like this.)

LVM Mirroring #2


Hmm, people still look at my ages-old post about LVM all the time.

So, just a note from end-2013:

The mirror consistency stuff is not your worst nightmare anymore.

Barriers work these days, and I think it’s more important to concentrate on ext4 settings like “block_validity”. The chance of losing data due to an LVM mirror issue is much lower than the chance of unnoticed data loss in ext4 🙂
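block_validity is a plain ext4 mount option; a hedged fstab example with a made-up device and mount point:

/dev/vg00/lv_data   /data   ext4   defaults,block_validity   0 2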

My LVM pain points, as of today, would be:

lvm.conf is a huge patchwork of added features; there should be an LVM maintainer who oversees the structure as features are added.

Instead it’s like a castle with a lot of wooden gangways (mirrorlog devices) and stairs (thin provisioning) on the outside and no windows (read up on the “fsck” utility for thin pools, TRY what happens if a pool runs full and recover from it).

Some features require planning ahead, and the way it is now does not support that.

Reporting is still as bad as it used to be.

I’d be happy for someone to show me how they split out a snapshot + PV to a backup host, bring it back AND get a fast resync.

(Note: the PV UUID wouldn’t change in this. So, if it doesn’t work, that hints at design flaws.)

Those are the pieces I worry about. And really, the way the project is adding features without specs, documentation and (imho) oversight makes it look like some caricature of a volume manager.

How I feel about that:

Just look, I have added the feature the others were talking about.

And look at THIS: I now have an arm on my ass so I can scratch between my shoulders, too!

Example: LVM2 did away with the per-LV header that classic LVM had, so you don’t have a resource area to debug with, and there’s no support for BBR or per-LV mirror write consistency via the LV header. But instead they added an -optional- feature that wipes the start of an LV. So, if you lose your config and rebuild an LV manually on the exact same sectors, but with newer LVM config, it’ll wipe out the first meg of the LV.

A volume manager that, after the initial design, introduces a kind of LV format change and makes it WIPE DISK BLOCKS. I don’t care how smart you think you are: whoever came up with this should get the old Mitnick jail sentence: forbidden to use a computer.
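For what it’s worth, if you ever re-create an LV on top of data you intend to keep, newer lvcreate versions let you switch that zeroing/wiping off. A hedged sketch (VG/LV names are placeholders; check your lvcreate man page for --zero / --wipesignatures):

lvcreate -L 10G -n restored_lv --zero n --wipesignatures n vg00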

The bad layering of PV/LV/VG I also still care about.

Storage management in the sense I’m used to is something I still don’t touch with LVM.

On the other hand I’m itching daily to actually make prod use of those exact modern features 🙂

Basically I just use it to carve out volumes, and instead of pvmove I might use more modern, powerful tools like blocksync/lvmsync and work with new VGs.

Also, just to be clear:

I’m not saying “don’t use LVM” – I have it on almost every system and hate those without it. I’m just saying it’s not delivering the quality needed for consistently successful datacenter usage. If you set up your laptop with a 1TB root volume and no LVM and then have some disk-related issue, well, I’ll probably laugh and get you some Schnaps.

That being said, I wanted to write about more modern software that is actually fun to look at, next post 🙂

Splunk / Sumologic pricing just doesn’t work for me


I love Splunk, I really do. I’ve been using it since 2005 or so, and while I don’t always have a need for it, it has often allowed me to build the impossible, up to crazy stuff like squeezing gigs of carefully truncated logs over an ISDN line and then letting everyone do happy analysis on them locally.

I just wish there was a different pricing model for troubleshooting only.

What I’d want now:

  • scponly access to a -oneshot uploader
  • 20GB volume
  • 3 days retention
  • <$150 price tag at 20GB, and under $50 for <5GB

Setting up a temporary Splunk VM and Splunk, even manually, takes less than an hour. But larger sets of logs will immediately hit the volume limit.
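The oneshot part itself is a single CLI call once a throwaway Splunk is up; a hedged example with made-up paths:

/opt/splunk/bin/splunk add oneshot /tmp/customer-logs/messages -index main -sourcetype syslog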

The thing is:

I don’t want to be able to upload 500MB each day for free and store it forever, as the basic license allows – that equals adding another 185GB per year, btw.

I don’t want to be able to upload 20GB each day and store it for 30 days, as Sumo Logic offers for around <this much> money – that equals continuously keeping 600GB of data on their disks.

What I often need is something that takes less than 5 min to access. That means, get it set up and ready to push existing logfiles to. Then work on them. Then forget about them.

The last time, with Sumo Logic, I was faster installing a Splunk VM for a oneshot import than finding out how to feed them all the existing data and still have it nicely indexed. But then the volume was just a few hundred megs.

And then I wanna run a few smart queries, maybe for 1-3 days; 7 days would be cool to even show the customer the findings online. But the actual value drops once I’m done searching.

So: one upload of data (assuming it worked), not one each day. And no, I definitely don’t need 30 days of retention; after 30 days I don’t even want to remember I have the data there, all the more since most log data is sensitive and shouldn’t be stored for 30 days anyway.

Performance isn’t a big factor either; I’ll always get better performance right here – if I feel it helps I can just turn on a real server, and a powerful one, not the run-of-the-mill cloud SATA stuff. So the selling point is simply to let me start analyzing faster than I can now, and do it for less money.

The price limit is set by how long I really care about this data, plus it has to be less than what the few days of switching to Logstash would cost me.

The perfect features and speed of Splunk versus the freedom I’d get with Logstash (just think about using OpenNebula instances: spin up 4 fresh, dedicated instances each time I need to analyze logs – how long would that take, 5 minutes?)

Let’s say I’d spend $5000 to set up a “perfect” logstash lab env, then it’d do this job well for at least 5 years.

Let’s ignore I’d then also earn money by selling such setups.

That means $999 per year would be the top limit to get the benefits of having a readily available system – especially if it means setting up readonly accounts for the affected parties etc. and all those Splunk enterprise features.

Splunk would run around $12k for this. Sumo around $5000, because they all assume I’d be pushing data daily and wanna store it.

I don’t have to. It doesn’t work.

Why log monitoring is great


Just found this while preparing for some prod updates and debugging why my account is broken on two of the servers I inherited:

Oct  4 19:28:03 boxname updatething[60881]: [WARNING] no status file found (should only occur on first run)

Basically, even if it doesn’t say so, what it’s telling us is:

Sorry, I’m broken and I haven’t applied a single update since May 15th.

 

Just one silly check triggering on WARNING and they’d have known. This will be _hell_ to debug…

Working theory: this server slipped through at the last patch cycle and hasn’t gotten updates since then.
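A check for this doesn’t need to be smart. A hedged sketch of a Nagios-style plugin; the log path and the “updatething” tag are simply taken from the log line above:

#!/bin/sh
# warn if the updater logged a WARNING recently
if grep -q 'updatething\[.*\]: \[WARNING\]' /var/log/messages; then
    echo "WARNING - updatething reported a problem"
    exit 1
fi
echo "OK - no updatething warnings"
exit 0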

Helpful change for OMD


The Open Monitoring Distribution (OMD) allows you to have multiple “sites”, each consisting of configurable elements: a Nagios (or Icinga, Shinken, Check_MK Microcore) instance, an Apache webserver and other tools.

Each site can be started/stopped individually, allowing you to take them offline for maintenance or have them in a cluster for failover.

The main Apache on the system uses reverse proxies to let you access the “sites” and has always been able to tell you if a site isn’t started at the moment.

This is done via a 503 ErrorDocument handler in the file “apache-own.conf”. It’s a nice feature but has a huge drawback if you run a kiosk-mode browser for showing the monitoring dashboard on a TV or tablet (like I do).

Once that page is displayed you’re out. You’ll never see that the site is back up.

I know 3 cases where this commonly becomes an issue:

  • Bootup of Nagios server with local terminal
  • Cluster failovers
  • Apache dies

The second one is the most annoying:

  • You have a GUI displaying valid info.
  • one of the servers has a problem and it triggers a failover
  • autorefresh kicks in and you get dropped to the 503 page
  • Cluster failover finishes
  • but nothing gets you back in.

Now, the fix is so easy you won’t believe it:

In apache-own.conf of your site, change the following:

from:

<Location /sitename>
ErrorDocument 503 "<h1>OMD: Site Not Started</h1>You need to start this site in order to access the web interface."

to:

<Location /sitename>
ErrorDocument 503 "<META HTTP-EQUIV=\"refresh\" CONTENT=\"30\"><h1>OMD: Site Not Started</h1>You need to start this site in order to access the web interface."

Restart the system apache (/etc/init.d/apache2 restart for most of us) and it’ll work.
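To check that it took effect, fetch the 503 page while the site is stopped and look for the refresh tag; a hedged example (replace sitename with yours):

omd stop sitename
curl -s http://localhost/sitename/ | grep -i refresh
omd start sitename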

file under:

I tried to develop a dev mindset, but found I like it when stuff really works.

About Disk partitioning


So, you always wondered why one would have a dozen logical volumes or filesystems on a server? And how it brings any benefit?

Let’s look at this example output from a live system with a database growing out of control:

Filesystem    Size  Used  Avail  Capacity  Mounted on
/dev/da0s1a   495M  357M    98M       78%  /
devfs         1.0k  1.0k     0B      100%  /dev
/dev/da1s1d   193G  175G   2.3G       99%  /home
/dev/da0s1f   495M   28k   456M        0%  /tmp
/dev/da0s1d   7.8G  2.1G   5.1G       29%  /usr
/dev/da0s1e   3.9G  1.0G   2.6G       28%  /var
/dev/da2s1d   290G   64G   202G       24%  /home/server-data/postgresql-backup

I’ll now simply list all the problems that arise from this being mounted as one singular /home. Note it would just be worse with a large root disk.

  • /home contains my admin home directory. So I cannot disable applications, comment out /home in fstab, reboot and do maintenance. Instead, all maintenance on this filesystem will need to start in single-user mode.
  • /home contains not just the one PGSQL database with the obesity issues, it also holds a few MySQL databases for web users. Effect: if it really runs full, it’ll also crash the other databases and all those websites.
  • /home being one thing for all applications, I cannot just stop the database, umount, run fsck and change the root reserve from its default 8% – so there’s a whopping 20GB I cannot _get to_ (see the tunefs sketch after this list).
  • /home being one thing means I also can’t do a UFS snapshot of just the database with ease. Instead it’ll consist of all the data on this box, meaning it will have a higher change volume, leaving less time to magically copy this.
  • /home being the only big, fat filesystem also means I can’t just do fishy stuff and move some stuff out (oh and yes, there’s the backup filesystem. Except I can’t use it).
  • PostgreSQL being in /home, I cannot even discern the actual IO coming from it. Well, maybe DTrace could, but all standard tools that use filesystem-level metrics don’t stand a chance.
  • PostgreSQL being in /home instead of its own filesystem *also* means I can’t use a dd from the block device + fsck for the initial sync – instead I’ll run file-level using rsync…
  • It also means I can’t just stop the DB, snapshot, start it and pull the snapshot off to a different PC with a bunch of blazing fast low-quality SSDs for quick analysis.
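As referenced in the list above, the root-reserve part itself is simple once the filesystem can actually be unmounted. A hedged FreeBSD sketch, using the device from the df output:

umount /home                  # exactly the step that is not possible here
tunefs -m 2 /dev/da1s1d       # drop the root reserve from 8% to 2%
mount /home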

 

I’m sure I missed a few points.

Any of them is going to cause hours and hours of workarounds.

 

OK, this is a FreeBSD box, and one not using GEOM or ZFS – I don’t get many chances as it stands. So, even worse for me, this is one stupid bloated filesystem.

 

Word of advice:

IF you ever think about “why should I run LVM on my box”, don’t think just about the immediate advantages, or the puny overhead of growing filesystems as you need space. Think about what real storage administration (so, one VG with one LV in it doesn’t count) can do for you if you NEED it.

Simply snapshotting a volume, adding PVs on-the-fly, attaching mirrors for migrations… this should be in your toolbox, and this should be on your systems.
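Hedged examples of the kind of operations meant here; the VG, LV and device names are placeholders:

lvcreate -s -L 10G -n dbsnap /dev/vg00/dblv     # snapshot a volume before fishy work
vgextend vg00 /dev/sdc1                         # add a PV on the fly
lvconvert -m1 vg00/dblv /dev/sdc1               # attach a mirror leg for a migration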