OpenStacking



Notes from an IBM OpenStack workshop I attended.

  • I haven’t seen a single thing that is exciting once you’ve seen and used multiple solutions.

Many sensible features (e.g. deployment hosts, which Oracle VM had) are only just being added, and *lol* with the same naive approach. (Oh, let’s add a dedicated server for this. Oh, let’s run all deployment jobs in parallel there, so they thrash the disks and cripple the benefit it could have brought.)

  • I haven’t seen a single thing that is done much better than what’s in OpenNebula (and there it would be much much more efficient to work with)
  • There is a chance that OpenStack, with its different components, will be better at doing one thing and doing that one thing well: from what I’ve seen it has far fewer issues than OpenNebula “when something goes wrong”, but on the other hand everything is buried under a pile of APIs and services anyway.

So, from a bird’s-eye view: what you can do can hardly go wrong, but you also can’t really do a lot. Especially for people coming from VMWare, the current state of (open-source) affairs is insulting.

Some detail points:

Hybrid cloud: generally not considered workable, except for “extremely dumb” workloads like web serving etc. For those, most people will be better served with a CDN-type setup.

Some (cloud vendor) sales people are actually running around selling a hybrid cloud that looks like this: you/they add a backup Active Directory controller at their datacenter.

This of course is not AT ALL hybrid cloud or “bursting”, but it poses a problem: knowledgeable people saying “sorry, but a fully dynamic hybrid cloud is still tricky” will not be closing any sale. Instead the completely clueless sales drone will, since he promises that it will work. Since neither he nor the customer knows the original idea, this works out.

Why doesn’t it work so far?

API elasticity, including buying new VMs etc., was said to rarely work, much less so if bare-metal bring-up is involved (adding hosts to a remote cloud etc.).

Shrinking back down is also apparently a pretty ugly topic.

(The focus in this question was OpenStack to OpenStack bursting mostly)

Misunderstandings and expectations:

Can my VM move to the cloud, from vCenter to an OpenStack at the ISP?

General answer: no

General expectation: why not?

I wonder: why not just provide a good P2V tool (e.g. PlateSpin) so this is on the list?

Sadly, the relation between data lock-in (meaning safe revenues) and lack of workload portability did not come up as a topic.

This is a downward spiral: if you buy infrastructure, you can save some admin money. Yet that takes away the skill level you’d need to simply step over those portability restrictions. Any seasoned admin could plan and handle an iSCSI-based online migration from cloud A to cloud B.

But running off an IaaS (or MSP) platform, you might not have that as an in-house skill any longer.

Also tools that handle cloud federation DO exist, but are completely unknown.

Examples are Panacea and Contrail (not the Contrail related to SDN).

It has been around for much longer and probably works, but nobody knows of it.

Sad that so many millions were spent there, on the right thing, but ultimately nothing has come of it so far.

I think this would need unusual steps, e.g. for every 10M invested in OpenStack, 100K should go into marketing rOCCI / OCCI.

A nice hack was using OVF (a sick format nonetheless) to avoid cloud-init-style VM contextualization.

On the infrastructure front, it was shocking to see the networking stack in more detail (we worked through a “smaller” multi-tenant example with a few hundred VLANs). The OpenStack people decided to keep their distance from any existing standard (like QinQ, S-VLAN, PBB-TE) and instead made a large pile of shit with a lot of temporary / interconnecting VLANs / vswitches.

The greatest shit was seeing what they did for MAC addresses:

When Xen came out, the XenSource Inc. guys went to the IEEE and got their own MAC prefix, 00:16:3e.

Someone figured the best way was to just use fa:16:3e. Of course they didn’t register that.

He probably thought he was the cleverest guy in the universe, except he simply didn’t get it at all.

All cross-host connections are done using on-the-fly GRE tunnels, and all hosts are apparently fully meshed. I suppose this whole setup plus Open vSwitch is so inefficient it doesn’t matter any more?
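For illustration, this is roughly what every compute node ends up carrying on its tunnel bridge, one GRE port per peer host (bridge name and peer IPs are made up, not taken from the workshop setup):

ovs-vsctl add-port br-tun gre-1 -- set interface gre-1 type=gre options:remote_ip=10.0.0.2
ovs-vsctl add-port br-tun gre-2 -- set interface gre-2 type=gre options:remote_ip=10.0.0.3

Multiply that by (number of hosts - 1) on every host and you have the full mesh.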

There are other modes selectable, and it seems to me that flow-based switching will be less bullshit than what OpenStack does by default.

I hope they don’t find out about DMVPN.

Problems of datacenter extension and cross-site VLANs were treated as a non-issue.

Having a flat L2 seems to be oh so pretty. I am in fear.

What else did I dig into:

Rate limiting in this mess is a necessity, but it seems to be possible. Workable.

There are some hints at offloading intra-host switching when using Emulex CNAs or Mellanox. It seems not to be possible with Solarflare.

I’m pretty sure someone at Emulex knows how to do it. But it is not documented any place you could just find it.

Considering this would have a massive (positive) performance impact, it’s just sad.

Digging takeaway:

I would try to only use SR-IOV HBAs and ensure QoS is enforced at ingress (that means on the VM hosts, before customer traffic from a VM reaches the actual wires).
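For what it’s worth, plain Open vSwitch can already do that host-side enforcement; a minimal sketch (interface name and numbers are made up):

ovs-vsctl set interface vif-customer01 ingress_policing_rate=500000    # kbit/s the VM may push towards the wire
ovs-vsctl set interface vif-customer01 ingress_policing_burst=50000    # kbit of allowed burst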

Unanswered:

IP address assignments. One thing we didn’t get to was that creating the network required setting up IP ranges etc.

I’m not sold on the “IaaS needs to provide XXX” story at all.

In summary, I want to provide customers with a network of their own, optionally providing dhcp/local dns/load balancers/firewall/etc.

But by default it should be theirs – let me put it like this:

When you say “IaaS” I read infrastructure as a service. Not infrastructure services as a service. I’m sure a medium sized dev team can nicely integrate an enterprise’s IPAM and DNS services with “the cloud” but I doubt it will provide any benefit over using their existing management stack. Except for the medium sized dev team of course.

What I see is cloud stacks that are cluttered with features that bring high value to very small, startup-like environments (remember, e.g., that the average OpenStack install is <100 cores). It’s cool to have them, but the thing is: if you’re expecting people to use them, you’re doing it wrong. They’re trivial, puny and useless (i.e. “yes we can assign IPv4”, “yes we can assign IPv6” – but what happens if you ask about dual stack? Subnets?) and it’s becoming a bad joke to expect companies that do more than “something on the internet” to spend considerable time on de-integration of those startup convenience features.

Another interesting side note:

Softlayer is also Xen-based. That’s the cloud service that suddenly made IBM the number one in the market.

With Amazon, Rackspace, Linode, OrionVM and Softlayer using Xen, and a 9x% VMWare share in the enterprise market (which is probably a lot bigger than cloud), I’m once again puzzled at the hubris of the KVM community thinking they are the center of the universe. People tell me about oVirt / RHEV while it has NO RELEVANCE at all.

The only really cool KVM based place I know is Joyent. And they don’t even use Linux.

Oh, and, coming back to cloud, I’m still puzzled by the amount of attention Microsoft Azure gets in Germany. It seems the competitors (especially the higher-end ones like HP, IBM, Profitbricks, etc. who actually offer SLAs worth the name) simply can’t get a shot at the Microsoft-addicted SMB and medium enterprise crowd.

That said (enough ranting), they are cool to have in a demo-style setup like the one we played with.

IBM’s solution seems a nice middle ground – config adjustments are easily done, yet the deployment is highly automated and also highly reliable.

They’re going the right way, selling a rackful of servers with one USB stick to install the management server from. Wroooom[*].

Here’s your cloudy datacenter-in-a-box

P.S.: Wroooom[*] took a little over an hour. Pretty different from what I’m used to with CFEngine now.

P.P.S.: Links: Contrail http://contrail-project.eu/en_GB/federation and Panacea http://projects.laas.fr/panacea-cloud/node/31

Bacula version clash between 5 and 7


This is the second time I’ve run into the error “Authorization key rejected by Storage daemon.”

It makes backups and restores impossible. Most traces / explanations on the internet will point at FD hostname or SD hostname or key mismatch issues.

That is of course always possible, but if you had it working until a week ago when you updated – please don’t let them discourage you. This error will also occur for any version 7 client connecting to a version 5 server. I’ve had it on my Macbook after running “port upgrade outdated” and just now on my FreeBSD desktop during a migration restore.

The jobs will abort after the client is asked to send/receive files.

Debug output of the storage daemon shows that this is in fact a client error; the Bacula error message

Authorization key rejected by Storage daemon

is completely wrong. They just abstracted / objectified their logging a little too much. The SD received the error “client didn’t want me” and has to pass it on. Not helpful. Sorry guys :)
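If you want to confirm the version mismatch before chasing hostname or key issues, bconsole will show you who runs what: “version” prints the director version, and the two status commands show the FD and SD versions in their headers (client and storage names as used in the log below):

*version
*status client=Egal
*status storage=PrimaryFileStorage-int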

As a warning / example, here have a look at the log:

JobName: RestoreFiles
Bootstrap: /var/lib/bacula/mydir-dir.restore.1.bsr
Where:
Replace: always
FileSet: Full Set
Backup Client: Egal
Restore Client: Egal
Storage: PrimaryFileStorage-int
When: 2014-09-14 12:40:15
Catalog: MyCatalog
Priority: 10
Plugin Options: *None*
OK to run? (yes/mod/no): yes
Job queued. JobId=17300
*mess
14-Sep 12:40 waxu0604-dir JobId 17300: Start Restore Job RestoreFiles.
14-Sep 12:40 waxu0604-dir JobId 17300: Using Device "PrimaryFileDevice"
14-Sep 12:39 Egal JobId 17300: Fatal error: Authorization key rejected by Storage daemon.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_As[...]
*status client=Egal
Connecting to Client Egal at 192.168.xxx:9102

Egal Version: 5.2.12 (12 September 2012)  amd64-portbld-freebsd10.0
Daemon started 14-Sep-14 12:43. Jobs: run=0 running=0.
 Heap: heap=0 smbytes=21,539 max_bytes=21,686 bufs=50 max_bufs=51
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 
Running Jobs:
Director connected at: 14-Sep-14 12:43
No Jobs running.
====

As you can see, the restore aborts while a “status client” works just fine.
The same client is now running its restore without ANY issue after doing nothing more than downgrading the client to version 5.

*status client=Egal
Connecting to Client Egal at 192.168.xxx.xxx:9102

Egal Version: 5.2.12 (12 September 2012)  amd64-portbld-freebsd10.0
Daemon started 14-Sep-14 12:43. Jobs: run=0 running=0.
 Heap: heap=0 smbytes=167,811 max_bytes=167,958 bufs=96 max_bufs=97
 Sizeof: boffset_t=8 size_t=8 debug=0 trace=0 
Running Jobs:
JobId 17301 Job RestoreFiles.2014-09-14_12.49.00_41 is running.
      Restore Job started: 14-Sep-14 12:48
    Files=2,199 Bytes=1,567,843,695 Bytes/sec=10,812,715 Errors=0
    Files Examined=2,199
    Processing file: /home/floh/Downloads/SLES_11_SP3_JeOS_Rudder_[...]

All fine, soon my data will be back in place.

(Don’t be shocked by the low restore speed; my “server” is running the SDs off a large MooseFS share built out of $100 NAS boxes.
I used to have the SDs directly on the NAS and got better speeds that way, but I like distributed storage more than speed.)

No-copy extracting Xen VM tarballs to LVM


SUSE Studio delivers Xen VM images, which is really nice. They contain a sparse disk image and a (mostly incomplete) VM config file. Since I’m updating them pretty often, I needed a hack that saves on any unneeded copies and needs no scratch space either.

Goal: save copy times and improve life quality instead of copying and waiting…

First, let’s have a look at the contents, and then let’s check out how to extract them directly…

(Oh. Great. Shitbuntu won’t let me paste here)

 

Well, great.

In my case the disk image is called:

SLES_11_SP3_JeOS_Rudder_client.x86_64-0.0.6.raw

It’s located in a folder named:

SLES_11_SP3_JeOS_Rudder_client-0.0.6/

 

So, what we can do is this:

First, set up some variables so we can shrink the command later on…

version=0.0.6
appliance=SLES_11_SP3_JeOS_Rudder_client
url=https://susestudio.com/...6_64-${version}.xen.tar.gz
folder=${appliance}-${version}
vmimage=${appliance}.x86_64-${version}.raw
lv=/dev/vgssdraid5/lvrudderc1

Then, tie it together to store our VM data.

wget -O- $url | tar -O -xzf - ${folder}/${vmimage} | dd of=$lv bs=1024k

Storing to a file at the same time:

wget -O- $url | tee /dev/shm/myfile.tar.gz | tar -O -xzf - ${folder}/${vmimage} |\
dd of=$lv bs=1024k

 

Wget will fetch the file, write it to STDOUT, tar will read STDIN, only extract the image file, and write the extracted data to STDOUT, which is then buffered and written by the dd.

 

If, like me, you’ll reuse the image for multiple VMs, you can also write it to /dev/shm and, if RAM allows, gunzip it there. The gzip extraction is actually what limits performance, and even tar itself seems to be a little slow; I only get around 150 MB/s with this.
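A hedged sketch of that reuse path, using the variables from above (the uncompressed filename is just an example): stash the tarball uncompressed in /dev/shm once, so gzip is out of the loop for every further VM.

wget -O- $url | gunzip > /dev/shm/${appliance}.tar
tar -O -xf /dev/shm/${appliance}.tar ${folder}/${vmimage} | dd of=$lv bs=1024k
# repeat the tar | dd line with a different $lv for each further VM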

I do remember it needs to flatten out the sparse image while storing to LVM, but I’m not sure if / how that influences the performance.

 

(Of course none of this would be necessary if the OSS community hadn’t tried to ignore / block / destroy standards like OVF as much as they could. Instead OVF is complex, useless and unsupported. Here we are.)

Blackhat 2014 talks you should really really look at


This is my watchlist compiled from the 2014 agenda; many of these talks are important if you want to be prepared for current and future issues.

Great to see that there are also a few talks that fall more into the “defense” category.

 

Talks concerning incredibly big and relevant issues. I filed those under “the world is gonna end”.

The first two are worthy of that label and will hopefully wake up people in the respective design bodies:

  • CELLULAR EXPLOITATION ON A GLOBAL SCALE: THE RISE AND FALL OF THE CONTROL PROTOCOL
  • ABUSING MICROSOFT KERBEROS: SORRY YOU GUYS DON’T GET IT

Also annoying to horrible threats

  • EXTREME PRIVILEGE ESCALATION ON WINDOWS 8/UEFI SYSTEMS
  • A PRACTICAL ATTACK AGAINST VDI SOLUTIONS
  • BADUSB – ON ACCESSORIES THAT TURN EVIL
  • A SURVEY OF REMOTE AUTOMOTIVE ATTACK SURFACES

 Things that will actually help improve security practices and should be watched as food for thought

  • OPENSTACK CLOUD AT YAHOO SCALE: HOW TO AVOID DISASTER
  • CREATING A SPIDER GOAT: USING TRANSACTIONAL MEMORY SUPPORT FOR SECURITY
  • BUILDING SAFE SYSTEMS AT SCALE – LESSONS FROM SIX MONTHS AT YAHOO
  • BABAR-IANS AT THE GATE: DATA PROTECTION AT MASSIVE SCALE
  • FROM ATTACKS TO ACTION – BUILDING A USABLE THREAT MODEL TO DRIVE DEFENSIVE CHOICES
  • THE STATE OF INCIDENT RESPONSE

What could end our world five years from now:

  • EVASION OF HIGH-END IPS DEVICES IN THE AGE OF IPV6

note, memorize, listen to recommendations

  • HOW TO LEAK A 100-MILLION-NODE SOCIAL GRAPH IN JUST ONE WEEK? – A REFLECTION ON OAUTH AND API DESIGN IN ONLINE SOCIAL NETWORKS
  • ICSCORSAIR: HOW I WILL PWN YOUR ERP THROUGH 4-20 MA CURRENT LOOP
  • MINIATURIZATION

scada / modbus / satellites

  • THE NEW PAGE OF INJECTIONS BOOK: MEMCACHED INJECTIONS
  • SATCOM TERMINALS: HACKING BY AIR, SEA, AND LAND
  • SMART NEST THERMOSTAT: A SMART SPY IN YOUR HOME
  • SVG: EXPLOITING BROWSERS WITHOUT IMAGE PARSING BUGS
  • THE BEAST WINS AGAIN: WHY TLS KEEPS FAILING TO PROTECT HTTP

Don’t recall what those two were about

  • GRR: FIND ALL THE BADNESS, COLLECT ALL THE THINGS
  • LEVIATHAN: COMMAND AND CONTROL COMMUNICATIONS ON PLANET EARTH

Xen Power Management


Hi all,

this is a very hot week and the sun is beating down hard on my flat. Yet I’m not outside having fun: work has invaded this Sunday.

I ran into a problem: I need to run some more loaded VMs, but it’s going to be hotter than usual and I don’t wanna turn into a piece of barbeque. The only thing I could do was turn my Xen host’s powersaving features up to the max.

Of course I had to write a new article on power management in the more current Xen versions after that… :)

Find it here: Xen Power management - for current Xen.
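For a rough idea of what “powersaving to the max” boils down to on a current Xen, here is a hedged sketch (the exact commands, governors and caveats are in the linked article):

xenpm get-cpufreq-para                  # see which governors and C-states the host offers
xenpm set-scaling-governor powersave    # clock down whenever possible
xenpm set-max-cstate 3                  # permit sleep states up to C3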

When I saved it, I found I also have an older one (which I wasn’t aware of anymore) that covers the Xen 3.4 era.

Xen full powersaving mode - for Xen 3.x

 

 

 

Trivia:
Did you know those settings only take a mouse click in VMWare?

Check_MK support for Allnet 3481v2


A friend of mine has this thermometer and asked me to look into monitoring and setup.

I don’t think I ever put as much work into monitoring such a tiny device. Yesterday evening, and well into the night, I stabbed at it some more and finally completed the setup and documentation. I literally went to bed at 5am because of this tiny sensor.

To save others from this (and to make sure I have reliable documentation for it…), I’ve made a wiki article out of the pretty tricky setup. Along the way I even found that it still runs an old OpenSSL.

You can check it out here:

http://confluence.wartungsfenster.de/display/Adminspace/Monitoring+Allnet+3418v2

The Bitbucket version isn’t committed yet; I hope to do that in a moment… :p
One interesting hurdle was that I couldn’t build a check_mk package (using mkp list / mkp pack), since I also needed to include things from local/lib and similar folders. When I visit the MK guys again I’ll nag them about this.

 

 

They have really pretty meters in their UI by the way.

I hope something like it makes it to the NagVis exchange some day.

edit note: I initially wrote it has an “affected OpenSSL”. It seems they had built it back in 2012 without heartbeat, which is a nice and caring thing to do.
It’s still goddamn outdated.

Friday special: screenrc for automatic IRSSI start


Just wanted to share a little snippet.

This is my SSH+Screen config for my IRC box:

  • If I connect from any private system, I’ll get my irc window.
  • If it rebooted or something, the screen command will automatically re-create an IRSSI session for me.
  • If I detach the screen, I’m automatically logged out.
~$ cat .ssh/authorized_keys
command="screen -d -RR -S irc -U" ssh-[ key removed] me@mypc

The authorized_keys settings enforce running only _this_ command, and the screen options force-detach, force-reattach and, if needed, force-create a screen session named “irc” (in UTF-8 mode).

~$ cat .screenrc 
startup_message off
screen -t irssi 1 irssi

The screenrc does the next step by auto-running irssi in window 1, with the title set accordingly.
(And it turns off the moronic GPL notice)
Irssi in itself is configured to autoconnect to the right networks and channels, of course. (to be honest: Irssi config is something I don’t like to touch more than every 2-3 years.)

On the clients I also have an alias in /etc/hosts for it, so if I type “ssh irc”, I’ll be right back on irc. Every time and immediately.
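The client side of that is a one-liner (the address is made up, obviously):

echo "203.0.113.10   irc" >> /etc/hosts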

 

This is the tiny little piece of perfect world I was able to create, so I thought I’d share it.

FreeBSD periodic mails vs. monitoring


I love FreeBSD! Taking over a non-small infrastructure of around 75 FreeBSD servers was something I wouldn’t have wanted to pass up.

The problematic bit is that I only do consulting, not pure ops. But there wasn’t much of an ops team left…

Where they used to put around 10 man-days per week into the care and feeding of FreeBSD, plus some actual development, I’m now trying to do the same in one day. And I still want it to be a well-run, albeit slower, ship.

One of the biggest hurdles was the sheer volume of email.

Adding up Zabbix alerts (70% of which concern nothing), the FreeBSD periodic mails, cron outputs, and similar reporting, I would see weeks with 1500+ mails, or well into the higher thousands if there were any actual issues. Each week. Just imagine what it looked like when I didn’t visit my customer for three weeks…

Many of those mails have no point at all once you’re running more than -base:

The most typical example would be bad SSH logins. All those servers run software to block attackers and even feed that info back to a central authority and log there. So, why in hell would I want to know about malicious SSH connects?

Would you like a mail that tells you no hardware device has failed, today?

  • And another one every day until 2032?
  • From all servers?

This makes no sense.

Same goes for the mails that tell me about necessary system updates.

What I’ve done so far can be put into these 3 areas:

1. Periodics:

Turn off as many of the periodic mails as possible (i.e. anything that can be seen by other means). I tried to be careful with it, but that didn’t really work out. My periodic.conf looks like this now:

freebsd periodic.conf
I found that turning off certain things like the “security mail” also disables the portaudit DB updates. But I just changed my portaudit call to include the download. Somehow I had assumed that *update* would be separate from *report*.
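Since the actual file sits behind the link above, here is just a hedged sketch of the kind of knobs involved (variable names quoted from memory of /etc/defaults/periodic.conf and the portaudit port, so double-check them before copying):

daily_show_success="NO"                       # only mail me when something is actually wrong
daily_show_info="NO"
security_show_success="NO"
security_status_loginfail_enable="NO"         # the blocker + central logging already cover bad SSH logins
daily_status_security_portaudit_enable="NO"   # replaced by an explicit "portaudit -Fda" call that also fetches the DB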

2. Fix issues:

Apply fixes for any bugs that really are just that, bugs. At least if I can figure out how to fix them. More often than not I’ll hit a wall somewhere between the NIH config management and bad Perl code.

3. Monitor harder, but also smarter:

Put in better monitoring, write custom plugins for anything I need (OpenSSH Keys, Sendmail queues, OS Updates) and set thresholds to either a baseline value for “normal” systems or to values derived from peak loads for “busy” systems.

Some of the checks are to be found at my bitbucket, and honestly, I’m still constantly working on them.

https://bitbucket.org/darkfader/nagios/src/cc233b93c106166a5494d7488c38880df0a5946b/check_mk/freebsd_updates/?at=default

The checked-in version might change quite often. For example, I now think it won’t hurt to have a stronger separation between reporting for OS and ports issues. And maybe a check that tells me whether a system still needs a reboot.
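A hedged sketch of what such a reboot check could look like as a check_mk local check (it assumes FreeBSD 10’s freebsd-version tool and is not what’s in the repository):

#!/bin/sh
# compare the kernel installed on disk with the one actually running
installed=$(freebsd-version -k)
running=$(uname -r)
if [ "$installed" != "$running" ]; then
    echo "1 Reboot_pending - running $running, installed $installed"
else
    echo "0 Reboot_pending - kernel $running is current"
fi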

The most current area now is automating the updates.

I’m taming the VMWare platform and using some Pysphere code to create VM snapshots on the fly. So there’s an Ansible playbook that pulls updates. It then checks whether there is a mismatch between the version reported by uname -a and the “tag” file from freebsd-update. If so, it triggers a VM snapshot and then the install / reboot.

Another piece of monitoring does a grep -R -e "^<<<<<" -e ">>>>>" /etc and thereby alerts me to unmerged files.
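That grep also fits nicely into a local check; a minimal sketch (the item name is made up):

#!/bin/sh
# flag leftover merge conflict markers from mergemaster / etcupdate under /etc
if grep -qR -e "^<<<<<" -e ">>>>>" /etc 2>/dev/null; then
    echo "2 etc_unmerged - conflict markers found under /etc"
else
    echo "0 etc_unmerged - no conflict markers under /etc"
fi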

I try to make do with tiny little pieces and have everything be a dual-use (agriculture and weapons, you know) technology that gives me both status reporting and status improvement.

I started a howto about the specifics of what I did in monitoring; see
FreeBSD Monitoring at my adminspace wiki.

Ansible FreeBSD update fun…


Using Ansible to make my time at the laundry place more interesting…

 

[me@admin ~/playbooks]$ ansible-playbook -i hosts freebsd-updates.yml

PLAY [patchnow:&managed:&redacted-domain:!cluster-pri] *************

GATHERING FACTS ****************************************************
ok: [portal.dmz.redacted-domain.de]
ok: [carbon.dmz.redacted-domain.de]
ok: [irma-dev.redacted-domain-management.de]
ok: [lead.redacted-domain-intern.de]
ok: [polonium.redacted-domain-management.de]
ok: [silver.redacted-domain-management.de]
ok: [irma2.redacted-domain-management.de]
ok: [inoxml-89.redacted-domain-management.de]

TASK: [Apply updates] **********************************************
changed: [inoxml-89.redacted-domain-management.de]
changed: [carbon.dmz.redacted-domain.de]
changed: [portal.dmz.redacted-domain.de]
changed: [irma-dev.redacted-domain-management.de]
changed: [lead.redacted-domain-intern.de]
changed: [polonium.redacted-domain-management.de]
changed: [silver.redacted-domain-management.de]
changed: [irma2.redacted-domain-management.de]
 finished on lead.redacted-domain-intern.de
 finished on portal.dmz.redacted-domain.de
 finished on silver.redacted-domain-management.de
 finished on inoxml-89.redacted-domain-management.de
 finished on carbon.dmz.redacted-domain.de
 finished on polonium.redacted-domain-management.de
 finished on irma-dev.redacted-domain-management.de
 finished on irma2.redacted-domain-management.de

TASK: [Reboot] ****************************************************
changed: [carbon.dmz.redacted-domain.de]
changed: [portal.dmz.redacted-domain.de]
changed: [inoxml-89.redacted-domain-management.de]
changed: [irma-dev.redacted-domain-management.de]
changed: [lead.redacted-domain-intern.de]
changed: [polonium.redacted-domain-management.de]
changed: [silver.redacted-domain-management.de]
changed: [irma2.redacted-domain-management.de]

TASK: [wait for ssh to come back up] *******************************
ok: [portal.dmz.redacted-domain.de]
ok: [irma-dev.redacted-domain-management.de]

I now use a “patchnow” group as a decision gate because, *surprise*, I don’t want to snapshot and patch all systems at once.

Quite annoying that the most fundamental admin decisions are always really tricky to put into automation systems (written by devs). Also, I’ll need to kick my own ass, since the playbook didn’t trigger the snapshots anyway!

For the long-term solution I think I’ll first define a proper policy based on factors like these:

  • How mature the installed OS version & patches are (less risk of patching)
  • How exposed the system is
  • The number of users affected by the downtime
  • The time needed for recovery

What factors do you look at?

Not enough desks to sufficiently bang your head on.


I’m not convinced. I have some LSI cards in several of my boxes and they
very consistently drop disks that are running SMART tests in the background.
I have yet to find a firmware or driver version that corrects this.

 

This is the kind of people I wish I never had to deal with.

(If not obvious: If your disks drop out while running SMART tests, look at the disks. There are millions of disks that handle this without issue. If yours drop out, they’re having a problem. Even if you think it doesn’t matter or even if it’s only showing with the controller. It doesn’t matter. Stuff it up.

I’m utterly sick of dealing with “admins” like this.)