Jinn Koriech's Blog: 2010

Tuesday, December 28, 2010

LXC Linux Containers, Ubuntu & udev

I recently started using Linux containers instead of Xen virtualization. It's not a fully mature setup yet, but I prefer the approach for my needs. Plus, with the evolving cgroups feature in the kernel, it's shaping up to be an efficient way to run multiple independent environments without the overhead of full virtualization. For example, IIRC fewer context switches are required to access the network when using LXC.

My base host runs Debian Squeeze (in testing as of this writing), with Debian Lenny, Ubuntu Lucid, and Gentoo as guest systems. The Debian Squeeze installer works well for Lenny and Lucid, but the Ubuntu folks haven't taken the necessary steps to make Ubuntu play nice in a container.

One main glitch I found with Ubuntu Lucid was that during the regular system upgrades I received a new udev package, which started causing problems with dpkg. Essentially we don't want to have udev in the guest since the host deals with the /dev/ filesystem. If your container is set up with a default deny on the dev fs, then you'll have seen the below errors:

Setting up udev (151-12.2) ...
mknod: `/lib/udev/devices/ppp': Operation not permitted
dpkg: error processing udev (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of plymouth:
 plymouth depends on udev (>= 149-2); however:
  Package udev is not configured yet.
dpkg: error processing plymouth (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 udev
 plymouth
E: Sub-process /usr/bin/dpkg returned an error code (1)
A package failed to install.  Trying to recover:
Setting up udev (151-12.2) ...
mknod: `/lib/udev/devices/ppp': Operation not permitted
dpkg: error processing udev (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of plymouth:
 plymouth depends on udev (>= 149-2); however:
  Package udev is not configured yet.
dpkg: error processing plymouth (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 udev
 plymouth

In this error message we see that the udev.postinst script is trying to create a device node with mknod (here /lib/udev/devices/ppp), which the container is not permitted to do and which we don't want a guest doing anyway.

There is probably a more graceful way to fix this, but for now I'm quite happy to hack it outta my way by editing /var/lib/dpkg/info/udev.postinst and putting an exit 0 before anything else is done in the script. Once that's done just reconfigure it and it should work:

# dpkg --configure udev
# dpkg --configure plymouth
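
The postinst edit described above can also be scripted. The following is a minimal sketch, assuming GNU sed and a postinst that starts with a shebang line. The function name neuter_postinst is made up for illustration and is not part of dpkg. Note that dpkg replaces the postinst when the package is upgraded, so the edit has to be repeated after each udev update.

```shell
#!/bin/sh
# Insert "exit 0" right after the first line (the shebang) of a maintainer
# script so that the rest of the script never runs.
# neuter_postinst is a hypothetical helper name, not a dpkg tool.
neuter_postinst() {
    script="$1"
    # Do nothing if a previous run already added the line.
    if [ "$(sed -n '2p' "$script")" = "exit 0" ]; then
        return 0
    fi
    sed -i '1a exit 0' "$script"    # GNU sed: append a line after line 1
}

# Usage (as root, inside the container):
#   neuter_postinst /var/lib/dpkg/info/udev.postinst
#   dpkg --configure udev plymouth
```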

Wednesday, November 10, 2010

Using AutoFS to mount EncFS over CIFS

Background

My hosting provider offers backup space accessible using FTP, SSH/SFTP, and CIFS. I want to make use of this space, but I don't want to put potentially sensitive data on there unencrypted, so I went looking for a solution.

I considered using LUKS, but it's not ideal because in this case I'd have to create a disk image on the remote storage which would be remounted via a loop device locally. In addition it would be complicated to resize the remote image.

I found EncFS to be a suitable alternative. It provides an encrypted filesystem in user-space, and works on file level instead of on a block level. This way there's no need to determine your block device size in advance.

I won't go into how to set up an EncFS mount as there's plenty of documentation out there for this.

First round: EncFS over CurlFTPfs

Initially I thought the hosting provider only gave FTP access to the backup space, so I went about stacking EncFS on top of CurlFTPfs - a most inelegant solution. It appeared to work, but I found that running an rsync through all the layers eventually caused problems. I tried limiting the rsync bandwidth with the --bwlimit option, but it only delayed the problem. I guessed the culprit was CurlFTPfs, so I did some more digging.

Second round: EncFS over CIFS/Samba

I found there was also CIFS (Samba) access to the backup space, so I tried out the same EncFS on top of the CIFS mount and it worked great. No more problems. So I wanted to automate this as much as possible, removing the extra logic to make sure the CIFS mount is in place before mounting the EncFS space.

AutoFS

Unfortunately this is where the bad news comes (although there is also good news later). I tried various approaches but was unable to get EncFS to mount via AutoFS, although it does work through fstab. This appears to be due to how EncFS and AutoFS manage creation of the mount point.

So the current setup is to use AutoFS to mount the CIFS share, and to mount the EncFS share with its own idle handling logic.

The relevant configs are:
/etc/auto.master

/net  /etc/auto.hetzner  --timeout=150

/etc/auto.hetzner

ht-backup-raw  -fstype=cifs,credentials=/etc/backup/samba.auth  ://unique_id.your-backup.de/backup

EncFS

Creating an automated way to mount the EncFS space also required some hackery. There is an option to use an external password program, however this can't be referenced in /etc/fstab, so an additional FUSE wrapper script needs to be created. The wrapper receives 4 arguments. The script I've written is not perfect, but it's enough for the task:

/usr/local/sbin/encfs-extpass

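#!/bin/bash
# FUSE mount wrapper, invoked with: <source> <mountpoint> <arg> <options>

SOURCE="$1"
MOUNTPOINT="$2"
OPTIONS="$4"    # $3 is not used; it is expected to be the "-o" flag

if echo "$OPTIONS" | grep -q 'autofs'; then
       # Experimental autofs branch (this never worked for me)
       MOUNTPOINT_PATH="/$(basename "$MOUNTPOINT")"
       OPTIONS="${OPTIONS/autofs,/}"
       encfs --extpass="/etc/encfs${MOUNTPOINT_PATH}.sh" "$SOURCE" "$MOUNTPOINT_PATH" -o "$OPTIONS"
else
       # Regular fstab branch: password comes from /etc/encfs<mountpoint>.sh
       encfs --ondemand --idle=1 --extpass="/etc/encfs${MOUNTPOINT}.sh" "$SOURCE" "$MOUNTPOINT" -o "$OPTIONS"
fi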

You can see my feeble attempt in there to debug and mount via autofs, but it didn't work. I leave it there for others to try if they wish.

The final bit of the magic is:

/etc/fstab

encfs-extpass#/net/ht-backup-raw  /ht-backup  fuse  defaults  0 0

Here I have a line in /etc/fstab that uses FUSE to call my custom script, which in turn mounts my EncFS space, but without an interactive prompt. This is because I have a script at /etc/encfs/ht-backup.sh that echoes the password, which is all encfs needs. I make sure the password script is chmod 400 and owned by root, so my password is relatively safe.

Conclusion

Now when the system comes up the EncFS space is mounted automatically. I haven't tested the idle unmounting thoroughly yet, but in theory it should work.

Thursday, September 23, 2010

Rolling upgrades of a gentoo system

I have a gentoo system I built back in 2000 or 2001 that has served many purposes in my home. It started out as my main workstation, then when I moved it became a remote server, and now it's back in my home as a media center/ltsp server/wan router/nas/etc.

I've always kept it on mirrored RAID and migrated the disks between motherboards for upgrades, but essentially it's the same old installation from 10 years ago. About once every year or two I do a full emerge world, and I've just completed one today.

Overall I started the upgrade about a week ago and have learned some useful practices this time around.

I use a couple of laptops with distcc, which I boot via LTSP so I can easily keep the glibc and gcc versions in line between systems. These systems improve the build times massively since my main system is a fanless Via C7 at 1.2GHz.

The first step is to upgrade any critical services independently so you can keep an eye on them. When doing this you should emerge --newuse --deep --update (atom) to ensure as many of the related packages as possible are rebuilt too.

I also like to use a few other flags for convenience: emerge -va --newuse --deep --update --tree --keep-going --jobs=4 (atom).

The --tree flag is a cosmetic addition so I can visualize the dependencies of the build plan before approving it (-va).
The --keep-going flag carries on with the remaining packages when one fails to build, so as much as possible gets done.
The --jobs=4 flag allows multiple non-dependent packages to be merged simultaneously - this can speed up things. I also tried using the --load-average=6.0 setting, but this was causing my distcc slaves to block compiles - I suspect because the master NFS server was too busy coordinating the compiles.

Some caveats I faced while upgrading were:

The dev-lang/mono package is buggy. I was moving from dev-lang/mono:1 to dev-lang/mono:2 and it kept failing to compile. It turned out that this is a known issue and compiling a new mono instance will make use of a pre-existing installation causing the failure. The workaround is to unmerge the old one before emerging the new one!
The media-libs/libcanberra package doesn't like to be distributed or run with anything greater than MAKEOPTS="-j1".
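
The two workarounds above can be written down as commands. This is a sketch only: the run wrapper and the DRY_RUN variable are my own additions so that the commands are printed rather than executed by default, and passing FEATURES=-distcc to keep libcanberra off the distcc hosts is an assumption about how one would switch off distribution for a single package.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
# run, upgrade_mono and build_libcanberra are hypothetical helper names.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# mono: remove the old installation first so the new build cannot pick it up.
upgrade_mono() {
    run emerge --unmerge dev-lang/mono &&
    run emerge --oneshot dev-lang/mono
}

# libcanberra: build locally (no distcc) with a single make job.
build_libcanberra() {
    run env FEATURES=-distcc MAKEOPTS=-j1 emerge --oneshot media-libs/libcanberra
}
```

Calling upgrade_mono or build_libcanberra with DRY_RUN unset just shows what would be run, which is handy for a final check before committing to a long build.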

Overall with these tips it should be possible to do a clean world update in less than a week ;)

Wednesday, April 28, 2010

Upgrading iDRAC firmware (Dell IPMI)

Upgrading the Dell iDRAC (IPMI/BMC) firmware on a non-RedHat system is a painful experience.
DISCLAIMER: This process bypasses all the checks, license agreements, notes and anything else meant to protect you from yourself. Only carry out these steps if you really do know what you're doing!

UPDATE: These instructions aren't currently working on Debian Squeeze AMD64.

Right - so here are the steps:

Download the latest firmware from support.dell.com
Prepare a Debian sub-environment. For instructions see http://anothersysadmin.wordpress.com/2008/02/22/howto-install-dell-openmanage-system-administrator-on-exotic-linux-distributions/
1. The instructions are for a Debian Etch 4.0 environment. You can easily use those instructions to build the latest Debian environment by substituting etch for the latest distro, e.g. lenny.
2. Take a backup of your new debian sub-environment. It'll save your life when least expected while working with commercial, so-called "open" tools, that don't work anywhere except on RedHat.
3. You can clear out a few unused packages as detailed after the chroot command below. You can also clear out downloaded package files before packaging your environment for re-use.

Ensure a few mounts are in place within your Debian sub-environment:

mount -o bind /dev /debian/dev
mount -t proc none /debian/proc
mount -t sysfs none /debian/sys

Make sure the necessary modules are loaded for the actual flashing:
```
modprobe ipmi-devintf
modprobe ipmi-si
```

Enter the Debian sub-environment

chroot /debian /bin/bash

At this point you can purge a few unnecessary apps so they don't interfere with your main environment, and to save a bit of space:

# aptitude purge bsd-mailx cron exim4 exim4-base exim4-config \
exim4-daemon-light iptables klogd laptop-detect logrotate \
sysklogd tasksel tasksel-data

And add a few useful tools: sysvconfig provides RedHat-like service controls; vi(m), well, enough said..

# aptitude install sysvconfig vim
UPDATE: if you're running a 64bit system you'll also need to install 32bit support:
# aptitude install ia32-libs rpm

Finally clean out unnecessary package files from /var/cache/apt/.
```
# aptitude clean
```

Unpack the new firmware package and apply the firmware update (it's a good idea to do this in a screen session to avoid being interrupted):
```
# screen -R -DD
# ./IDRAC6_FRMW_LX_R257033.BIN --extract /tmp/idrac
# cd /tmp/idrac
# ./bmcfwul -i=payload/firmimg.d6
```

The final command, bmcfwul, can take a long time and may appear to hang. Be patient - it will report back eventually.
For this to work, the ipmi-devintf and ipmi-si kernel modules must be loaded in your main OS instance, outside the chroot.
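
A quick way to confirm the modules are present before flashing is to look them up in /proc/modules. This is a minimal sketch: module_loaded is a made-up helper name, and it assumes the IPMI drivers are built as modules (drivers compiled into the kernel do not show up in that file).

```shell
#!/bin/sh
# Report whether a kernel module appears in a /proc/modules-style listing.
# The kernel lists module names with underscores, so "ipmi-si" is looked
# up as "ipmi_si". The optional second argument only exists so the
# function can be tried against a sample file.
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    modules_file="${2:-/proc/modules}"
    grep -q "^${name} " "$modules_file"
}

# Usage on the host, outside the chroot:
#   for m in ipmi-devintf ipmi-si; do
#       module_loaded "$m" || modprobe "$m"
#   done
```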

That's it - you should now have an upgraded firmware.

Note that you must unmount /debian/dev, /debian/proc and /debian/sys before packaging.

# umount /debian/dev
# umount /debian/proc
# umount /debian/sys
# tar -vcjf /tmp/debian.tbz2 /debian

Monday, March 22, 2010

MySQL table & index sizes

This almost works for getting the table stats - needs a tweak to correctly report database and index space usage, but don't have time to check that now.

select  
table_name, engine, table_rows as tbl_rows, avg_row_length as rlen,  
floor((data_length+index_length)/1024/1024) as allMB,  
floor((data_length)/1024/1024) as dMB,  
floor((index_length)/1024/1024) as iMB  
from information_schema.tables  
where table_schema=database()  
order by (data_length+index_length) desc;

Thursday, February 4, 2010

Grow linux md raid5 with mdadm --grow

Growing an mdadm RAID array is fairly straightforward these days. There are a few limitations, depending on your setup, and I strongly recommend you read the mdadm man page in addition to the notes here.

A couple of the limitations include:

raid arrays in a container can not be grown, so this excludes DDF arrays
arrays with 0.9x metadata are limited to 2TB components - the total size of the array is not affected though

Before you start it's a good idea to run a consistency check on the array. Depending on the size of the array this can take a looong time. On my 3 x 1TB RAID5 array this usually takes around 10 hours with the default settings. You can explore tweaking the settings, though I haven't done this for checks yet. We will see how to tweak the settings for the reshape later on.

A consistency check is started as follows. I don't have sample mdstat output at this time, but have included the commands for completeness.

# echo check >> /sys/block/md3/md/sync_action
# cat /proc/mdstat

You'll see if any errors were corrected in the array parity in the dmesg output and/or kernel logs.
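
Besides the kernel log, the md driver also exposes the state of a check in sysfs. The following is a small sketch, assuming the standard sync_action and mismatch_cnt files under /sys/block/<array>/md/. The helper name check_result is made up for illustration, and md3 is simply the array used in the rest of this post.

```shell
#!/bin/sh
# Summarise an md consistency check from the array's sysfs directory.
# Prints "running (<action>)" while a check or resync is in progress,
# otherwise the mismatch count recorded by the last check.
# The directory is a parameter so the function can be tried on a copy.
check_result() {
    md_dir="${1:-/sys/block/md3/md}"
    action=$(cat "$md_dir/sync_action")
    if [ "$action" != "idle" ]; then
        echo "running ($action)"
        return 0
    fi
    echo "mismatch_cnt=$(cat "$md_dir/mismatch_cnt")"
}

# Usage:
#   check_result /sys/block/md3/md
```

A non-zero mismatch count after a check is worth investigating before starting a reshape.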

Once the check is complete you should be safe to grow the array. First you have to add a new device to it so there is a spare drive in the set.
# mdadm --add /dev/md3 /dev/sdc1

The event will appear in the dmesg output and the spare will show up in mdstat:

# dmesg
md: bind

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid5 sdc1[s] sdb1[0] sda1[2] hdd1[1]
      1953519872 blocks super 0.91 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

Now the spare is there - you can give the command to grow the array:
# mdadm --grow /dev/md3 --backup-file="$HOME/mdadm-grow-backup-file.mdadm" --raid-devices=4

The array now starts reshaping. You can monitor progress:

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid5 sdc1[3] sdb1[0] sda1[2] hdd1[1]
      1953519872 blocks super 0.91 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  3.3% (33021416/976759936) finish=1421.0min speed=10341K/sec

In dmesg you should see something like this:

# dmesg

RAID5 conf printout:
 --- rd:4 wd:4
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:hdd1
 disk 2, o:1, dev:sda1
 disk 3, o:1, dev:sdc1
md: reshape of RAID array md3
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
md: using 128k window, over a total of 976759936 blocks.

Before tweaking the speed settings, now is a good time to edit your /etc/mdadm.conf file with the new ARRAY changes so it's recognized and started on your next reboot.

Now we can tweak the speed settings to speed up the reshape time. I played around with a few settings and found the following to be good for my own system.


# echo 8192 >> /sys/block/md3/md/stripe_cache_size
# echo 15000 >> /sys/block/md3/md/sync_speed_min
# echo 200000 >> /sys/block/md3/md/sync_speed_max

On my system this cut about a third off the predicted finish time:

# cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid5 sdc1[3] sdb1[0] sda1[2] hdd1[1]
      1953519872 blocks super 0.91 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  4.0% (39983488/976759936) finish=957.6min speed=16303K/sec
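
The finish estimate is essentially the remaining blocks divided by the current speed. Here is a small sketch of that arithmetic, assuming a progress line shaped like the one above and an awk on the PATH. The function name reshape_eta_minutes is made up, and the kernel's own figure may differ slightly.

```shell
#!/bin/sh
# Estimate the remaining reshape time, in whole minutes, from a
# /proc/mdstat progress line of the form
#   ... (done/total) finish=...min speed=NNNNK/sec
# Blocks are counted in KiB and the speed is in KiB per second.
reshape_eta_minutes() {
    printf '%s\n' "$1" | awk '
        match($0, /\([0-9]+\/[0-9]+\)/) {
            # Strip the parentheses and split "done/total".
            split(substr($0, RSTART + 1, RLENGTH - 2), blocks, "/")
        }
        match($0, /speed=[0-9]+K/) {
            # Keep only the digits between "speed=" and "K".
            speed = substr($0, RSTART + 6, RLENGTH - 7) + 0
        }
        END {
            if (speed > 0) print int((blocks[2] - blocks[1]) / speed / 60)
        }'
}

# Usage:
#   reshape_eta_minutes "$(grep reshape /proc/mdstat)"
```

Fed the 4.0% line above, this works out to roughly 957 minutes, in line with the finish value mdstat reported.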

It seems values greater than 8192 for stripe_cache_size were more harmful than beneficial on my system. It's not clear to me whether this is bound by CPU or by bandwidth to the drives, though looking at older posts I suspect both can play a role.
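
Memory may be another factor. As I understand the md documentation, the stripe cache uses roughly page size x number of disks x stripe_cache_size bytes. A sketch of that arithmetic, assuming 4 KiB pages; stripe_cache_mib is a made-up helper name.

```shell
#!/bin/sh
# Rough memory footprint of the raid5 stripe cache, in MiB:
#   page_size * number_of_disks * stripe_cache_size
stripe_cache_mib() {
    entries="$1"
    disks="$2"
    page_size="${3:-4096}"   # assumption: 4 KiB pages, as on x86
    echo $(( page_size * disks * entries / 1024 / 1024 ))
}

# The four-disk array in this post with stripe_cache_size=8192:
stripe_cache_mib 8192 4    # prints 128
```

So on a four-disk array each doubling of stripe_cache_size beyond 8192 costs at least another 128 MiB of RAM, which is worth keeping in mind on a small machine.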

Also note that reducing the stripe_cache_size may not occur immediately when you echo a smaller value to the file. I had to echo smaller values several times before the value was adopted. This was on kernel 2.6.32.7.

You can monitor the stripe_cache_active file to see how filled the cache is:

# cat /sys/block/md3/md/stripe_cache_active
7136

When the reshape is complete you will still need to grow the file system (or volume group if you use LVM) contained in there. I'll document that tomorrow when my reshape is complete ;)
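
As a pointer in the meantime, here is a sketch of that last step. It only covers two cases, an ext2/3/4 filesystem directly on the array (resize2fs) and an LVM physical volume on the array (pvresize). It prints the command rather than running it, and it is not taken from a tested run on this array. The helper name grow_command is made up.

```shell
#!/bin/sh
# Print the command that would grow whatever sits directly on the array.
# Only two layouts are sketched: an ext2/3/4 filesystem and an LVM
# physical volume. Anything else is reported as unsupported.
grow_command() {
    kind="$1"
    dev="${2:-/dev/md3}"
    case "$kind" in
        ext*) echo "resize2fs $dev" ;;
        lvm)  echo "pvresize $dev" ;;
        *)    echo "unsupported: $kind" >&2; return 1 ;;
    esac
}

# Usage:
#   grow_command ext3            # filesystem directly on /dev/md3
#   grow_command lvm /dev/md3    # array is an LVM physical volume
```

With LVM, pvresize only makes the new space available to the volume group; the logical volumes and their filesystems still need to be extended afterwards.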

Wednesday, January 20, 2010

Virtual Inbox in Thunderbird

Right - I've finally figured out how to set up a virtual inbox in Thunderbird 3 that centralizes messages from multiple accounts and folders into a single location.

Select an existing folder on an existing IMAP account
From the menu select File > New > Saved Search
Decide where you want to keep the Virtual Inbox. It can't be top-level, so I choose Local Folders as the parent.
Name the folder, such as Virtual Inbox.
Use the Choose button to select the source folders of your messages
Select "Match all messages"
Done

Wednesday, January 6, 2010

Hack OTA installation of BlackBerry applications

Finally found a way to install BB apps when the data plan doesn't allow the BB browser to work properly.

I wrote a little bash script to do the dirty work for me. Works a treat for me :) But I can't guarantee it won't hose your phone :p

The script can be retrieved from github at http://github.com/jinnko/get-bb-ota-install/blob/master/get-bb-ota.sh

Monday, January 4, 2010

Hard disk upgrade quick reference

Howto copy entire system to a new disk


echo cp -a `/bin/ls -1Ab | egrep -v '^(new|dev|proc|vols|sys)$'` /new/
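
The one-liner above only echoes the cp command, so you can review it before running it for real. An alternative sketch that builds the same list with find instead of parsing ls output, assuming GNU findutils. The function name copy_sources is made up, and the excluded names are the ones from the command above.

```shell
#!/bin/sh
# List the top-level entries of a root directory that should be copied,
# skipping the target mount (new) and the dev, proc, vols and sys trees.
copy_sources() {
    root="${1:-/}"
    find "$root" -mindepth 1 -maxdepth 1 \
        ! -name new ! -name dev ! -name proc ! -name vols ! -name sys \
        -print | LC_ALL=C sort
}

# Usage: review the list first, then copy it for real, for example
# (GNU xargs and cp assumed):
#   copy_sources / | xargs -d '\n' cp -a -t /new/
```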