Monday, December 18, 2006

Linux: Remote backup bandwidth (dump/tar) to LTO3 tape over SSH depends critically upon block size

In getting ready to leave Fourmilab unattended while I'm out of town during the holiday season, I habitually make a site-wide backup tape which I take with me, “just in case”—hey, the first computer I ever used was at Case! This was the first time I've made such backups since installing the new backup configuration, so I expected there would be things to learn in the process, and indeed there were.

While routine backups at Fourmilab are done with Bacula, for these cataclysm-hedge offsite backups I use traditional utilities such as dump and tar because, in a post-apocalyptic scenario, it will be easier to do a “bare metal” restore from them. (The folks who consider their RAID arrays adequate backup have, in my opinion, little imagination and even less experience with Really Bad Days; not only do you want complete offsite backups on media with a 30-year lifetime, you want copies of them stored on at least three continents in case of the occasional bad asteroid day.)

Anyway, this was the first time I went to make such a backup to the LTO Ultrium 3 drive on the new in-house server. I used the same parameters I'd used before to back up over the network to the SDLT drive on the Sun server it replaced, and I was dismayed to discover that the transfer rate was a fraction of what I was used to, notwithstanding the fact that the LTO3 drive is much faster than the SDLT drive on the former server.

The only significant difference, apart from the tape drive, is that remote tape access to the old server went over rcmd, while access to the new server is via ssh. Before, I had used a block size in the dump of 64 Kb, as this was said to be universally portable. With this specification, I saw transfer rates on the order of 1300 kilobytes per second, at which rate it would take sixteen hours to back up my development machine. Given that my departure time was only a few hours more than that from the time I started the backup, and that I had another backup to append to the tape, this was bad.
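To see how quickly a low transfer rate turns into an overnight job, here is a rough sketch of the arithmetic. The function name and the 50 GB figure are made up for illustration, and it assumes 1 GB = 1024 × 1024 KB and a rate that holds steady for the whole run, which real tape backups rarely manage.

```shell
#!/bin/sh
# Hypothetical helper: estimate backup time in whole minutes.
# Assumptions: 1 GB = 1024 * 1024 KB; the transfer rate is constant.
backup_minutes() {
    size_gb="$1"      # amount of data to back up, in GB
    rate_kb_s="$2"    # sustained transfer rate, in KB/sec
    seconds=$(( size_gb * 1024 * 1024 / rate_kb_s ))
    echo $(( seconds / 60 ))
}

# Illustrative figures only: 50 GB at the 1300 KB/sec rate seen above.
backup_minutes 50 1300
```

Running the same function with a higher rate shows directly how much of the night a better block size buys back.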

According to a cursory Web search, it appears that the most common recommendation for an optimum block size for LTO3 tape is 256 Kb. Now, if you read the manual page for dump, there is a bit of fear that things may go horribly awry if you set the block size greater than 64 Kb, but I decided to give it a try and see what happened. What happened was glorious! The entire 40 Gb dump completed at an average rate of 4817 Kb/sec; this isn't as fast as Bacula, but considering that everything is going through an encrypted SSH pipeline it's entirely reasonable. The actual command I used to back up a typical file system is like this:

    /sbin/dump -0u -b 256 -f server.ratburger.org:/dev/nst0 \
       /dev/hda7

Should the need arise to restore from such a backup, it is essential that you specify the “-b 256” option on the command line so that the block size of the dump will be honoured. I verified that I could restore files backed up with these parameters before using them to save other file systems.
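For illustration, a matching restore invocation might look like the one this sketch prints. The helper function is hypothetical, and the choice of interactive mode (-i) is an assumption, not necessarily what I typed; the point is only that the -b value given to restore must equal the one given to dump. The function just prints the command rather than running it, so it is safe to try without a tape drive attached.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a restore command whose block
# size matches the dump. Interactive mode (-i) is an assumed choice.
restore_cmd() {
    tape="$1"        # tape device, e.g. host:/dev/nst0
    block_kb="$2"    # must equal the -b value given to dump
    echo "/sbin/restore -i -b ${block_kb} -f ${tape}"
}

restore_cmd server.ratburger.org:/dev/nst0 256
```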

Commands like this can be used to back up Unix file systems, but legacy file systems such as VFAT and NTFS partitions on dual-boot machines may have to be backed up using tar. To back up an NTFS file system on my development machine which I mount with Linux-NTFS, I use the following commands:

    cd /c
    tar -c --rsh-command /usr/bin/ssh \
        -f server.ratburger.org:/dev/nst0 -b 512 .
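The “-b 512” above is not a different block size from the one used with dump: GNU tar counts its blocking factor in 512-byte blocks, while dump's -b is in kilobytes, so both commands write 256 Kb records. A small sketch of the conversion (the function name is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: convert a record size in kilobytes to a GNU tar
# blocking factor, which is expressed in units of 512 bytes.
tar_blocking_factor() {
    record_kb="$1"
    echo $(( record_kb * 1024 / 512 ))
}

# The 256 Kb record size used for dump corresponds to tar -b 512.
tar_blocking_factor 256
```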

Posted at December 18, 2006 02:09