Why write this HowTo
Backing up one Linux box to another (here, the slug) should be simple. However, there is a pitfall: a 2GB limit on the size of any file created on the slug. This limits the options available when making backups. I spent a lot of time trying simple backup plans, only to be repeatedly stymied by this limit. Below I present my best workaround, plus the other methods that I tried and abandoned. The aim is a simple backup routine.
It is worth looking through WhatPeopleAreReallyUsingTheirSlugsFor for backup methods other people use.
This HowTo is based on my experience of backing up a home Linux box to a slug, so view it in that light. Others probably have greater knowledge and experience of backing up systems and could probably improve on what I have suggested here.
Note: The 2GB limit appears to be gone (see 'The 2GB limit seems to be gone' below), so you may ignore that part of the discussion.
Backup with rsync, method that works well
Note: I do not think that this method completely gets around the 2GB limit. Rather, it avoids it by backing up individual files instead of creating one large tar archive. I have not checked whether it can back up individual files larger than 2GB; I suspect not. Such files are rare on my machine and I do not find it a problem if they are not backed up.
Description and pros and cons
This uses rsync in daemon mode running on the slug. It is probably not as secure as running rsync over SSH, but it does not rely on NFS. Note that this method mirrors the disk of the machine being backed up: if you delete a file from that machine, it will be deleted from the slug's mirror the next time you run the backup script. It is therefore not useful for recovering files you accidentally deleted a week last Tuesday.
Install rsync on the slug
ipkg install rsync
Set up an account to store backup data (on the slug)
I created an account called 'backup' through the Linksys interface. Use whatever name you like; I will refer to it as 'backup', or /backup when referring to its home directory, throughout this document. Substitute the name you chose if it is different. This is probably not essential, but it gives you somewhere tidy to put the data, and you will also be able to access it via the web interface (with some limitations due to permissions).
Make a sub-directory in /backup named after the host name of the Linux machine you wish to back up. In my case this is /backup/phoenix.
Set up rsync in daemon mode (on the slug)
I set rsync running in daemon mode on the slug. My configuration file, /opt/etc/rsyncd.conf is given here:
Note that the user and group are set to root, even though I have used chroot. I found this was necessary to preserve user and group IDs when backing up. The configuration also tells rsync not to compress files that do not compress well.
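The configuration listing itself is not reproduced on this page. As a rough sketch only, a daemon configuration matching the description above might look like the following. Only the root user and group and the use of chroot come from the text; the module name phoenix, the path, the allowed host address and the list of suffixes are assumptions for illustration.

```shell
# Write an example rsyncd.conf. CONF defaults to a local file so that
# nothing under /opt/etc is touched until you copy it there yourself.
CONF="${CONF:-./rsyncd.conf.example}"

cat > "$CONF" <<'EOF'
# Example rsync daemon configuration (illustrative values only)
uid = root
gid = root
use chroot = yes

# Assumed layout: one module per machine being backed up
[phoenix]
path = /backup/phoenix
comment = Mirror of phoenix
read only = no
# Assumed LAN address of the machine being backed up
hosts allow = 192.168.1.10
# Skip compression for formats that are already compressed
dont compress = *.gz *.tgz *.zip *.bz2 *.jpg *.mp3 *.mpg *.avi
EOF
```

Once the file is in place, the daemon can be started by hand for testing with rsync --daemon --config=/opt/etc/rsyncd.conf.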
Set rsyncd to run at startup
Enable the Rsync daemon
Change RSYNC_ENABLE=false to RSYNC_ENABLE=true
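If you prefer not to edit the file by hand, the change can be scripted. This is a minimal sketch; the function name enable_rsync_daemon is hypothetical, and the location of the startup file is deliberately left as an argument rather than assumed.

```shell
# Flip RSYNC_ENABLE from false to true in the given startup file.
# The file's location is not assumed; pass it as the first argument.
enable_rsync_daemon() {
    file="$1"
    # Write the edited copy first, then overwrite the original in place
    # so that its ownership and permissions are kept.
    sed 's/^RSYNC_ENABLE=false/RSYNC_ENABLE=true/' "$file" > "$file.new" &&
        cat "$file.new" > "$file" &&
        rm -f "$file.new"
}
```

All other lines in the file are left unchanged.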
The script (on the Linux box to back up)
The script runs on the machine to be backed up. It copies data from this machine to the slug using rsync. This script is /root/slug-backup/rsync-phoenix on my machine.
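The script listing is not reproduced on this page. The following is a minimal sketch of what such a script could look like, not the original. The slug's host name (slug), the module name (phoenix) and the helper names run and backup_to_slug are assumptions; the exclude file path is the one given in the next section.

```shell
# Hypothetical names: SLUG is the slug's host name and MODULE the rsync
# daemon module. EXCLUDES is the exclude file that goes with the script.
SLUG="${SLUG:-slug}"
MODULE="${MODULE:-phoenix}"
EXCLUDES="${EXCLUDES:-/root/slug-backup/rsync-phoenix-exclude}"

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mirror / to the slug. --delete removes files from the mirror that no
# longer exist locally; --numeric-ids transfers uid/gid numbers as-is.
backup_to_slug() {
    run rsync -av --delete --numeric-ids \
        --exclude-from="$EXCLUDES" \
        / "${SLUG}::${MODULE}/"
}
```

Calling backup_to_slug at the end of the script performs the backup. Setting DRY_RUN=1 first prints the rsync command instead of running it, which is a cheap way to check the paths before the first real run.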
The exclude file
This is an essential file that goes with the backup script. It is used to exclude transitory files from the backup and also items that I do not feel need to be backed up. Note the url in the comments in the file. Modify it for your own needs. On my machine the file is called /root/slug-backup/rsync-phoenix-exclude.
# Files and directories to exclude, things
# that don't matter if they are lost, or
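For orientation, here is a hypothetical exclude file of the same general kind. It is not the original file, and every pattern in it is an assumption about what a typical system might want to skip.

```shell
# Write an example exclude file to a local path (hypothetical name).
EXCLUDES="${EXCLUDES:-./rsync-phoenix-exclude.example}"

cat > "$EXCLUDES" <<'EOF'
# Example exclude patterns (illustrative, adjust to your system).
# A leading / anchors the pattern at the top of the transfer, here /.
# "/proc/*" skips the contents but keeps the empty directory itself.
/proc/*
/sys/*
/tmp/*
/mnt/*
/media/*
# Per-user caches under /home (hypothetical choice)
/home/*/.cache/
EOF
```

Excluding /mnt/* matters if the slug is ever mounted on the Linux box, as otherwise the backup would try to back up the backup.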
The cron job
Rather than having to remember to back up the machine, have cron do it automatically. Add this line to root's crontab on the Linux box (crontab -e). This example runs at 04:30.
Hint: The output from rsync will be sent to root by email. This is useful feedback that the system is working. However, run the script the first time from a shell or console to avoid being sent an extremely large email listing almost all the files on your system. Redirect the output to /dev/null in the crontab entry to avoid the email.
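As an illustration, a crontab line for 04:30 every day, using the script path given earlier and with the output discarded, could be set up as sketched below. The helper name install_cron_line is hypothetical.

```shell
# Crontab line: minute 30, hour 4, every day. Standard output is thrown
# away, so the file listing is not mailed; anything written to standard
# error is still mailed by cron.
CRON_LINE='30 4 * * * /root/slug-backup/rsync-phoenix > /dev/null'

# Append the line to the current user's crontab, keeping existing entries.
install_cron_line() {
    { crontab -l 2>/dev/null; echo "$CRON_LINE"; } | crontab -
}
```

Run install_cron_line once as root. Running it a second time would add a duplicate entry, so check with crontab -l afterwards.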
Restore from backup
I have not written anything to carry out this task. The idea is that, whatever strategy is used, once the Linux machine can see the slug one way or another (most likely via NFS), data can be restored from the command line with either cp or rsync. If your Linux box supports USB disks and ext3, you could also mount the slug's disk directly for the restore.
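A restore along these lines can be sketched with cp. The mount point /mnt/slug/phoenix and the function name restore_path are assumptions; the sketch presumes the mirror is laid out exactly like the original file system, as it is when the whole of / is mirrored.

```shell
# MIRROR is a hypothetical mount point where the slug's backup directory
# is visible on the Linux box (for example via NFS).
MIRROR="${MIRROR:-/mnt/slug/phoenix}"

# Restore one file or directory, given its original absolute path.
# cp -a preserves ownership, permissions and timestamps when run as root.
restore_path() {
    target="$1"
    mkdir -p "$(dirname "$target")"
    cp -a "$MIRROR$target" "$(dirname "$target")/"
}
```

For example, restore_path /home/fred/report.txt would copy the mirrored file back to its original location (the file name here is made up).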
Notes on permissions and the web interface
Unfortunately the web interface cannot browse all of the files, as it expects them all to belong to user 'backup' on the slug. Instead, the numerical user and group IDs match those on the Linux box being backed up, which limits access from the web interface. Do not mess with the permissions on the backed-up data unless you are prepared to change them back if you need to carry out a restore from the slug. Changing them is probably a bad plan and best avoided; just accept that you cannot see all files from the web interface.
Other backup methods that work less well
Caveat: I have trimmed, shortened and tidied the scripts quoted below for brevity, so they are not presented exactly as tested on my system. They are for reference only. Since they did not work fully anyway, this is not a great issue.
All of the backup methods here fell foul of the 2GB limit on file size on the slug. They may be of use if the data being backed up produces a tar archive of less than 2GB, or if the problem has been solved. The limit occurred with directories mounted from the slug via NFS and SMB (whether as shipped or Unslung). It also limited backups using tar piped through SSH.
tar onto SMB share
I mounted an NSLU2 share (stock firmware, not Unslung) onto the Linux box via SMB and used tar to back up the Linux box to the NSLU2. The result was that the backup stopped once the tar file reached 2GB.
To back up to an SMB mount rather than NFS, use the NFS example (next subsection) but replace the part at the beginning that mounts the network drive with the following:
tar onto NFS mount
I Unslung the NSLU2, installed NFS and repeated the above with an NFS mount.
Result: the backup stopped once the tar file reached 2GB. Note that this is a shortened, simplified version of the script I tried; I tarred up /home and the rest of the system separately. The below gives the basic idea. The slug needs NFS installed and running, and an entry in /etc/exports to allow the Linux box to mount the backup directory.
Note: I also tried copying a 4GB file to the NSLU2 using the above NFS-mounted directory as the destination. Result: it stopped once the destination file reached 2GB.
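The general shape of a tar-onto-NFS backup can be sketched as follows. This is not the original script. The export slug:/backup, the mount point /mnt/slug, the phoenix sub-directory and the function names are assumptions, and the sketch is subject to the same 2GB limit described above.

```shell
# Hypothetical NFS export on the slug and local mount point.
NFS_EXPORT="${NFS_EXPORT:-slug:/backup}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/slug}"

# Archive one directory tree into DEST_DIR as a dated, gzip-compressed
# tar file. Paths inside the archive are relative to the tree's parent.
archive_tree() {
    src="$1"; dest_dir="$2"
    name=$(basename "$src")
    tar -czf "$dest_dir/$name-$(date +%Y%m%d).tgz" \
        -C "$(dirname "$src")" "$name"
}

# Mount the slug, archive /home, then unmount whatever the outcome.
backup_via_nfs() {
    mount -t nfs "$NFS_EXPORT" "$MOUNT_POINT" || return 1
    archive_tree /home "$MOUNT_POINT/phoenix"
    rc=$?
    umount "$MOUNT_POINT"
    return $rc
}
```

Run backup_via_nfs as root. The rest of the system could be archived with further archive_tree calls, one per top-level directory.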
tar with the archive split into 1GB chunks
Avoid this like the plague. It gets around the 2GB limit by producing the archive in chunks of approximately 1GB. It will not work with compression and probably needs a shortish script to perform a restore. It is not something that I could easily restore solely from the command line.
Note that the example below uses a directory mounted from the slug using SMB. It is likely to make no difference if NFS is used in its place.
A script, move-archive, is required to rename or move each 1GB (ish) chunk; otherwise it will be overwritten by the next one. The script below was written in Tcl, as I am better at Tcl than bash! For tidiness I should have added 'close $io' to the end of the script.
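For comparison, here is a different way of producing chunks, which is not the method used above: pipe the tar stream through the standard split utility instead of using tar's own multi-volume support. This is only a sketch with hypothetical function names. Because tar writes a single stream, compression works, and a restore is a matter of concatenating the pieces.

```shell
# Create a compressed archive of SRC, split into pieces of CHUNK bytes
# (split accepts suffixes such as 1024m). Pieces are named PREFIX.aa,
# PREFIX.ab and so on.
split_backup() {
    src="$1"; prefix="$2"; chunk="${3:-1024m}"
    tar -czf - -C "$(dirname "$src")" "$(basename "$src")" |
        split -b "$chunk" - "$prefix."
}

# Reassemble the pieces in name order and unpack them into DEST.
split_restore() {
    prefix="$1"; dest="$2"
    cat "$prefix".* | tar -xzf - -C "$dest"
}
```

For example, split_backup /home /mnt/slug/phoenix/home.tgz would write home.tgz.aa, home.tgz.ab and so on, each below the 2GB limit (the destination path is an assumption).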
tar piped through SSH
Back up from the Linux box to the slug using a script containing the following:
I have not mentioned subtleties such as generating keys. Suffice it to say that this method was very slow and broke when the resulting archive exceeded 2GB. However, it is probably the most secure. A Google search will find web pages giving more detailed examples.
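The usual pattern for tar piped through SSH can be sketched as below. This is not the original script. The login backup@slug and the function name tar_over_ssh are assumptions, and key-based login is presumed to be set up already.

```shell
# REMOTE is the command used to reach the slug; user and host name are
# hypothetical.
REMOTE="${REMOTE:-ssh backup@slug}"

# Stream a gzip-compressed tar of SRC to the slug and store it as DEST.
# Nothing is written locally; the archive only ever exists on the slug.
tar_over_ssh() {
    src="$1"; dest="$2"
    tar -czf - -C "$(dirname "$src")" "$(basename "$src")" |
        $REMOTE "cat > '$dest'"
}
```

For example, tar_over_ssh /home /backup/phoenix/home.tgz. The compression and the encryption both cost CPU time, which fits with the slow speeds described above.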
It may be possible to write a script that logs into the machine to be backed up from the slug and then backs up from the slug using tar. The slug, not the Linux box would do all the work. This may have some merit.
I note that in general things are a lot slower than if I plugged the disk straight into the Linux box. If you are interested in speed issues read Performance for more information. Note that I have not performed any clock modifications on my slug.
If you back up your Linux box to your slug and back up your slug to your Linux box, take care that you don't back up these backups to their original machines, which then get backed up again, and so on. Backups will take forever and you will quickly eat up even a 300GB disk.
The 2GB limit seems to be gone
I know that more recent versions of smbfs no longer have a 2GB limit. I tested with my Gentoo Linux box, mounting the NSLU2 share with the 'lfs' option, which gets passed to smbmount. I was able to successfully transfer a 40GB file to the slug (cat /dev/hda > /nslu2/share/big_file.dat). Could've done more, but that took all night as it was. Anyway, the version of Samba that came with my slug seems to no longer have this limit. I'm using Samba 3.0.14 and Linux kernel version 2.6.12 on my Gentoo box. Earlier versions probably work as well. Also, the CIFS driver in the newer Linux kernel does not have that limit either, but for me it seemed to have stability issues, though it is much faster than the older smbfs module.
Notes on rsync
One nifty technique that can help recover 'files you accidentally deleted a week last Tuesday' is described in Mike Rubel's web page 'Easy Automated Snapshot-Style Backups with Linux and Rsync' (http://www.mikerubel.org/computers/rsync_snapshots/). A tool that implements this technique is described in InstallRSnapshot.
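The core of the snapshot idea is a rotation of hard-linked copies, carried out before each rsync run. The following is a simplified sketch of that idea rather than a copy of the scripts on that page. The directory names snap.0, snap.1 and so on, and the function name rotate_snapshots, are assumptions. It needs GNU cp for the -l option and a file system that supports hard links, such as ext3.

```shell
# Rotate hard-linked snapshots under SNAPDIR. snap.0 is the newest and
# snapshots snap.0 to snap.KEEP are retained.
rotate_snapshots() {
    snapdir="$1"; keep="${2:-3}"
    rm -rf "$snapdir/snap.$keep"
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -d "$snapdir/snap.$prev" ]; then
            mv "$snapdir/snap.$prev" "$snapdir/snap.$i"
        fi
        i=$prev
    done
    # Hard-link copy: uses almost no extra space, as no file data is
    # copied, only directory entries.
    if [ -d "$snapdir/snap.0" ]; then
        cp -al "$snapdir/snap.0" "$snapdir/snap.1"
    fi
}
```

After the rotation, rsync mirrors into snap.0 as usual. By default rsync writes a changed file to a new temporary file and renames it into place, so the old version survives in snap.1; this would not hold if rsync were run with --inplace.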
I was curious about the effect of the -z flag on the performance of the rsync backup, as it seemed that the slug might be compute-bound. Likewise, the rsync delta-transfer algorithm itself trades compute power for bandwidth, so I wondered if it might be useful to turn it off with the -W (whole-file) flag. Here are the results of some performance tests backing up a week's worth of miscellaneous changes to my 2.5GB /home:
So, the -z flag reduces the amount of data sent but takes the longest time to complete. Best performance is obtained with -W, though at the expense of the largest amount of network traffic.
rsync over SSH
If you run rsync over SSH, you are probably CPU-bound, so using a fast cipher is worth it. Use “arcfour” as encryption cipher for SSH - this is the weakest encryption that OpenSSH offers, but it’s still overkill for my local network (encryption “none” apparently has to be patched into the Debian-provided OpenSSH). And a quote:
“You can get an idea of how fast the various ciphers are on your system by running ‘openssl speed aes rc4 blowfish’. RC4 (aka ‘arcfour’) is usually fastest.”
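As a sketch of how the cipher choice is passed to rsync, the -e option can carry the ssh command line. The destination backup@slug:/backup/phoenix/ and the function names are assumptions. Note that newer OpenSSH releases no longer include arcfour, so the sketch checks what the local client offers before using it.

```shell
CIPHER="${CIPHER:-arcfour}"                   # cipher named in the text
DEST="${DEST:-backup@slug:/backup/phoenix/}"  # hypothetical destination

# Succeeds if the local ssh client offers the given cipher.
cipher_available() {
    ssh -Q cipher 2>/dev/null | grep -qx "$1"
}

# Mirror / to the slug over SSH using the chosen cipher.
backup_over_ssh() {
    if ! cipher_available "$CIPHER"; then
        echo "cipher $CIPHER not offered by this ssh client" >&2
        return 1
    fi
    rsync -a --delete --numeric-ids -e "ssh -c $CIPHER" / "$DEST"
}
```

Running ssh -Q cipher on its own lists the ciphers the client supports, which can then be compared using the openssl speed test quoted above.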
Other wiki pages discussing backup