Simple Linux backup solution

My desktop computer, which runs Gentoo Linux, has a 120 GB drive (used for my / and /boot and swap partitions) and a 480 GB drive (used for /home).  I decided to add another drive for backups.  I ended up with a 720 GB drive, which is bigger than the other two combined with room to spare, and cost me only about $100.

(Backing up to another hard drive in the same box is easy and fast, and it protects against a single drive failing, but it does not protect against the computer being stolen or the house burning down.  So I still need to push really important stuff, like source code, offsite.)

I mount the new drive as /mnt/backup.  I use rsync to back up everything, minus a few excluded directories, to /mnt/backup/{TIMESTAMP}.  I tell rsync to use hard links when a file is the same as the file in the previous backup.  That way I have a full backup tree per day, but the extra disk space each backup uses is only the size of the files that actually changed that day.
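
To see why hard-linked snapshots cost so little space, here is a small sketch, separate from the backup script, that hard-links a file inside a temporary directory and checks that both names point at the same data.  The file names and the demo_hard_link() helper are made up for illustration.

```python
import os
import tempfile


def demo_hard_link():
    """Create a file, hard-link it, and report what the filesystem sees."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "yesterday.txt")  # hypothetical names
        linked = os.path.join(tmp, "today.txt")
        with open(original, "w") as f:
            f.write("unchanged file contents\n")
        # A hard link is a second directory entry for the same inode,
        # so no file data is copied.
        os.link(original, linked)
        a, b = os.stat(original), os.stat(linked)
        same_inode = (a.st_ino == b.st_ino)
        link_count = a.st_nlink
        # Removing one name leaves the data reachable through the other.
        os.remove(original)
        with open(linked) as f:
            survived = f.read()
    return same_inode, link_count, survived


if __name__ == "__main__":
    print(demo_hard_link())
```

This is also why deleting an old snapshot directory is safe: a file's data is freed only when the last name that links to it is removed.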

I control the backup with this Python script (sh or Perl would work too, but I like Python), which lives at /usr/local/bin/backup.py:

#!/usr/bin/python

"""Backup most of a filesystem to a backup disk, using rsync."""

import subprocess
import time
import os

SOURCE = "/"
DESTINATION = "/mnt/backup"
EXCLUDES = [
    "/proc",
    "/sys",
    "/lost+found",
    "/mnt",
    "/media",
    "/tmp",
    "/var/tmp",
    "/var/run",
    "/var/lock",
]
RSYNC = "/usr/bin/rsync"

def find_latest_destdir():
    """Return the name of the newest timestamped backup directory, or None."""
    latest = 0
    for fn in os.listdir(DESTINATION):
        if fn.isdigit() and len(fn) == 14:
            timestamp = int(fn)
            latest = max(timestamp, latest)
    if latest:
        return str(latest)
    return None

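def main():
    """Build the rsync command line and run it."""
    cmd = [RSYNC]
    # -a: archive mode (recurse and preserve permissions, times, owners, links).
    # -b: keep a backup copy of any destination file that would be overwritten.
    cmd.append("-ab")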
    for exclude in EXCLUDES:
        cmd.append("--exclude=%s" % exclude)
    latest = find_latest_destdir()
    if latest:
        # Hard-link files unchanged since the previous backup instead of copying them.
        cmd.append("--link-dest=%s" % os.path.join(DESTINATION, latest))
    cmd.append(SOURCE)
    timestamp = time.strftime("%Y%m%d%H%M%S")
    cmd.append(os.path.join(DESTINATION, timestamp))
    print(cmd)
    # Pass rsync's exit status back so a failed backup is visible to the caller.
    return subprocess.call(cmd)


if __name__ == "__main__":
    raise SystemExit(main())
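
To preview the command without touching any disks, the command construction can be pulled out into a pure function.  This is only a sketch and not part of the script above: pick_latest() and build_rsync_command() are hypothetical helpers that mirror find_latest_destdir() and main(), and the directory names are made-up examples.

```python
import os


def pick_latest(names):
    """Return the newest 14-digit timestamp name in names, or None."""
    stamps = [n for n in names if n.isdigit() and len(n) == 14]
    # Fixed-width YYYYMMDDHHMMSS strings sort chronologically as text.
    return max(stamps) if stamps else None


def build_rsync_command(source, destination, excludes, latest, timestamp,
                        rsync="/usr/bin/rsync"):
    """Assemble the same kind of argument list that main() passes to rsync."""
    cmd = [rsync, "-ab"]
    cmd.extend("--exclude=%s" % e for e in excludes)
    if latest:
        # Unchanged files become hard links into the previous snapshot.
        cmd.append("--link-dest=%s" % os.path.join(destination, latest))
    cmd.append(source)
    cmd.append(os.path.join(destination, timestamp))
    return cmd


if __name__ == "__main__":
    names = ["20100101030000", "20100102030000", "lost+found"]
    latest = pick_latest(names)
    print(build_rsync_command("/", "/mnt/backup", ["/proc", "/sys"],
                              latest, "20100103030000"))
```

On the very first run there is no earlier snapshot, so pick_latest() returns None and the --link-dest option is simply left out, giving a plain full copy.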

And here's the crontab line that runs it every night at 3 a.m., added to root's crontab with "sudo crontab -e":

0 3  * * *      /usr/local/bin/backup.py
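
For anyone unfamiliar with crontab syntax, the five schedule fields on that line can be labeled with a few lines of Python.  This is only an illustrative sketch; describe_cron_line() is a hypothetical helper and does not validate the schedule.

```python
CRON_FIELDS = ["minute", "hour", "day of month", "month", "day of week"]


def describe_cron_line(line):
    """Split a crontab line into its five schedule fields and the command."""
    # At most five splits on whitespace; whatever remains is the command.
    parts = line.split(None, 5)
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


if __name__ == "__main__":
    schedule, command = describe_cron_line(
        "0 3  * * *      /usr/local/bin/backup.py")
    print(schedule)
    print(command)
```

Minute 0 and hour 3, with * in the remaining three fields, means 03:00 every day of every month.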

Automated nightly backups are now so easy (thanks to tools like rsync, and the cheapness of hard drives) that there's really no excuse not to do them.