Zenhack.net

Too Many Links

25 Oct 2015

Today, my backup system finally choked. I haven’t lost any data, but without some tweaking, I won’t be able to make any more backups. In this post, I’m going to explain how the system works and what went wrong.

The System

I use a program I wrote myself to do backups. The main features are:

  1. Dedup files by their content.
  2. Fast incremental backups.
  3. Bandwidth friendly, if the backup filesystem is remote.

I store backups in an ext4 filesystem on a RAID 1 array with 3TiB of logical storage. It’s in an external enclosure attached to my desktop via USB 3. I keep my other machines synchronized with my desktop, and do periodic backups of my home directory on my desktop.

Feature (1) is really important to me; it lets me take what look like full backups at frequent time intervals, without having to worry about running out of space quickly. The backups are currently using about 520 GiB of space, and have only grown by about 40 GiB since I started using the system a few years ago. They’re not much bigger than the current version of my home directory.

Feature (2) is really nice as well; I can do a backup of nearly 500 GiB of data once a week, and each backup takes a minute or two at the absolute maximum.

Feature (3) isn’t really important for my specific setup, since I’m only dealing with local storage, but it was easy enough to build, and could be very useful in other contexts.

The Algorithm

Given paths to src (the directory being backed up), dest (the directory to create the new backup in), blobs (a directory holding one hard-linked copy of each unique file, named by its hash), and optionally old (the previous backup):

Descend recursively through src. For each file f:

  1. If old was provided, check whether old/f exists with the same modification time as src/f. If so, make dest/f a hard link to old/f and move on to the next file.
  2. Otherwise, compute a cryptographic hash of the contents of f, which we’ll call hash(f).
  3. Look for a file blobs/hash(f). If found, make dest/f a hard link to that file. Otherwise, copy f to dest/f and make blobs/hash(f) a hard link to dest/f.

Step (1) provides a dramatic speedup when doing an incremental backup, but the result should otherwise be the same. Because we compute the hash locally and only then look for an existing blob, the scheme is very bandwidth friendly; we only ever copy files that aren’t already stored somewhere.

We also do some nice things like copy over permissions/ownership, modification times, and so on. For directories, we just make directories with the same names. For symlinks, we make a link with the same name and target. Other kinds of files (fifos, devices, sockets…) are just omitted from the backup.
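
To make that concrete, here’s a simplified sketch of the per-file step in Haskell. It leaves out error handling and the metadata copying described above, and the choice of SHA-256 via the cryptonite package (plus names like backupFile) is just for illustration, not necessarily what the real program uses:

import Control.Monad (unless)
import qualified Data.ByteString as B
import Crypto.Hash (hashWith, SHA256(..))
import System.Directory (copyFile, doesFileExist)
import System.FilePath ((</>))
import System.Posix.Files (createLink, fileExist, getFileStatus, modificationTime)

-- One iteration of the loop: back up the single file f (a path relative
-- to src) into dest, deduplicating against blobs.
backupFile :: Maybe FilePath -> FilePath -> FilePath -> FilePath -> FilePath -> IO ()
backupFile old src dest blobs f = do
    linkedToOld <- case old of
        Nothing -> return False
        Just o  -> do
            haveOld <- fileExist (o </> f)
            if not haveOld
                then return False
                else do
                    oldStat <- getFileStatus (o </> f)
                    newStat <- getFileStatus (src </> f)
                    if modificationTime oldStat == modificationTime newStat
                        then do
                            -- Step 1: unchanged since the last backup, so
                            -- just hard link to the old copy.
                            createLink (o </> f) (dest </> f)
                            return True
                        else return False
    unless linkedToOld $ do
        -- Step 2: hash the contents locally.
        digest <- show . hashWith SHA256 <$> B.readFile (src </> f)
        let blob = blobs </> digest
        haveBlob <- doesFileExist blob
        if haveBlob
            -- Step 3: contents already stored; just link to the blob...
            then createLink blob (dest </> f)
            else do
                -- ...otherwise copy the file in and register it as a blob.
                copyFile (src </> f) (dest </> f)
                createLink (dest </> f) blob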

The Failure

When running a backup today, I got the following error message:

dedup-backup:  /backup/2015-09-15/path:  createLink: resource exhausted (Too many links)

(As you can see, I’ve been lazy and haven’t done a backup in over a month. Oops.)

I investigated:

$ ls -l /backup/2015-09-15/path
-rw-r--r-- 65000 isd users 0  9. Aug 2006  /backup/2015-09-15/path

Oh look, a file of size zero with 65,000 hard links to it.

What’s going on?

Hard links are tracked using reference counting; the contents of a file are freed only when all of the links to it have been deleted. Ext4 uses a 16-bit integer to store the link count, which would only permit values up to 2^16 - 1, or 65,535, and the Linux kernel actually enforces a compiled-in limit of 65,000 links per ext4 inode. Because my backup program uses hard links to coalesce all files with the same contents, sooner or later it’s bound to hit this limit. Empty files are common enough that it would be surprising if any other file hit it first.
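
To see how close the rest of the blob store is to the ceiling, a quick scan over the blobs directory does the trick (a throwaway helper, not part of the backup tool itself):

import Control.Monad (forM_, when)
import System.Directory (getDirectoryContents)
import System.FilePath ((</>))
import System.Posix.Files (getFileStatus, linkCount)

-- Walk the blobs directory and report any blob whose link count is
-- getting close to ext4's 65,000-link ceiling.
reportCrowdedBlobs :: FilePath -> IO ()
reportCrowdedBlobs blobs = do
    names <- getDirectoryContents blobs
    forM_ (filter (`notElem` [".", ".."]) names) $ \name -> do
        st <- getFileStatus (blobs </> name)
        let links = fromIntegral (linkCount st) :: Int
        when (links > 60000) $
            putStrLn ((blobs </> name) ++ " has " ++ show links ++ " links")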

What now?

At this point, I can think of a couple ways to proceed:

  1. Switch from ext4 to a filesystem with a larger maximum link count. Assuming the growth in link count within my backups is roughly linear, a 32-bit value should be enough to last me the rest of my life.
  2. Modify the program to do something a bit smarter about files with large numbers of links (see the sketch below).
  3. Switch to a different backup system entirely.
  4. Switch to a filesystem that does de-duplication itself (e.g. btrfs).

All of these have pros and cons. Everything but (2) involves moving my existing backups. (1) and (4) require reformatting the drives, which means I’d need to find somewhere to store the data in the interim. (3) shouldn’t have that problem, since there’s enough free space on the existing filesystem to store several full copies of the current data. (2) requires the most work on my part, and could be tricky to pull off.
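
For (2), the simplest fix I can think of is to stop linking against a blob once its link count gets close to the limit, and quietly fall back to copying the contents instead, trading a little disk space for not crashing. Something along these lines (a rough, untested sketch; linkOrCopy is a hypothetical helper):

import Control.Exception (IOException, try)
import System.Directory (copyFile)
import System.Posix.Files (createLink, getFileStatus, linkCount)

-- Stay a bit below ext4's compiled-in maximum of 65,000 links.
linkCeiling :: Int
linkCeiling = 65000 - 16

-- Hard link dest to blob when that's safe; otherwise fall back to a plain
-- copy, which costs some space but can never trip the link limit.
linkOrCopy :: FilePath -> FilePath -> IO ()
linkOrCopy blob dest = do
    st <- getFileStatus blob
    if fromIntegral (linkCount st) >= linkCeiling
        then copyFile blob dest
        else do
            result <- try (createLink blob dest) :: IO (Either IOException ())
            case result of
                Right () -> return ()
                -- Belt and suspenders: if createLink still fails (e.g. EMLINK),
                -- copy instead.
                Left _   -> copyFile blob dest

A fuller version would probably also keep more than one blob per hash once the first fills up, so later backups can keep deduplicating, but the copy fallback alone would be enough to keep backups running.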

If switching to a new backup system, I’d like to maintain all three of the core features mentioned above. Right now I’m dumping the existing backups into camlistore, with its storage on the same drive. It definitely has feature (1), and I’m less worried about corrupting the data than I am with my current system. It will also be more generally flexible; it will be pretty simple to back up things remotely, and with the third-leg feature I might even be able to start thinking about offsite backups. The initial import is taking a long time. It remains to be seen whether incremental backups will be fast enough. If this works out I’ll just stick with it; camlistore is something I’ve been itching to play with anyway, and this will give me an excuse.

I’ll report back at some point when I’ve figured out a long term solution.