Olga has finished her (second) MSc thesis and sent it to her committee. Of course that’s cause for celebration, and I’m very proud of her. But I wanted to talk about what happened the day she submitted the final copy, which can be summed up thusly: I deleted everything she had written.

How I managed this appalling cockup is simply told: I meant to issue a cp command to copy everything from my laptop (where I was doing final typographic work) to the USB stick she kept her master copy on, but instead I typed rm (remove), deleting both my local copy and the USB master in one fell swoop.

Oops.

Now this should have been embarrassing but painless: restore the latest backup and carry on. And yes, we were taking backups, but no, they didn’t save the day. Figuring out why has convinced me that having a half-arsed backup strategy is actually more dangerous than having none at all, because it gives you a false sense of security which can make it much easier to make stupid mistakes.

Here’s what we had for a backup system:

  • the entire thesis in a mercurial repository on my laptop, so that any ghastly errors in the typographic adjustments I was making could easily be rolled back;
  • the USB copy of everything, which also contained the mercurial repository; and
  • a copy (likewise of everything) on Olga’s laptop.

Each of these failed us for different reasons.

  • The mercurial repository was just a local repository, in a .hg/ directory beside the files. My rm removed it.
  • The USB copy, likewise, was destroyed (as was everything else on the stick: cp -r my_dir /media/my_stick/ is very convenient, but the rm-variant of the same is utterly merciless).
  • The copy on Olga’s laptop was several days out of date, during which days we had done the (rather gruelling) last proofreading and typographic adjustments (annoying things like producing ten extra words to move a paragraph break down a line to allow a figure to appear on the page facing where it was discussed).
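With hindsight, even the manual copy step was asking for trouble: a hand-typed `cp` is one typo away from a hand-typed `rm`. A minimal sketch of what a scripted, copy-only sync might look like (the function name and paths are mine, not part of our actual workflow):

```shell
# safe_sync: a hypothetical one-way sync step. It only ever creates and
# copies; there is no rm anywhere, so a slip of the fingers at the shell
# cannot wipe the destination the way my mistyped command did.
safe_sync() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"          # make sure the target directory exists
    cp -r "$src/." "$dest/"   # copy the contents of src into dest
}

# e.g. safe_sync "$HOME/thesis" /media/my_stick/thesis
```

The point is not the two lines of shell, but that once the command is a script you stop retyping it, and the dangerous verbs never pass through your fingers.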

By blind luck, we didn’t have to redo that work: I had been editing the files in emacs, and the buffers were still open. By yet more blind luck, that evening’s session (although short) had involved every file that we had changed since that earlier version on Olga’s laptop (as far as we can tell). So all the files that we had no up-to-date backup of could be resaved from emacs, leading directly (I suspect) to the continuing success of our relationship.

What have I learned from this? Firstly, that the established wisdom about keeping backups in triplicate, at least one off-site, is not just about protecting yourself from hardware failures, fires, theft, and similar uncommon catastrophes. It’s also about maintaining a safety barrier between two copies, so that a (much more common) typo or momentary foolishness that destroys one cannot, at the same time, affect the other. If you can easily affect two copies of your work with one commandline instruction, they count as one copy for backup purposes: neither is backing up the other. With one command I deleted my local copy of the thesis, the local mercurial repository, and the copy-and-mercurial-repository on the USB stick: four copies of the text, but (according to this rule) no backups at all.

Secondly, your backup should not be involved in your daily workflow, except purely as a backup. The way you access it should be standardised and scripted and as impossible to mess up as you can manage. Copying files from place to place is error-prone. Issuing hg commit then hg push is better. Having a background process automate the whole backup system is better still.

Finally (and again this is established wisdom), a backup you don’t take is no backup at all. This is another argument for automating the whole process, although I’m halfway willing to use mercurial and careful discipline to get the same effect.
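For completeness, one shape that automated background process might take is a dated-snapshot copier, run on a schedule; every name and path here is illustrative, not a description of what we actually ran:

```shell
# snapshot_backup: hypothetical automated snapshot step. Each run writes
# into a fresh date-stamped directory, so today's mistake can never
# overwrite yesterday's backup.
snapshot_backup() {
    src="$1"
    backups="$2"
    stamp=$(date '+%Y%m%d-%H%M%S')
    mkdir -p "$backups/$stamp"
    cp -r "$src/." "$backups/$stamp/"
    printf '%s\n' "$backups/$stamp"   # report where this snapshot went
}
```

Put a call like `snapshot_backup "$HOME/thesis" "$HOME/backups"` in a small script and run it hourly from cron, and the backup you would otherwise forget to take gets taken without you.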