zfs-freeze: External atomic backups

• 2 minute read • backup, linux

I use zfs-send for backups, but not all destinations [1] support it. Backing up live systems is problematic: while non-atomic, ordered rollbacks are business as usual, backups captured piecemeal at different moments lead to corrupted databases and broken restores.

Twelve-Factor Apps are no exception: even if state isn’t derived from multiple sources, references can still break. For example, when blob storage is backed up a minute before the database, the restored application could hold a reference to an image that doesn’t exist in the restored blob storage.

In small (yet sizable) systems, surgical strategizing is unnecessary. With the power of $ zfs snapshot tank/apps/{db,s3}/cutie [2] and zfs-send, we are done. But what’s the shiny fridge all about?
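
The single command matters because zfs snapshot takes every snapshot named in one invocation atomically, so they all correspond to the same moment in time. A minimal sketch of building such an invocation follows; the function name atomic_snapshot, the ZFS_CMD dry-run override, and the snapshot name nightly are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch: snapshot several datasets in ONE zfs invocation, so that all
# snapshots are taken together rather than in rapid succession.
# ZFS_CMD is a hypothetical override (e.g. ZFS_CMD="echo zfs" for a dry run).
atomic_snapshot() {
  snap_name="$1"
  shift
  # Rewrite each positional parameter from "dataset" to "dataset@snap_name".
  for dataset in "$@"; do
    set -- "$@" "${dataset}@${snap_name}"
    shift
  done
  ${ZFS_CMD:-zfs} snapshot "$@"
}

# Dry run: prints the command line instead of calling zfs.
ZFS_CMD="echo zfs" atomic_snapshot nightly tank/apps/db/cutie tank/apps/s3/cutie
```

Dropping the ZFS_CMD override would run the real command, which typically needs root or delegated snapshot permission.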

If we stay within the ecosystem, we trust ZFS with all copies of our data. To manage that risk, I back up to a technologically isolated system. However, generic tooling doesn’t provide the same consistency guarantees; thus I created a script that exposes the latest snapshots of multiple datasets together.
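
As background for why a helper is needed: ZFS exposes each snapshot read-only under the hidden .zfs/snapshot directory at the dataset's own mountpoint, and child datasets are not included in the parent's snapshot directory. A tree of datasets therefore ends up with one snapshot path per dataset rather than a single directory that rsync or restic can walk. A small sketch that lists those paths; the function name snapshot_paths, the snapshot name, and the mountpoints are hypothetical.

```shell
#!/bin/sh
# Sketch: print where ZFS exposes a snapshot for each dataset mountpoint.
# Every dataset has its own .zfs/snapshot/<name> directory.
snapshot_paths() {
  snap_name="$1"
  shift
  for mountpoint in "$@"; do
    # Strip one trailing slash so "/tank/top/" and "/tank/top" print the same.
    printf '%s/.zfs/snapshot/%s\n' "${mountpoint%/}" "$snap_name"
  done
}

# Three datasets, three separate directories a generic tool would have to visit.
snapshot_paths nightly /tank/top /tank/top/db /tank/top/s3
```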

The bash prototype wasn’t compatible with setuid [3]: Linux ignores the setuid bit on interpreted scripts, so a compiled binary is required. To satisfy that requirement, I reimplemented it in Go, with extra features.

Illustrative usage of zfs-freeze:

# Dataset I want to back up: tank/top (and all sub-datasets)

# One-time setup
go install github.com/jtagcat/jtagcat/compile-scripts/zfs-freeze@latest
gobin="$(go env GOPATH)/bin" # $GOPATH is often unset in the shell, so ask Go for it
sudo chown root:root "$gobin/zfs-freeze"
sudo chmod 4755 "$gobin/zfs-freeze"
sudo mv "$gobin/zfs-freeze" /usr/local/bin

zfs create tank/_freeze_top

# Regular backups:
frozen_dataset="$(zfs-freeze tank/top restic-cup-"$(date --rfc-3339=date)")" # will create tank/_freeze_top/restic-cup-2023-10-03
function cleanup {
  zfs-freeze -d "$frozen_dataset"
}
trap cleanup EXIT

restic -r cup: -p crypt.key backup "/$frozen_dataset"
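
The EXIT trap in the snippet above is what removes the frozen dataset even when the backup fails partway. A minimal, self-contained sketch of that pattern, with echo standing in for zfs-freeze -d and a deliberately failing command standing in for restic (the stand-ins, the function name run_backup, and the dataset name are illustrative assumptions):

```shell
#!/bin/sh
# Sketch: cleanup registered with `trap ... EXIT` runs even if the backup fails.
run_backup() (
  # The parentheses make the function body a subshell, so the EXIT trap
  # fires when the function ends rather than when the whole script ends.
  frozen_dataset="tank/_freeze_top/example"   # hypothetical name
  cleanup() {
    echo "cleanup: $frozen_dataset"           # stand-in for: zfs-freeze -d
  }
  trap cleanup EXIT

  echo "backing up /$frozen_dataset"
  false                                       # stand-in for a failing backup
)

run_backup || echo "backup failed with status $?"
```

The failure status of the backup step is preserved, so a calling script or cron job can still detect that the run went wrong.
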

Templates for Hetzner: rsync and restic

frozen_dataset="$(zfs-freeze tank/top restic-cup-"$(date --rfc-3339=date)")"
function cleanup {
  zfs-freeze -d "$frozen_dataset"
}
trap cleanup EXIT
rsync -e "ssh -p23 -oUserKnownHostsFile=$HOME/.ssh/known_hosts -oIdentityFile=hetzner_$HETZNER_USER.key" \
      --archive --delete --compress \
      --exclude=.stversions \
      --relative -- /"$frozen_dataset"/./{dirs,to,backup}/ \

restic -r sftp:: -p restic.key -o sftp.command="ssh $HETZNER_USER.your-storagebox.de -p23 -oIdentityFile=hetzner_$HETZNER_USER.key -l $HETZNER_USER -s sftp" unlock --quiet
restic -r sftp:: -p restic.key -o sftp.command="ssh $HETZNER_USER.your-storagebox.de -p23 -oIdentityFile=hetzner_$HETZNER_USER.key -l $HETZNER_USER -s sftp" backup --quiet --exclude-file "restic.excludes" "/$frozen_dataset"
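
Both restic invocations above repeat the same long sftp.command option. One way to keep them in sync is to build the ssh command once and reuse it. This sketch only rearranges the options already shown; the helper name storagebox_sftp_command and the placeholder account name are assumptions, and the restic calls are left as comments so the block runs without a repository.

```shell
#!/bin/sh
# Sketch: build the ssh command for restic's sftp backend in one place.
storagebox_sftp_command() {
  user="$1"
  printf 'ssh %s.your-storagebox.de -p23 -oIdentityFile=hetzner_%s.key -l %s -s sftp\n' \
    "$user" "$user" "$user"
}

HETZNER_USER="${HETZNER_USER:-u000000}"   # placeholder account name
sftp_command="$(storagebox_sftp_command "$HETZNER_USER")"
echo "$sftp_command"

# Reuse the same string in both steps:
# restic -r sftp:: -p restic.key -o sftp.command="$sftp_command" unlock --quiet
# restic -r sftp:: -p restic.key -o sftp.command="$sftp_command" backup --quiet \
#   --exclude-file "restic.excludes" "/$frozen_dataset"
```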

Writing the helper, I yearned to use hcli: I could’ve avoided errors in parsing arguments, and skipped a refactor.

  1. Exhibit A: Hetzner’s (cheap!) pure ZFS storage boxes

  2. Helpers such as Sanoid and zrepl do not snapshot across datasets atomically, only in rapid succession ($ zfs snapshot tank/apps/db/cutie && zfs snapshot tank/apps/s3/cutie).

  3. Outside of Nix, setuid wrappers are a hassle.