Docker named volume backup strategy

A cron job backs up the named volumes of all running containers to S3.

It loops through all the running containers and, for each one, does the following:

  1. Stops the container
  2. Loops over all the container's named volumes
  3. Backs up each volume to S3
  4. Starts the container

If you are running Docker on servers you control (as opposed to something like Amazon's container service or Google Container Engine), this strategy may make sense.

If we were running in a managed environment, we would need to put all this logic inside a container, which we may yet do.
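
A rough sketch of what that could look like, assuming an image (here called volume-backup, a made-up name) that bundles these scripts along with the docker and aws CLIs; it needs the Docker socket and the host's volume directory mounted in:

# Hypothetical containerized run: the image name "volume-backup" is a placeholder.
# Mount the Docker socket so the scripts can stop/start containers, the host
# volume directory so tar can read it, and AWS credentials for the S3 upload.
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/lib/docker/volumes:/var/lib/docker/volumes \
  -v $HOME/.aws:/root/.aws:ro \
  volume-backup backup_all_running_containers.sh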

We use an S3 bucket with versioning turned on, so it keeps old versions of every backup for a set number of days. You can restore the latest version or a specific older version.
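
As a sketch, assuming the bucket and key prefix configured later in volume_br_inc.sh and a 30-day retention window, versioning and version expiry can be set up with the AWS CLI:

# Turn on versioning for the backup bucket.
aws s3api put-bucket-versioning \
  --bucket my-backups-bucket \
  --versioning-configuration Status=Enabled

# Expire noncurrent backup versions after 30 days (pick your own window).
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backups-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-old-backup-versions",
      "Filter": {"Prefix": "docker-volume-backups/"},
      "Status": "Enabled",
      "NoncurrentVersionExpiration": {"NoncurrentDays": 30}
    }]
  }'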

cron

Run the backup every night (08:00 UTC) and send the output to syslog.

0 8 * * * PATH=/usr/local/bin:$PATH backup_all_running_containers.sh 2>&1 | /usr/bin/logger -t container_backup
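
The assumption here is that this entry lives in root's crontab (the scripts read /var/lib/docker/volumes directly) and that the scripts sit in /usr/local/bin, which the PATH prefix above points at:

# Edit root's crontab and paste in the entry above.
sudo crontab -e

# Confirm the entry is installed.
sudo crontab -l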

backup_all_running_containers.sh

#!/usr/bin/env bash
set -e
# Backs up all named docker volumes for all running docker containers.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
RUNNING=$(docker ps -q)
for ID in $RUNNING; do
  echo "ID:   $ID"
  NAME=$(basename $(docker inspect --format='{{.Name}}' $ID))
  echo "NAME: $NAME"
  $DIR/container_backup.sh $NAME
done

container_backup.sh

#!/usr/bin/env bash
set -e
# Backs up all named volumes for docker container name or id.
# See container_br_inc.sh for full documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/container_br_inc.sh
doit backup

container_restore.sh

#!/usr/bin/env bash
set -e
# Restores all named volumes for docker container name or id.
# See container_br_inc.sh for full documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/container_br_inc.sh
doit restore

container_br_inc.sh

#!/usr/bin/env bash
set -e

# Meant to be included by container_backup.sh and container_restore.sh

# Stops docker container.

# Backs up or restores all named volumes for the given docker container
# name or id. We don't backup or restore un-named volumes.

# On restore, we always restore the most recently backed up version.
# For more control over which version gets restored see
# volume_restore.sh

# Starts docker container.

# If you pass a second parameter we skip restarting and leave the container
# stopped.

# First parameter is docker container name or id.
CONTAINER=$1
# If anything is passed for the second parameter we don't restart the container.
SKIP_RESTART=$2
# Space delimited list of docker volumes or blank if this container has no
# volumes.
VOLUMES=$(docker inspect --format='{{range .Mounts}}{{.Name}} {{end}}' $CONTAINER)
HAS_NAMED_VOLUME=false

doit() {

  if [ -z "$VOLUMES" ]; then
    echo "no volumes to $1"
    exit 0
  fi

  for VOL in $VOLUMES; do
    LEN=${#VOL}
    if (( $LEN < 40 )); then
      # If it's a short name we assume it's a named volume. If it's 40
      # characters or longer we assume it's an anonymous volume that docker
      # named with a 64-character hex id.
      HAS_NAMED_VOLUME=true
    fi
  done

  if [ "$HAS_NAMED_VOLUME" = false ]; then
    echo "no named volumes to $1"
    exit 0
  fi

  # We will always want to stop the container.
  docker stop $CONTAINER

  for VOL in $VOLUMES; do
    LEN=${#VOL}
    if (( $LEN < 40 )); then
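      # volume_backup.sh / volume_restore.sh must be on PATH for this to work
      # (the cron entry above prepends /usr/local/bin for that reason).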
      CMD="volume_$1.sh $VOL"
      echo "$CMD"
      RESULT=$(eval $CMD)
      echo "$RESULT"
    else
      # We don't backup or restore unnamed volumes.
      echo "skipping un-named volume: $VOL"
    fi
  done

  # May not always want to restart afterwards.
  if [ -z "$SKIP_RESTART" ]; then
    echo "restarting $CONTAINER"
    docker start $CONTAINER
  else
    echo "skipping restart"
  fi
}
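
The under-40-characters test is a heuristic. Anonymous volumes actually get 64-character hex names, so a stricter check (just a sketch, not what the scripts here use) could match on that pattern instead:

# Sketch only: succeeds if a volume name looks auto-generated. Anonymous
# volumes get a 64-character hex string as their name; user-named volumes
# are whatever you called them.
is_anonymous_volume() {
  [[ "$1" =~ ^[0-9a-f]{64}$ ]]
}

# Usage inside the loop above would be:
#   if is_anonymous_volume "$VOL"; then echo "skipping un-named volume: $VOL"; fi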

volume_backup.sh

#!/usr/bin/env bash
set -e

# Backs up the given docker named volume to S3.
# See volume_br_inc.sh for complete documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/volume_br_inc.sh

# See if docker volume directory exists.
if [ -d "$VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data" ]; then
  # Do the actual backup. Look ma, streaming to S3!
  # We pipe data from tar directly to s3. No writing tar file to disk.
  # Comes in handy in a pinch when disk space is low.
  tar -cz -C $VOLUME_PREFIX/$DOCKER_VOLUME_NAME _data \
    | aws s3 cp - $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz --sse

  echo "Backup success"
  echo "Volume location: $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data"
  echo "S3 location:     $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz"
else
  echo "Can't backup a folder that ain't there: $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data"
fi
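
Example run, assuming the scripts are on PATH and you have permission to read /var/lib/docker/volumes (typically root):

sudo volume_backup.sh my_vol_name

# On success it prints (per the echoes above):
#   Backup success
#   Volume location: /var/lib/docker/volumes/my_vol_name/_data
#   S3 location:     s3://my-backups-bucket/docker-volume-backups/my_vol_name.tgz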

volume_restore.sh

#!/usr/bin/env bash
set -e

# Restores the given docker named volume from S3 backup.
# See volume_br_inc.sh for complete documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/volume_br_inc.sh

S3_VERSION=$2
mkdir -p $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data

if [ -z "$S3_VERSION" ]; then
  # No version passed
  # Move the current volume data to a safe place in case our restore
  # was a mistake. We can always move it back.
  TS=$(date '+%Y_%m_%d_%H-%M-%S')
  SAFE_KEEPING=/tmp/restored_volumes/$DOCKER_VOLUME_NAME
  mkdir -p $SAFE_KEEPING/$TS
  cd $VOLUME_PREFIX/$DOCKER_VOLUME_NAME
  if [ -d _data ]; then
    mv _data $SAFE_KEEPING/$TS/
  else
    echo "no _data to move to SAFE_KEEPING"
  fi

  # Do the actual restore.
  # We stream directly from S3. No intermediate tar file on disk.
  # Comes in handy in a pinch when disk space is low.
  aws s3 cp $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz - \
    | tar xvfz -

  echo ""
  echo "We we moved the existing volume data under /tmp for safe keeping."
  echo "You may want to clean up the safe keeping folder from time to time."
  echo "SAFE_KEEPING/TS: $SAFE_KEEPING/$TS"
  echo ""
else
  # Version passed
  aws s3api get-object \
    --bucket $BUCKET \
    --key $S3_PATH/$DOCKER_VOLUME_NAME.tgz \
    --version-id "$S3_VERSION" \
    "$DOCKER_VOLUME_NAME-$S3_VERSION.tgz"
  echo ""
  echo "We restored the specified version to current directory."
  echo "You can inspect it and if it looks good manually replace"
  echo "the docker volume with this backup version."
  echo ""
fi

echo "Restore success"
echo "Volume location: $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data"
echo "S3 location:     $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz"

volume_br_inc.sh

#!/usr/bin/env bash
set -e

# Meant to be included by volume_backup.sh and volume_restore.sh

# Backs up or restores the named docker volume to Amazon S3.

# Docker volumes are stored in a location like the following.

# /var/lib/docker/volumes/my_vol_name/_data

# We don't actually care if this directory is a true docker volume.
# As long as it exists we back it up.
# Even if it doesn't exist, we will restore to it.
# This allows you to copy data to the specified location on a server and
# back it up even if it's not a real docker volume. Then you can easily restore
# it to another server and create a real docker volume there and then run your
# container.

# We backup/restore to an S3 location like the following.

# s3://some_bucket/some_path/my_vol_name.tgz

# Restore operates in 2 modes.

# If you just pass the name of the docker volume we will do mode one.

# 1. move that volume to safe keeping under /tmp
# 2. restore the latest backup version of the volume

# If you pass as a second parameter the s3 object version number we will
# do mode two.

# 1. download that version to the current directory.

# You can inspect the restore to make sure it's what you are looking for,
# and then you can replace the bad docker volume with the backup manually.

# If you run this from the /var/lib/docker/volumes/my_vol_name folder
# all you will need to do to restore is

# 1. rm -rf _data
# 2. tar xvfz my_vol_name.tgz.

VOLUME_PREFIX='/var/lib/docker/volumes'
BUCKET='my-backups-bucket'
S3_PATH='docker-volume-backups'
S3_PREFIX="s3://$BUCKET/$S3_PATH"

if [[ $# -eq 0 ]]; then
  echo ""
  echo "  usage: backup.sh DOCKER_VOLUME_NAME"
  echo ""
  exit 1
fi

DOCKER_VOLUME_NAME=$1
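
Tying it together: per the note above about restoring to a location that is not yet a real docker volume, moving a volume to a new server could look roughly like this (the container name, image, and mount path are placeholders):

# On the new server: create the named volume so Docker owns the directory,
# then pull the latest backup into it.
docker volume create my_vol_name
sudo volume_restore.sh my_vol_name

# Start the container against the restored volume ("my_app", "my_app_image",
# and /data are placeholder names).
docker run -d --name my_app -v my_vol_name:/data my_app_image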