Docker named volume backup strategy

Published on:

A cron job backs up the named volumes of all running containers to S3.

It loops through all the running containers and does the following.

  1. Stops the container
  2. Loops over all the containers named volumes
  3. Backs up each volume to S3
  4. Starts the container

If you are running Docker containers on servers you control (as opposed to a managed service like Amazon's EC2 Container Service or Google Container Engine), this strategy may make sense.

If we were running in a managed environment, we would need to put all this logic inside a container itself, which we may yet do.

We use an S3 bucket that has versioning turned on, so it keeps every backup version for however many days you configure. You can restore the latest version or a specific older version.
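
Roughly how we set that up on the bucket (a sketch; the bucket name and prefix match the scripts below, and the 30 day window is just an example):

aws s3api put-bucket-versioning \
  --bucket my-backups-bucket \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backups-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-old-backup-versions",
      "Status": "Enabled",
      "Filter": {"Prefix": "docker-volume-backups/"},
      "NoncurrentVersionExpiration": {"NoncurrentDays": 30}
    }]
  }'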

cron

Run the backup once a day at 8 AM UTC, and send the output to syslog.

0 8 * * * PATH=/usr/local/bin:$PATH backup_all_running_containers.sh 2>&1 | /usr/bin/logger -t container_backup

backup_all_running_containers.sh

#!/usr/bin/env bash
set -e
# Backs up all named docker volumes for all running docker containers.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
RUNNING=$(docker ps -q)
for ID in $RUNNING; do
  echo "ID:   $ID"
  NAME=$(basename $(docker inspect --format='{{.Name}}' $ID))
  echo "NAME: $NAME"
  $DIR/container_backup.sh $NAME
done

container_backup.sh

#!/usr/bin/env bash
set -e
# Backs up all named volumes for docker container name or id.
# See container_br_inc.sh for full documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/container_br_inc.sh
doit backup

container_restore.sh

#!/usr/bin/env bash
set -e
# Restores all named volumes for docker container name or id.
# See container_br_inc.sh for full documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/container_br_inc.sh
doit restore

container_br_inc.sh

#!/usr/bin/env bash
set -e

# Meant to be included by container_backup.sh and container_restore.sh

# Stops docker container.

# Backs up or restores all named volumes for the given docker container
# name or id. We don't backup or restore un-named volumes.

# If restore, we always restore the most recently backed up version.
# For more control over what and how volumes are restored see
# container_restore.sh

# Starts docker container.

# If you pass a second parameter we skip restarting and leave the container
# stopped.

# First parameter is docker container name or id.
CONTAINER=$1
# If anything is passed for the second parameter we don't restart the container.
SKIP_RESTART=$2
# Space delimited list of docker volumes or blank if this container has no
# volumes.
VOLUMES=$(docker inspect --format='{{range .Mounts}}{{.Name}} {{end}}' $CONTAINER)
HAS_NAMED_VOLUME=false

doit() {

  if [ -z "$VOLUMES" ]; then
    echo "no volumes to $1"
    exit 0
  fi

  for VOL in $VOLUMES; do
    LEN=${#VOL}
    if (( $LEN < 40 )); then
      # If it's a short name we assume it's a named volume. If it's longer
      # than 40 we assume it's a volume that docker named with a UUID.
      HAS_NAMED_VOLUME=true
    fi
  done

  if [ "$HAS_NAMED_VOLUME" = false ]; then
    echo "no named volumes to $1"
    exit 0
  fi

  # We will always want to stop the container.
  docker stop $CONTAINER

  for VOL in $VOLUMES; do
    LEN=${#VOL}
    if (( $LEN < 40 )); then
      CMD="volume_$1.sh $VOL"
      echo "$CMD"
      RESULT=$(eval $CMD)
      echo "$RESULT"
    else
      # We don't backup or restore unnamed volumes.
      echo "skipping un-named volume: $VOL"
    fi
  done

  # May not always want to restart afterwards.
  if [ -z "$SKIP_RESTART" ]; then
    echo "restarting $CONTAINER"
    docker start $CONTAINER
  else
    echo "skipping restart"
  fi
}
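
Typical usage of the two wrapper scripts looks something like this (my_container is a made up name; any second argument leaves the container stopped):

# back up all named volumes for one container, then restart it
./container_backup.sh my_container

# back up and leave the container stopped
./container_backup.sh my_container skip_restart

# restore the latest backup of every named volume, then restart
./container_restore.sh my_container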

volume_backup.sh

#!/usr/bin/env bash
set -e

# Backs up the given docker named volume to S3.
# See volume_br_inc.sh for complete documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/volume_br_inc.sh

# See if docker volume directory exists.
if [ -d "$VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data" ]; then
  # Do the actual backup. Look ma, streaming to S3!
  # We pipe data from tar directly to s3. No writing tar file to disk.
  # Comes in handy in a pinch when disk space is low.
  tar -cz -C $VOLUME_PREFIX/$DOCKER_VOLUME_NAME _data \
    | aws s3 cp - $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz --sse

  echo "Backup success"
  echo "Volume location: $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data"
  echo "S3 location:     $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz"
else
  echo "Can't backup a folder that ain't there: $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data"
fi

volume_restore.sh

#!/usr/bin/env bash
set -e

# Restores the given docker named volume from S3 backup.
# See volume_br_inc.sh for complete documentation.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. $DIR/volume_br_inc.sh

S3_VERSION=$2
mkdir -p $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data

if [ -z "$S3_VERSION" ]; then
  # No version passed
  # Move the current volume data to a safe place in case our restore
  # was a mistake. We can always move it back.
  TS=$(date '+%Y_%m_%d_%H-%M-%S')
  SAFE_KEEPING=/tmp/restored_volumes/$DOCKER_VOLUME_NAME
  mkdir -p $SAFE_KEEPING/$TS
  cd $VOLUME_PREFIX/$DOCKER_VOLUME_NAME
  if [ -d _data ]; then
    mv _data $SAFE_KEEPING/$TS/
  else
    echo "no _data to move to SAFE_KEEPING"
  fi

  # Do the actual restore.
  # We stream directly from S3. No intermediate tar file on disk.
  # Comes in handy in a pinch when disk space is low.
  aws s3 cp $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz - \
    | tar xvfz -

  echo ""
  echo "We we moved the existing volume data under /tmp for safe keeping."
  echo "You may want to clean up the safe keeping folder from time to time."
  echo "SAFE_KEEPING/TS: $SAFE_KEEPING/$TS"
  echo ""
else
  # Version passed
  aws s3api get-object \
    --bucket $BUCKET \
    --key $S3_PATH/$DOCKER_VOLUME_NAME.tgz \
    --version-id "$S3_VERSION" \
    "$DOCKER_VOLUME_NAME-$S3_VERSION.tgz"
  echo ""
  echo "We restored the specified version to current directory."
  echo "You can inspect it and if it looks good manually replace"
  echo "the docker volume with this backup version."
  echo ""
fi

echo "Restore success"
echo "Volume location: $VOLUME_PREFIX/$DOCKER_VOLUME_NAME/_data"
echo "S3 location:     $S3_PREFIX/$DOCKER_VOLUME_NAME.tgz"

volume_br_inc.sh

#!/usr/bin/env bash
set -e

# Meant to be included by volume_backup.sh and volume_restore.sh

# Backs up or restores the named docker volume to Amazon S3.

# Docker volumes are stored in a location like the following.

# /var/lib/docker/volumes/my_vol_name/_data

# We don't actually care if this directory is a true docker volume.
# As long as it exists we back it up.
# Even if it doesn't exist, we will restore to it.
# This allows you to copy data to the specified location on a server and
# back it up even if it's not a real docker volume. Then you can easily restore
# it to another server and create a real docker volume there and then run your
# container.

# We backup/restore to an S3 location like the following.

# s3://some_bucket/some_path/my_vol_name.tgz

# Restore operates in 2 modes.

# If you just pass the name of the docker volume we will do mode one.

# 1. move that volume to safe keeping under /tmp
# 2. restore the latest backup version of the volume

# If you pass as a second parameter the s3 object version number we will
# do mode two.

# 1. download that version to the current directory.

# You can inspect the restore to make sure it's what you are looking for,
# and then you can replace the bad docker volume with the backup manually.

# If you run this from the /var/lib/docker/volumes/my_vol_name folder
# all you will need to do to restore is

# 1. rm -rf _data
# 2. tar xvfz my_vol_name.tgz

VOLUME_PREFIX='/var/lib/docker/volumes'
BUCKET='my-backups-bucket'
S3_PATH='docker-volume-backups'
S3_PREFIX="s3://$BUCKET/$S3_PATH"

if [[ $# -eq 0 ]]; then
  echo ""
  echo "  usage: backup.sh DOCKER_VOLUME_NAME"
  echo ""
  exit 1
fi

DOCKER_VOLUME_NAME=$1
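
Putting it together, restoring looks something like this (my_vol_name matches the example above; SOME_VERSION_ID is a placeholder you copy from the list-object-versions output):

# list the available backup versions of a volume (note the VersionId values)
aws s3api list-object-versions \
  --bucket my-backups-bucket \
  --prefix docker-volume-backups/my_vol_name.tgz

# mode one: restore the most recent version in place
./volume_restore.sh my_vol_name

# mode two: download a specific version to the current directory to inspect
./volume_restore.sh my_vol_name SOME_VERSION_ID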

Docker

Published on:

This is my attempt to understand the ever-changing world of docker.

Docker is an ever-moving target and there are lots of examples of outdated
ways of doing things. Hopefully this will stay short and up to date.

Storage and data volumes

Docker containers should be portable and immutable. This presents a challenge
for storage. A database needs to write its files somewhere. If it writes them
inside the container we break immutability. If it writes them outside the
container we break portability.

Performance is another consideration. The union file system is pretty slow.
Where performance counts, we need to bypass the union file system.

There are many ways to crack the docker storage nut. The best way depends on
our needs.

Mutable data in the container

If we don't care about performance and we don't care about immutability, the
simplest thing (maybe see if you agree after reading below) is to store mutable
data in the container itself.

Let's say file-upload is an app that lets you upload and download files.

Since we store the data in the container, moving the container to another server
looks like this.

  1. stop the container
  2. build an image from the stopped container
  3. push the image to a docker repo
  4. run the image on another server

or this

  1. stop the container
  2. copy the data out of the container
  3. move the data to the new server
  4. run the container on the new server
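
A sketch of the first (image based) option. The registry and image names are made up:

docker stop file-upload
# the committed image includes the data written inside the container
docker commit file-upload my-registry/file-upload-snapshot
docker push my-registry/file-upload-snapshot

# on the other server
docker pull my-registry/file-upload-snapshot
docker run -d --name file-upload my-registry/file-upload-snapshot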

However, upgrading the file-upload app to version 3.0.0 is not so simple.

  1. stop the container
  2. copy the files out of the container
  3. start version 3.0.0 of the container
  4. copy the files into the container
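
A sketch of that copy-out/copy-in dance. The /data path and the 3.0.0 tag are made up:

docker stop file-upload
docker cp file-upload:/data ./file-upload-data
docker rm file-upload
docker run -d --name file-upload file-upload:3.0.0
docker cp ./file-upload-data/. file-upload:/data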

Finally, performance won't be that great, but we don't care.

WARNING!!! One very large caveat with storing data directly in your
container: when you remove the container your data is lost forever. Stopping
and restarting the container is safe, but removing it will also delete your
data.

Data volumes

Data volumes are docker's way to bypass the union file system and store data
directly on the host file system. This is much faster than the union file
system, and it allows your containers to be immutable.

With this approach moving our container to another server is as follows.

  1. stop the container
  2. move the data to the other server
  3. run the container on the other server

Upgrading our file-upload app is as simple as:

  1. stop the old version
  2. start the new version

Docker will never delete data stored in a data volume. Even removing a container
won't delete its volume's data. This can cause disk space issues.
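
One way to keep an eye on that is to list and remove dangling volumes (docker 1.9 or later; double check the list before deleting anything):

# volumes no longer referenced by any container
docker volume ls -f dangling=true

# remove them
docker volume rm $(docker volume ls -q -f dangling=true)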

Creating data volumes

There are two ways to create a data volume.

  1. in Dockerfile with the VOLUME instruction
  2. docker run -v

in the Dockerfile

VOLUME ["/some/path/in/my/container/"]

If you were to run a container based on this dockerfile and then run
docker inspect, you would see something like the following. The mount Name will
be an auto-generated volume id (a long hash), mapped to some path on the host
system (Source) and to the path you specified in the container (Destination).

"Mounts": [
        {
            "Name": "d4bf12ed6684da1f20a9eefcdf2a4f11987bbd1aa0e36007ab4454449f81bf30",
            "Source": "/mnt/sda1/var/lib/docker/volumes/d4bf12ed6684da1f20a9eefcdf2a4f11987bbd1aa0e36007ab4454449f81bf30/_data",
            "Destination": "/some/path/in/my/container",
            "Driver": "local",
            "Mode": "",
            "RW": true
        }
    ],

docker run -v

docker run -v /some/path/in/my/container/ busybox

This does exactly the same thing as the first way, and you will see the same
output from docker inspect.

The great thing about this is performance. Now you are bypassing the union file
system and you will get fast access.

Another good thing is that you can stop your container and even delete it
without losing your data. You can upgrade to version 2.0.0 without having to
copy data out of your old container into the new (sort of).

But this isn't very convenient because when you start your new container it will
have a new id and it will create a new path on the host based on that new id. In
order to use the data from the previous container you will need to copy it from
the old path.

Mapping data volumes to host paths

So a better way than the above is to map the container path to the host path
in a consistent way.

You have two choices for mapping a container path to the host system.

  1. you can specify the host file system path
  2. you can create a named volume and let docker determine the host file system path

un-named or user mapped volumes

If you want a volume to point to a specific path on the host file system
you can do the following where ~/ refers to the host file system and
/tmp refers to the container file system.

docker run -it -v ~/:/tmp busybox

You must use absolute paths for the container path.

You can use absolute or relative paths for the host file path. However,
relative paths must begin with ./ or ~/. If you just specify somepath it
creates a named volume which is very different and is explained below.

Inspecting the container will display the following.

"Mounts": [
        {
            "Source": "/home/docker",
            "Destination": "/tmp",
            "Mode": "",
            "RW": true
        }
    ],

Ahhhhhh, now we can upgrade to version 2.0.0 just by pointing the new
container to the same path. Look ma, no copying data!

named volumes

There are at least two ways you can create a named volume. The first is explicit:

docker volume create --name=myvol

The second way both creates myvol if it doesn't exist and maps it to the
container path:

docker run -v myvol:/tmp busybox

Either of the above commands will create a volume named myvol.

# docker volume inspect myvol
[
    {
        "Name": "myvol",
        "Driver": "local",
        "Mountpoint": "/mnt/sda1/var/lib/docker/volumes/myvol/_data"
    }
]

This has the exact same result as you specifying the host path, but you let
docker determine the host path.

NOTE: While it may seem more convenient to specify the host path yourself --
maybe because you like having it under your home directory -- for
production applications it's generally more work. If you are deploying to
hundreds of servers it's easier to let docker create the host path so you
don't have to.

This will delete the volume.

docker volume rm myvol

Error response from daemon: Conflict: volume is in use

You will likely see the error because docker won't let you delete a volume
if there are any containers (running or stopped) that refer to it.
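
A sketch of digging out of that state (the volume filter on docker ps needs a reasonably recent docker; the container id is a placeholder):

# find containers (running or stopped) that still reference the volume
docker ps -a --filter volume=myvol

# remove them, then the volume can be removed
docker rm CONTAINER_ID
docker volume rm myvol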

Data volume containers

Docker seems to push the idea of data volume containers. I don't know why. There
seems to be no advantage over named data volumes. I think it must be outdated
advice from before there were named data volumes.

The current docker documentation says:

If you have some persistent data that you want to share between containers, or
want to use from non-persistent containers, it’s best to create a named Data
Volume Container, and then to mount the data from it.

So the idea here is to create a container that doesn't do anything but specify
one or more (the examples use unnamed) volumes. Other containers can use
docker run --volumes-from to use the volume in the data volume container.

docker create -v /my/path --name file-upload-data file-upload /bin/true
docker run -d --name file-upload-app1 --volumes-from file-upload-data file-upload
docker run -d --name file-upload-app2 --volumes-from file-upload-data file-upload

The above creates a container file-upload-data
that just sits there and isn't even running. Then it runs two containers,
file-upload-app1 and file-upload-app2, that store any data they write
to /my/path in the volume owned by file-upload-data.

How is this better than having the two containers just use the same named
data volume?
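
For comparison, here is the same setup with a plain named volume (same names as the example above), which is why I don't see the advantage:

docker volume create --name=file-upload-data
docker run -d --name file-upload-app1 -v file-upload-data:/my/path file-upload
docker run -d --name file-upload-app2 -v file-upload-data:/my/path file-upload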

Jenkins trigger build on tag creation

Published on:

We want Jenkins to trigger builds from github any time a new tag is created.

In the Jenkins job we need to know:

  • the tag name
  • the branch name

This works.

export HASH=$(git rev-parse HEAD)
export BRANCH=$(basename $(git branch -r --contains ${HASH}))
export TAG=$(basename $(git describe --all --exact-match ${HASH}))

echo "HASH: $HASH"
echo "BRANCH: $BRANCH"
echo "TAG: $TAG"

However, getting that to work is very fiddly.

Make sure you have the following values in Source Code Management - git - repositories

  • refspec: +refs/tags/*:refs/remotes/origin/tags/*
  • branch specifier: **
  • additional behaviors: wipe out repository and force clone

Under build triggers check Build when a change is pushed to GitHub

Node copy file to AWS S3

Published on:

Uses the nodejs AWS SDK to upload a file to S3.

Shows that if your EC2 instance has a role that allows write to S3 it will work even without ~/.aws/credentials and without passing credentials to new AWS.S3().

#!/usr/bin/env node

var AWS = require('aws-sdk');
var s3 = new AWS.S3();

function _uploadToS3(bucket, key, body, cb) {
  var params = {
    Bucket: bucket,
    Key: key,
    Body: body,
    ACL: 'public-read'
  };
  s3.putObject(params, function(err, data) {
    if (err) {
      console.log(err);
      return cb(err);
    }
    var url = 'https://s3.amazonaws.com/' + bucket + '/' + key;
    cb(null, url);
  });
}


_uploadToS3('dev-quote-tool', 'jds-junk', 'some body', function(err, info) {
  console.log(err);
  console.log(info);
});

Cut new release

Published on:

Python script to check stuff out of git and cut a new release.

#!/usr/bin/env python3


# PRXY=$(grep 'Acquire::http::Proxy' /etc/apt/apt.conf.d/05proxy | cut -d ' ' -f 2 | cut -d ';' -f 1 | cut -d '"' -f 2)

# export http_proxy="${PRXY}"

# export https_proxy="${PRXY}"


import argparse
import errno
import fileinput
import json
import os
import shutil
import socket
import subprocess
import sys
import urllib.request
import traceback
import tempfile

class MyAP(argparse.ArgumentParser):
  """
  This just makes the parser print out the full help message on error
  instead of just telling you what you did wrong.
  """
  def error(self, message):
    sys.stderr.write('error: %s\n' % message)
    self.print_help()
    sys.exit(2)

def parse_args():
  parser = MyAP(description=
    """
    Cuts a new HIP release (includes EPC AKA Cortex AKA Receptor)
    """)
  parser.add_argument(
    '-r',
    '--release-name',
    required=True,
    type=str,
    help='The name of the release. For example: rc_1.13')
  args = parser.parse_args(sys.argv[1:])
  # try:
  #   socket.inet_aton(args.new_ip)
  # except socket.error:
  #   parser.error('invalid ip address {ip}\n'.format(ip=args.new_ip))
  return args

# Reads the specified json file into a dict.
def read_json(file):
  return json.load(open(file, 'r'))

def get_package_json(release_name):
  pjname = './packages/' + release_name + '.json'
  try:
    pj = read_json(pjname)
    print(json.dumps(pj, indent=3, sort_keys=True))
    return pj
  except FileNotFoundError as e:
    print('package json file not found: ' + pjname)
    sys.exit(5)
  except:
    traceback.print_exc()
    sys.exit(5)

def set_git_base_dir():
  gitbase = tempfile.gettempdir() + '/cut-release'
  print('gitbase: ' + gitbase)
  try:
    shutil.rmtree(gitbase)
  except:
    pass
  mkdir_p(gitbase)
  os.chdir(gitbase)
  return gitbase

def mkdir_p(path):
  try:
    os.makedirs(path)
  except OSError as exc:  # Python > 2.5
    if exc.errno == errno.EEXIST and os.path.isdir(path):
      pass
    else:
      raise

def git(*args):
  return subprocess.check_call(['git'] + list(args))

def fix_package_json(app_name, release_name):
  if app_name in ['lens','nerve','report_service','pupil','retina']:
    print('fixing package.json: ' + app_name)
    with fileinput.FileInput('./package.json', inplace=True) as file:
      for line in file:
        print(line.replace('#develop', '#' + release_name), end='')

# "hip-cortex-api": "git+https://github.mandiant.com/HIP/hip-cortex-api#develop",

# "hip-db": "git+https://github.mandiant.com/HIP/hip-db#develop",

# "hip-kafka-node": "git+https://github.mandiant.com/HIP/hip-kafka-node#develop",

# "hip-utils": "git+https://github.mandiant.com/HIP/hip-utils#develop",


def is_branch_created(git_url, release_name):
  try:
    out = subprocess.check_output(
      ['git', 'ls-remote', '--heads', git_url, release_name],
      stderr=subprocess.STDOUT).decode('utf-8')
    return release_name in out
  except:
    traceback.print_exc()
    sys.exit(5)

def create_or_checkout_branch(git_url, release_name):
  if not is_branch_created(git_url, release_name):
    print('creating branch: ' + release_name)
    git('checkout', '-b', release_name)
  else:
    git('checkout', release_name)

def commit_and_push(release_name):
  try:
    git('commit', '-a', '-m', 'cut ' + release_name)
  except:
    pass
  git('push', 'origin', release_name)

def do_git_stuff(app_name, release_name):
  print('app_name: ' + app_name)
  git_url = 'https://github.mandiant.com/HIP/{app_name}.git'.format(app_name=app_name)
  if (app_name == 'epc_server'):
    app_name = 'receptor'
    git_url = 'https://github.mandiant.com/EndpointConnector/{app_name}.git'.format(app_name=app_name)
  print('git_url: ' + git_url)
  print('os.getcwd: ' + os.getcwd())
  git('clone', git_url)

  os.chdir(app_name)
  print('os.getcwd: ' + os.getcwd())

  create_or_checkout_branch(git_url, release_name)
  fix_package_json(app_name, release_name)
  commit_and_push(release_name)

  os.chdir('..')

if __name__ == "__main__":
  args = parse_args()
  pkg_json = get_package_json(args.release_name)
  set_git_base_dir()
  do_git_stuff('hip-cortex-api', args.release_name)
  do_git_stuff('hip-kafka-node', args.release_name)
  do_git_stuff('hip-utils', args.release_name)
  for svc in pkg_json:
    do_git_stuff(svc['app_name'], args.release_name)

PostgreSQL table disk space

Published on:

Here are a couple of handy queries to figure out the amount of space a table is using and how much of it is wasted space.

select version();
                                                    version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit

See how much space your tables are taking.

select 
  relname, 
  pg_size_pretty(pg_total_relation_size(oid)) 
from 
  pg_class 
where 
  relkind = 'r' 
order by pg_total_relation_size(oid) desc;

                 relname                 | pg_size_pretty
-----------------------------------------+----------------
 file_instance                           | 1729 GB
 raw_message_cortex                      | 399 GB
 as_timeline_1                           | 80 GB

See roughly how much is wasted space by looking at tuples per page. A very low ratio can indicate bloat, though tables with wide rows will also have a low ratio.

select 
  relname, 
  reltuples, 
  relpages, 
  reltuples / relpages as "T / P"
from 
  pg_class 
where 
  relkind = 'r' 
  and relpages > 0 
order by reltuples / relpages;

                relname                |  reltuples  | relpages |       T / P
---------------------------------------+-------------+----------+--------------------
 session_collection_source             |          12 |      207 | 0.0579710144927536
 sessions                              |           2 |        2 |                  1
 pg_ts_parser                          |           1 |        1 |                  1
 pg_db_role_setting                    |           2 |        1 |                  2
 pg_tablespace                         |           2 |        1 |                  2
 pg_extension                          |           2 |        1 |                  2
 pg_auth_members                       |           2 |        1 |                  2
 family_exclusion                      |           2 |        1 |                  2
 q_file_view                           |      148681 |    69478 |   2.13997236535306
 file_instance                         |  9.4325e+06 |  4339863 |   2.17345570585984 

Ansible clear out cron for current user

Published on:

This clears out all crontab entries for the current user. Ansible 2 has a better way.

nc.yml

- name: remove crontab
  hosts:
    - "tag_use_service"
    - ";&tag_cust_id_{{ id }}"
  tasks:
    - name: remove crontab
      shell: crontab -r

nc.sh

for i in $(seq -f "%02g" 1 24); do
  ansible-playbook nc.yml -e id=hipprd${i}
done

Ansible word wrap local_action

Published on:

Here's how to word wrap local_action in Ansible.

-
  name: test
  hosts: all

  tasks:
    - name: test word wrap local_action
      local_action: >
        get_url url="https://teamcity.jetbrains.com/update/buildAgent.zip?guest=1"
        dest="~/buildAgent.zip"

ansible install docker on ubuntu trusty 14.04

Published on:

ansible playbook to deploy docker on ubuntu trusty 14.04

# ansible playbook to deploy docker on ubuntu trusty 14.04 
---
-
  name: patch tc agent boxes
  sudo: true
  hosts:
    - 172.16.1.34
    - 172.16.1.197
  tasks:
    - name: add docker apt key
      apt_key:
        keyserver: hkp://p80.pool.sks-keyservers.net:80
        id: 58118E89F3A912897C070ADBF76221572C52609D

    - name: add docker apt repository
      apt_repository:
        repo: deb https://apt.dockerproject.org/repo ubuntu-trusty main
        update_cache: yes

    # For Ubuntu Trusty, Vivid, and Wily, it’s recommended to install the
    # linux-image-extra kernel package. The linux-image-extra package allows
    # you to use the aufs storage driver.
    # https://docs.docker.com/engine/userguide/storagedriver/aufs-driver/

    # had to run manually
    # - name: allow use of aufs storage driver
    #   shell: apt-get install linux-image-extra-$(uname -r)

    - name: uninstall old docker
      apt:
        name: lxc-docker
        purge: yes
      failed_when: no

    - name: install docker
      apt:
        name: docker-engine

    - name: add ubuntu to docker group
      user:
        name: ubuntu
        groups: docker
        append: yes

    - name: add teamcity to docker group
      user:
        name: teamcity
        groups: docker
        append: yes

    - name: get docker compose
      get_url:
        url: https://github.com/docker/compose/releases/download/1.5.2/docker-compose-Linux-x86_64
        dest: /usr/local/bin/docker-compose
        mode: 0755

Bash loop over left padded numbers

Published on:

So if you have something that needs to loop over a list of left-padded numbers like this:

01
02
03
04

You can do this.

for i in $(seq -f "%02g" 1 4); do
  echo ${i}
done
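
If you are on bash 4 or later, brace expansion can do the zero padding itself:

for i in {01..04}; do
  echo ${i}
done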

Manage aws security group rules

Published on:

Python script to allow you to easily add and remove AWS security group rules.

#!/usr/bin/env python3

# http://boto3.readthedocs.org/en/latest/reference/services/ec2.html#EC2.Vpc.security_groups

import boto3
from botocore.client import ClientError
import sys
import argparse
import re

class MyAP(argparse.ArgumentParser):
  """
  This just makes the parser print out the full help message on error
  instead of just telling you what you did wrong.
  """
  def error(self, message):
    sys.stderr.write('error: %s\n' % message)
    self.print_help()
    sys.exit(2)

def build_parser():
  parser = MyAP(description=
    """
    Allow or revoke inbound access for AWS security groups.
    Groups are specified by REGEX_PATTERN.
    TO_PORT defaults to FROM_PORT.
    If FROM_PORT contains multiple ports, TO_PORT is ignored.
    CIDR_RANGES must be CIDR even for singles.
    AWS_PROFILE=hip-prod AWS_DEFAULT_REGION=us-east-1 ./add-sg-rule.py (options)
    """)
  parser.add_argument(
    '-r',
    '--regex-pattern',
    required=True,
    type=str,
    help='regex pattern to match against security group GroupName')
  parser.add_argument(
    '-p',
    '--protocol',
    default='tcp',
    type=str,
    help='either tcp or udp')
  parser.add_argument(
    '-f',
    '--from-port',
    help='starting port number. or space delimited list of port numbers.')
  parser.add_argument(
    '-t',
    '--to-port',
    help='ending port number (defaults to FROM_PORT)')
  parser.add_argument(
    '-c',
    '--cidr-ranges',
    type=str,
    help='space delimited list of source CIDR ranges')
  parser.add_argument(
    '-d',
    '--dry-run',
    action='store_true',
    help='if specified we do nothing except see if change would have worked')
  parser.add_argument(
    '--revoke',
    action='store_true',
    help='if specified we revoke access instead of grant access')
  parser.add_argument(
    '--list-matching-groups',
    action='store_true',
    help='if specified we just list what security groups matched the REGEX_PATTERN')
  return parser

def get_sgs_for_name_pattern(pattern):
  sgs = []
  p = re.compile(pattern)
  for g in ec2.describe_security_groups()['SecurityGroups']:
    if p.match(g['GroupName']):
      sgs.append(resource.SecurityGroup(g['GroupId']))
  return sgs

def authorize_ingress(args):
  sgs = get_sgs_for_name_pattern(args.regex_pattern)
  for sg in sgs:
    if args.list_matching_groups:
      print(sg.group_name)
    else:
      # We loop here rather than specifying multiple IpRanges.
      # Less efficient, but duplicate-rule or rule-doesn't-exist
      # errors won't keep us from trying the next one in the list.
      # Much more useful for the user.
      for r in args.cidr_ranges.split():
        perms = {
          'IpProtocol': args.protocol,
          'FromPort': int(args.from_port),
          'ToPort': int(args.to_port),
          'IpRanges': [{'CidrIp': r}]
        }
        print(str(perms))
        try:
          if args.revoke:
            sg.revoke_ingress(
              DryRun=args.dry_run,
              IpPermissions=[perms]
            )
          else:
            sg.authorize_ingress(
              DryRun=args.dry_run,
              IpPermissions=[perms]
            )
        except ClientError as e:
          print((sg.group_name + ': ').ljust(20) + str(e))

ec2 = boto3.client('ec2')
resource = boto3.resource('ec2')

if __name__ == "__main__":
    parser = build_parser()
    args = parser.parse_args(sys.argv[1:])
    if not args.list_matching_groups:
      if not args.from_port:
        parser.error('FROM_PORT required unless LIST_MATCHING_GROUPS specified\n')
      if not args.cidr_ranges:
        parser.error('CIDR_RANGES required unless LIST_MATCHING_GROUPS specified\n')
    args.from_port = '0' if not args.from_port else args.from_port
    args.to_port = args.from_port if not args.to_port else args.to_port
    ports = args.from_port.split()
    if len(ports) > 1:
      for port in ports:
        args.from_port = port
        args.to_port = port
        authorize_ingress(args)
    else:
      authorize_ingress(args)
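
Some example invocations (the group pattern and CIDR ranges are made up):

# allow 443 from two ranges on every group whose name starts with web-
AWS_PROFILE=hip-prod AWS_DEFAULT_REGION=us-east-1 \
  ./add-sg-rule.py -r '^web-' -f 443 -c '203.0.113.0/24 198.51.100.0/24'

# revoke the same rule
AWS_PROFILE=hip-prod AWS_DEFAULT_REGION=us-east-1 \
  ./add-sg-rule.py -r '^web-' -f 443 -c '203.0.113.0/24 198.51.100.0/24' --revoke

# just list which groups the pattern matches
AWS_PROFILE=hip-prod AWS_DEFAULT_REGION=us-east-1 \
  ./add-sg-rule.py -r '^web-' --list-matching-groups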

Ansible lack of variable types lame

Published on:

I consider this an egregious oversight (or neglect) on Ansible's part. The fact that you can't create a boolean fact makes playbooks horribly verbose and hard to read.

The following creates a string "True". I have not found a way around this.

- set_fact:
    force_deploy: "{{ FORCE_DEPLOY is defined }}"

Then to use it you have to cast it.

  when: force_deploy|bool

If you have a mildly complex clause it gets really ugly.

  when: restart|bool and (not_deployed|bool or force_deploy|bool)

This would be so much nicer.

  when: restart and (not_deployed or force_deploy)

Why doesn't ansible just add something like this.

- set_fact:
    force_deploy: "{{ FORCE_DEPLOY is defined }}"
        type: bool

What's worse is that most Ansible playbooks just sprinkle this all over the place.

when: restart is defined and (not_deployed is defined or force_deploy is defined)

Song List

Published on:
key ccli name
A 3915912 Beautiful One
B/C 5895580 Joyful (The One Who Saves)
D/Eb 1546892 Oh Lead Me
G/A 4556538 Everlasting God
G 2060208 Give Us Clean Hands
C/D 2240585 Let It Rise
E/F 31779 Leaning On The Everlasting Arms

(order and keys not determined)

cron send email via gmail

Published on:

Works with gmail as well as yahoo. Pretty much the same settings.

http://askubuntu.com/questions/536766/how-to-make-crontab-email-me-with-output

In the end I used sSMTP. It's far, far simpler than either Postfix or sendmail and does the job beautifully.

For future reference, here's how to use sSMTP with Yahoo Mail (don't worry, it's far less complex than it looks):

Use Synaptic to download ssmtp. Alternatively you could run sudo apt-get install ssmtp.

Open the config file at /etc/ssmtp/ssmtp.conf.

Make the config look like this:

root=[yourRealEmail@yahoo.com.au]
mailhub=smtp.mail.yahoo.com:587
FromLineOverride=YES
UseSTARTTLS=YES
AuthUser=[yourRealEmail@yahoo.com.au]
AuthPass=[yourRealYahooPassword]
TLS_CA_File=~/cert.pem

Create the cert.pem file with OpenSSL. I used the command openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 9999 -nodes (more info here). You can stick the file anywhere, but I just chucked it in ~/. Wherever you put it, make sure you point the 'TLS_CA_File=' line in ssmtp.conf to the correct location.

Open the file /etc/ssmtp/revaliases and add the line [yourPCUsername]:[yourRealEmail@yahoo.com.au]:smtp.mail.yahoo.com:587. If you're running as root, I would think you need to add another line replacing you name with 'root'.

That's it, you're good to go! To test, the easiest way (IMO) is to create a file with the following in it:

To: [yourRealEmail@yahoo.com.au]
From: "whateverYaWant" <[yourRealEmail@yahoo.com.au]>
Subject: Some Notifying Email
MIME-Version: 1.0
Content-Type: text/plain

Body of your email goes here! Hello world!

Save and close the file, then (to check you don't have the real sendmail installed, run sendmail -V - it should say 'sSMTP') run cat fileWithEmailInIt.txt | sendmail -i -t. Then wait a few seconds (10-30) and check your email!

Obviously, replace [yourRealEmail@yahoo.com.au] with your email (without the brackets) and [yourRealYahooPassword] with your Yahoo Mail password (again, without the brackets).
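
For gmail the ssmtp.conf is nearly identical. A sketch (you will likely need an app password rather than your normal password):

root=[yourRealEmail@gmail.com]
mailhub=smtp.gmail.com:587
FromLineOverride=YES
UseSTARTTLS=YES
AuthUser=[yourRealEmail@gmail.com]
AuthPass=[yourGmailAppPassword]
TLS_CA_File=~/cert.pem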

Ansible PostgreSQL 9.4 Client

Published on:

main.yml

---
- name: add postgresql repository
  apt_repository:
    repo: "deb http://apt.postgresql.org/pub/repos/apt/ {{ ansible_lsb.codename }}-pgdg main 9.4"
- name: import postgres repository signing key
  register: repo_key
  apt_key:
    url: http://apt.postgresql.org/pub/repos/apt/ACCC4CF8.asc
- name: update apt cache
  when: repo_key.changed
  apt:
    update_cache: yes
- name: install postgres client
  apt:
    name: postgresql-client-9.4
- name: template .pgpass file
  sudo_user: ubuntu
  template:
    src: .pgpass
    dest: /home/ubuntu/.pgpass
    mode: 0600

.pgpass

{{ id }}.cliprqr4ifrx.us-east-1.rds.amazonaws.com:5432:postgres:postgres:{{ db.admin_pass }}
{{ id }}.cliprqr4ifrx.us-east-1.rds.amazonaws.com:5432:hipdb:hipuser:{{ db.pass }}
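
With the .pgpass file in place, psql connects without prompting for a password (hipprd01 is just an example id value):

psql -h hipprd01.cliprqr4ifrx.us-east-1.rds.amazonaws.com -U hipuser hipdb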

upstart log to syslog

Published on:

Haven't tried this yet, but . . .

script
  mkfifo /tmp/myservice-log-fifo
  ( logger -t myservice </tmp/myservice-log-fifo & )
  exec >/tmp/myservice-log-fifo
  rm /tmp/myservice-log-fifo
  exec myservice 2>/dev/null
end script

http://serverfault.com/a/316649/302690

JavaScript Mostly Unique Id

Published on:

This will generate a URI-safe, mostly unique id.

function makeid() {
  var id = "";
  // uri_unreserved per http://www.ietf.org/rfc/rfc3986.txt
  var uri_unreserved = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~";

  for (var i = 0; i < 5; i++)
    id += uri_unreserved.charAt(Math.floor(Math.random() * uri_unreserved.length));

  return id;
}

console.log(makeid());

http://stackoverflow.com/a/1349426

Docker Quick Reference

Published on:

Hello World

docker run hello-world

Common Commands

docker info                   # about docker and host
docker run -it ubuntu bash    # run bash shell
docker start aa3              # restart a stopped container
docker attach aa3             # get back into your bash shell

# run in background mode
docker run -d ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done"
docker logs --tail 20 -f aa3  # tail -f the logs

# log to syslog
docker run -d \
  --log-driver="syslog" \
  ubuntu \
  /bin/sh -c "while true; do echo hello world; sleep 1; done"

sudo tail -f /var/log/syslog  # to see your logs

docker top d0a                # like ps aux
docker stats die93 29dk3      # like top

docker stop $(docker ps -q)   # stop all
docker rm $(docker ps -a -q)  # remove all

Build and install docker from source

Followed this more or less. Some names were outdated.
http://tristan.lt/blog/docker-4-build-docker-from-sources/

Errata

v1.8.0-rc1

https://github.com/docker/docker/blob/6a274e48dc645f0ea02ae8bf59ce08ff22cfd663/daemon/logger/syslog/syslog.go#L62

syslog-address
syslog-facility
syslog-tag
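
I haven't verified every version, but with those options the earlier syslog example would look something like this (the address and tag are made up):

docker run -d \
  --log-driver=syslog \
  --log-opt syslog-address=udp://192.168.0.42:514 \
  --log-opt syslog-facility=daemon \
  --log-opt syslog-tag=hello-world \
  ubuntu \
  /bin/sh -c "while true; do echo hello world; sleep 1; done"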