This is my attempt to understand the ever-changing world of docker.
Docker is a moving target, and there are lots of examples of outdated
ways of doing things. Hopefully this will stay short and up to date.
Storage and data volumes
Docker containers should be portable and immutable. This presents a challenge
for storage. A database needs to write its files somewhere. If it writes them
inside the container we break immutability. If it writes them outside the
container we break portability.
Performance is another consideration. The union file system is pretty slow.
Where performance counts, we need to bypass the union file system.
There are many ways to crack the docker storage nut. The best way depends on
your needs.
Mutable data in the container
If we don't care about performance and we don't care about immutability, the
simplest thing (see if you still agree after reading below) is to store mutable
data in the container itself.
file-upload is an app that lets you upload and download files.
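A minimal sketch of running the app with its data kept inside the container. The image name `file-upload`, its tags, and the container name are assumptions for illustration; the real ones aren't shown here.

```shell
# Hypothetical helper: start the app with no volumes, so everything it
# writes lands in the container's own writable layer.
# $1 = image tag (assumed)
run_file_upload() {
  docker run -d --name file-upload "file-upload:$1"
}
# Usage: run_file_upload 1.0.0
```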
Since we store the data in the container, moving the container to another server
looks like this.
- stop the container
- build an image from the stopped container
- push the image to a docker repo
- run the image on another server
Or:
- stop the container
- copy the data out of the container
- move the data to the new server
- run the container on the new server
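Roughly, the steps on the old server could look like the sketch below. The container name, image tag, and data path are whatever you pass in; none of them come from the real app.

```shell
# Option 1: snapshot the stopped container (data included) as an image.
# $1 = container name, $2 = image tag to push
snapshot_container() {
  docker stop "$1" &&
  docker commit "$1" "$2" &&
  docker push "$2"
}
# On the other server you would then: docker run -d <the tag you pushed>

# Option 2: copy the data out and carry it over by hand.
# $1 = container name, $2 = data path in the container, $3 = local directory
export_container_data() {
  docker stop "$1" &&
  docker cp "$1:$2/." "$3"
}
```

`docker commit` and `docker cp` both work on a stopped container, which is why the stop comes first.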
However, upgrading the file-upload app to version 3.0.0 is not so simple.
- stop the container
- copy the files out of the container
- start version 3.0.0 of the container
- copy the files into the container
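As a sketch, the upgrade dance above might be scripted like this. All names and paths are placeholders you supply; I'm assuming the app keeps its files under a single directory.

```shell
# $1 = old container, $2 = new container name, $3 = new image,
# $4 = data path inside the container, $5 = local backup directory
upgrade_with_copy() {
  docker stop "$1" &&
  docker cp "$1:$4/." "$5" &&
  docker run -d --name "$2" "$3" &&
  docker cp "$5/." "$2:$4"
}
# Usage (hypothetical names):
#   upgrade_with_copy file-upload file-upload-3 file-upload:3.0.0 /data ./backup
```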
Finally, performance won't be that great, but we don't care.
WARNING!!! One very large caveat with storing data directly in your
container is that when you remove the container your data is lost forever.
Stopping and restarting the container is safe, but removing it will also delete
your data.
Data volumes are docker's way to bypass the union file system and store data
directly on the host file system. This is much faster than the union file
system, and it allows your containers to be immutable.
With this approach moving our container to another server is as follows.
- stop the container
- move the data to the other server
- run the container on the other server
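A sketch of the move, assuming the volume's data sits in a known directory on the host and that `rsync` over ssh is available between the two servers. The paths and host name are placeholders.

```shell
# $1 = container name, $2 = host data directory, $3 = user@host of the other server
move_volume_data() {
  docker stop "$1" &&
  rsync -a "$2/" "$3:$2/"
}
# Then, on the other server, run the same image with the same directory
# mounted as its volume.
```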
Upgrading the file-upload app is as simple as:
- stop the old version
- start the new version
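In command form this might look like the sketch below, assuming both versions mount the same volume (a host path or a named volume, both covered further down). Names are placeholders.

```shell
# $1 = old container, $2 = new container name, $3 = new image,
# $4 = volume spec passed to -v, e.g. myvol:/data
upgrade_with_volume() {
  docker stop "$1" &&
  docker run -d --name "$2" -v "$4" "$3"
}
```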
Docker will not delete data stored in a data volume unless you explicitly ask
it to. Even removing a container won't delete its volume's data. This can cause
disk space issues.
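To keep an eye on leftover volumes, something like this sketch can help. The helper names are mine; the docker subcommands and flags are standard.

```shell
# List the ids of volumes that no container refers to any more.
list_dangling_volumes() {
  docker volume ls -q -f dangling=true
}

# Remove a container together with its anonymous volumes.
# $1 = container name or id
remove_container_and_volumes() {
  docker rm -v "$1"
}
```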
Creating data volumes
There are two ways to create a data volume.
- in Dockerfile with the VOLUME instruction
- docker run -v
in the Dockerfile
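A minimal sketch of such a Dockerfile, written out from the shell. The base image and the `/data` path are assumptions, not the ones the file-upload app actually uses.

```shell
# Write a tiny Dockerfile that declares /data as a volume.
# $1 = path of the Dockerfile to write
write_volume_dockerfile() {
  cat > "$1" <<'EOF'
FROM alpine
VOLUME /data
EOF
}
# Usage: write_volume_dockerfile Dockerfile && docker build -t volume-demo .
```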
If you were to run a container based on this Dockerfile and then run
docker inspect, you would see something like the following. The mount
Name will be
a long generated id (not the container's id), and it will be mapped to some path
on the host system, Source, and to the path you specified in the container,
Destination.
docker run -v
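As a sketch: passing `-v` with only a container path creates an unnamed (anonymous) volume at that path. The image name below is a placeholder.

```shell
# $1 = image, $2 = absolute path inside the container
run_with_anonymous_volume() {
  docker run -d -v "$2" "$1"
}
# Usage (hypothetical image): run_with_anonymous_volume file-upload:1.0.0 /tmp
```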
This does exactly the same thing as the first way, and you will see the same
kind of output from docker inspect.
The great thing about this is performance. Now you are bypassing the union file
system and you will get fast access.
Another good thing is that you can stop your container and even delete it
without losing your data. You can upgrade to version 2.0.0 without having to
copy data out of your old container into the new (sort of).
But this isn't very convenient, because when you start your new container docker
will create a new volume with a new generated id, and with it a new path on the
host. In order to use the data from the previous container you will need to copy
it from the old path.
Mapping data volumes to host paths
So a better way than the above is to map the container path to the host path
in a consistent way.
You have two choices for mapping a container path to the host system.
- you can specify the host file system path
- you can create a named volume and let docker determine the host file system path
un-named or user mapped volumes
If you want a volume to point to a specific path on the host file system
you can do the following where
~/ refers to the host file system and
/tmp refers to the container file system.
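A sketch of that mapping, with the host path on the left of the colon and the container path on the right. The image name is a placeholder.

```shell
# $1 = host path, $2 = absolute container path, $3 = image
run_with_host_path() {
  docker run -d -v "$1:$2" "$3"
}
# Usage (the shell expands ~/ before docker sees it):
#   run_with_host_path ~/ /tmp file-upload:1.0.0
```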
You must use absolute paths for the container path.
You can use absolute or relative paths for the host file path. However,
relative paths must begin with ./ or ~/. If you just specify a bare name, docker
creates a named volume, which is very different and is explained below.
Inspecting the container will display the following.
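If you only want the mount details rather than the full inspect output, a format template can narrow it down. This is a small sketch; the helper name is mine.

```shell
# Print just the Mounts section of docker inspect as JSON.
# $1 = container name or id
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}
```

Each entry should show the host path as Source and the container path as Destination.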
Ahhhhhh, now we can upgrade to version 2.0.0 just by pointing the new
container to the same path. Look ma, no copying data!
There are at least two ways you can create a named volume.
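A sketch of both ways, wrapped in small helpers. The image name and container path in the usage line are assumptions.

```shell
# First way: create the volume explicitly.
# $1 = volume name
create_named_volume() {
  docker volume create "$1"
}

# Second way: name the volume in -v; docker creates it on first use.
# $1 = volume name, $2 = absolute container path, $3 = image
run_with_named_volume() {
  docker run -d -v "$1:$2" "$3"
}
# Usage: create_named_volume myvol
#        run_with_named_volume myvol /tmp file-upload:1.0.0
```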
This second way both creates myvol if it doesn't exist and maps it to a path in
the container.
Either of the above commands will create a volume named myvol.
This has the exact same result as you specifying the host path, but you let
docker determine the host path.
NOTE: While it may seem more convenient to specify the host path yourself --
maybe because you like having it under your home directory -- for
production applications it's generally more work. If you are deploying to
hundreds of servers it's easier to let docker create the host path so you
don't have to.
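Removing a named volume is a one-liner, but it only works once no container refers to the volume. The sketch below shows the plain removal plus a more forceful helper that first removes every container using the volume; that second helper is destructive, so treat it as an illustration only.

```shell
# Plain removal; fails while any container still references the volume.
# $1 = volume name
remove_volume() {
  docker volume rm "$1"
}

# Remove all containers (running or stopped) that use the volume, then the volume.
# $1 = volume name
force_remove_volume() {
  for c in $(docker ps -aq --filter "volume=$1"); do
    docker rm -f "$c"
  done
  docker volume rm "$1"
}
```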
This will delete the volume.
You will likely see an error, because docker won't let you delete a volume
if there are any containers (running or stopped) that refer to it.
Data volume containers
Docker seems to push the idea of data volume containers. I don't know why.
There seems to be no advantage over named data volumes. I think it must be
outdated advice from before there were named data volumes.
The current docker documentation says:
If you have some persistent data that you want to share between containers, or
want to use from non-persistent containers, it’s best to create a named Data
Volume Container, and then to mount the data from it.
So the idea here is to create a container that doesn't do anything but specify
one or more (the examples use unnamed) volumes. Other containers can use
docker run --volumes-from to use the volume in the data volume container.
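A sketch of the pattern. The container names (`file-upload-data`, `file-upload-app1`) and the image tag in the usage lines are assumptions for illustration.

```shell
# Create (but don't start) a container whose only job is to own a volume.
# $1 = data container name, $2 = volume path, $3 = image
create_data_container() {
  docker create -v "$2" --name "$1" "$3"
}

# Run an app container that mounts every volume of the data container.
# $1 = data container name, $2 = app container name, $3 = image
run_with_volumes_from() {
  docker run -d --volumes-from "$1" --name "$2" "$3"
}
# Usage (hypothetical names):
#   create_data_container file-upload-data /my/path file-upload:1.0.0
#   run_with_volumes_from file-upload-data file-upload-app1 file-upload:1.0.0
#   run_with_volumes_from file-upload-data file-upload-app2 file-upload:1.0.0
```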
The above creates a container
that just sits there and isn't even running. Then it runs two containers,
one of them file-upload-app2, that store any data they write
to /my/path in the volume owned by the data volume container.
How is this better than having the two containers just use the same named
volume?