Running docker¶
Nota Bene
You must be using a computer with docker installed to complete the exercises on this page. If you are attending the BU workshop, refer to the page on connecting to your EC2 instance for instructions on how to SSH into your instance.
Your First Docker Container¶
Containers are run using the command:
$ docker run <image name>[:<tag>]
The <image name> must be a recognized docker image name, either on the local machine or on Docker Hub. The optional :<tag> specifies a particular version of the image to run.
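To make the `<image name>[:<tag>]` form concrete, the following sketch splits a reference into its two parts using plain shell parameter expansion, falling back to `latest` when no tag is given, which is the tag docker assumes by default. The function name `split_image_ref` is hypothetical, and the sketch assumes simple references that do not include a registry host and port.

```shell
# Hypothetical helper: split "<image name>[:<tag>]" into name and tag.
# Assumes a simple reference with no registry host:port prefix.
split_image_ref() {
    ref=$1
    case $ref in
        *:*) name=${ref%%:*}; tag=${ref#*:} ;;
        *)   name=$ref; tag=latest ;;   # docker defaults to the "latest" tag
    esac
    printf '%s %s\n' "$name" "$tag"
}

split_image_ref hello-world   # prints: hello-world latest
split_image_ref python:3.6    # prints: python 3.6
```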
Exercise
Run a container for the hello-world docker image hosted on Docker Hub.
If you need help, try running docker and docker run without any arguments to see usage information.
Read the text output by the container after it has been run.
Pulling docker images¶
As part of running a container from a public docker image, the image itself is pulled and stored locally. This only occurs once for each version of an image; subsequently run containers will use the local copy of the image.
If you have never run any docker containers in this environment before, there should be no local images listed by the docker images command:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
$
To verify that the hello-world image has been pulled, we again use the docker images command after running the container:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest 2cb0d9787c4d 2 weeks ago 1.85kB
$
This output tells us that a copy of the hello-world image, tagged latest, is now stored on the local machine.
We can pull images explicitly, rather than doing so implicitly with a docker run call, using the docker pull command:
$ docker pull nginx
This may be useful if we do not want to run a container immediately, or want to perform our own modifications to the image locally prior to running.
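The pull-once behavior described above can also be made explicit in a script. The sketch below pulls an image only when no local copy exists; it relies on `docker image inspect` exiting with a non-zero status when the image is not present locally. The function name `ensure_image` is hypothetical.

```shell
# Hypothetical helper: pull an image only if no local copy exists.
ensure_image() {
    image=$1
    if docker image inspect "$image" >/dev/null 2>&1; then
        echo "$image is already available locally"
    else
        docker pull "$image"
    fi
}

# Usage (requires a running docker daemon):
#   ensure_image nginx
```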
Exercise
Pull the nginx image using the docker pull command. Verify that the latest image of nginx has been pulled using docker images.
Managing docker containers¶
Running detached containers¶
The hello-world container runs, prints its message, and then exits. If we were running a docker container that provided a service, we would want the container to keep running until we chose to shut it down. An example of this is the nginx web server, which we can run with the command:
$ docker run -d -p 8080:80 nginx
Here, the -d flag tells docker to run the container in the background (detached) and return control to the command line once the container has started. The -p 8080:80 argument forwards port 80 on the container, the default port for HTTP traffic, to port 8080 on the local machine, which can be used without administrator privileges. When control has returned to the command line, we can verify that the container is still running using the docker ps command:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
49af27e82231 nginx "nginx -g 'daemon of…" 4 minutes ago Up 4 minutes 0.0.0.0:8080->80/tcp elastic_mcnulty
$
Exercise
Run an nginx container as above. Verify that the container is running with docker ps.
If specified correctly, the local port 8080 should behave as if it is a web server. Verify that this is the case by running:
$ curl localhost:8080
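For a scripted check, the sketch below asks curl for only the HTTP status code rather than the page body. The function name `http_status` is hypothetical, and the URL in the usage line assumes the port mapping from the `docker run` command above.

```shell
# Hypothetical helper: print only the HTTP status code returned by a URL.
#   -s             silences progress output
#   -o /dev/null   discards the response body
#   -w '%{http_code}' prints the status code after the transfer
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Usage (with the nginx container running as above):
#   http_status localhost:8080
```

A status of 200 indicates that the web server answered the request successfully.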
Attaching data volumes to containers¶
Scientific analyses almost always utilize some form of data. Docker containers are intended to execute code, and are not designed to house data. Directories and data volumes that exist on the host machine can be mounted in the container at run time to enable the container to read and write data to the host:
$ docker run -d -p 8080:80 --mount type=bind,source="$PWD"/data,target=/data nginx
The directory named data in the current host directory will be mounted as /data in the root directory of the container.
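The `--mount` argument is a comma-separated list of `key=value` pairs. As a sketch of how the pieces fit together, the hypothetical helper below runs an nginx container with a host directory bind-mounted at a chosen path. Adding `readonly` to the list, a standard `--mount` option, is useful when the container only needs to read the data and should not be able to modify it on the host. The function name `run_with_data` is an assumption made for illustration.

```shell
# Hypothetical helper: run nginx with a host directory bind-mounted read-only.
#   $1 = directory on the host (absolute path)
#   $2 = mount point inside the container
run_with_data() {
    docker run -d -p 8080:80 \
        --mount "type=bind,source=$1,target=$2,readonly" \
        nginx
}

# Usage (requires a running docker daemon and an existing ./data directory):
#   run_with_data "$PWD/data" /data
```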
Stopping running containers¶
When a docker container has been run in a detached state, it runs until it is stopped or encounters an error. To stop a running container, we need either the CONTAINER ID or NAMES attribute of the running container from docker ps:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
49af27e82231 nginx "nginx -g 'daemon of…" 4 minutes ago Up 4 minutes 0.0.0.0:8080->80/tcp elastic_mcnulty
$ docker stop 49af27e82231 # could also have provided elastic_mcnulty
49af27e82231
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$
Stopping a container signals the process running inside it to shut down. A stopped container is not deleted: it can be restarted with docker start, although it is often simpler to run a fresh container from the image.
Nota Bene
Docker maintains a record of all containers that have been run on a machine. After they have been stopped, docker ps does not show them, but the containers still exist. To see a list of all containers that have been run and not yet removed, use docker ps -a.
It is good practice to remove old containers if they are no longer needed. You can do this with the command docker container prune, which removes all stopped containers.
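Instead of looking up the generated name or ID in `docker ps`, a container can be given a name of our choosing with the `--name` flag when it is started, and then stopped and removed by that name. The sketch below wraps the two cleanup steps; the function name `stop_and_remove` and the container name `web` are hypothetical.

```shell
# Hypothetical helper: stop a container and then remove it, by name or ID.
# The container is removed only if it was stopped successfully.
stop_and_remove() {
    docker stop "$1" && docker rm "$1"
}

# Usage (requires a running docker daemon):
#   docker run -d --name web -p 8080:80 nginx
#   stop_and_remove web
```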
Creating docker images¶
Building a custom image¶
Chances are there is not an existing docker image that does exactly what you want (but check first!). To create your own image, you must write a Dockerfile. As an example, we will create an image that has the python package scipy installed for us to use. It is common convention to create a new directory named for the image you wish to create, and create a text file named Dockerfile in it. In the scipy directory, our Dockerfile contains:
# pull a current version of python3
FROM python:3.6
# install scipy with pip
RUN pip install scipy
# when the container is run, put us directly into a python3 interpreter
CMD ["python3"]
To build this docker image, we use the docker build command from within the scipy directory containing the Dockerfile:
$ docker build --tag scipy:latest .
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM python:3.6
---> 638817465c7d
Step 2/3 : RUN pip install scipy
---> Running in 1eef65d3b6fd
Collecting scipy
Downloading https://files.pythonhosted.org/...
Collecting numpy>=1.8.2 (from scipy)
Downloading https://files.pythonhosted.org/...
Installing collected packages: numpy, scipy
Successfully installed numpy-1.15.0 scipy-1.1.0
Removing intermediate container 1eef65d3b6fd
---> 7f34e9147bef
Step 3/3 : CMD ["python3"]
---> Running in 5c9d778426e6
Removing intermediate container 5c9d778426e6
---> e27603f4ffaf
Successfully built e27603f4ffaf
Successfully tagged scipy:latest
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
scipy latest e27603f4ffaf About a minute ago 1.15GB
python 3.6 638817465c7d 25 hours ago 922MB
$
The --tag scipy:latest argument gives our image the name and tag under which it is listed by docker images. Notice also that the python:3.6 image has been pulled in the process of building the scipy image.
Now that we have built our image, we can run a container from it and connect to that container using docker run with two additional flags:
$ docker run -i -t scipy
Python 3.6.0 (default, Jul 17 2018, 11:04:33)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>>
The -i flag tells docker we want to use the container interactively by keeping its standard input open, and the -t flag attaches a terminal to the container so that we may send and receive information to and from our current terminal.
Exercise
Create a new Dockerfile where you will install the most recent version of R. Use ubuntu:bionic as the base image. You may follow these instructions, without using the sudo command.
Hint: Use a different RUN line for each command.
Passing containers CLI arguments¶
The CMD Dockerfile command specifies a default command to run when a container starts. However, sometimes it is convenient to be able to pass command line arguments to a container, for example to run an analysis pipeline on different files, or files with filenames that are not known at build time. For instance, we might want to run the following:
$ docker run python process_fastq.py some_reads.fastq.gz
The CMD command does not allow this: any arguments given after the image name in docker run replace the CMD entirely rather than being appended to it. Instead, the ENTRYPOINT command is used to prefix a command to any command line arguments passed to docker run:
FROM python:3.6
# we will mount the current working directory to /cwd when the container is run
WORKDIR /cwd
RUN pip install pysam
# ENTRYPOINT instead of CMD
ENTRYPOINT ["python3"]
Any command line arguments passed to docker run after the image name will be appended to the command(s) specified in the ENTRYPOINT.
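The interaction between `ENTRYPOINT`, `CMD`, and arguments given to `docker run` can be summarized as a rule: the command executed is the `ENTRYPOINT` followed by the run-time arguments if any were given, and by the `CMD` otherwise. The sketch below models that rule in plain shell for the bracketed (exec) form used on this page. It does not call docker, and the function name `effective_command` is hypothetical.

```shell
# Hypothetical model of how docker assembles a container's command.
#   $1 = ENTRYPOINT, $2 = CMD (either may be empty)
#   remaining arguments = arguments given to docker run after the image name
effective_command() {
    entrypoint=$1
    cmd=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        cmd="$*"    # run-time arguments replace CMD entirely
    fi
    if [ -n "$entrypoint" ] && [ -n "$cmd" ]; then
        echo "$entrypoint $cmd"
    else
        echo "$entrypoint$cmd"
    fi
}

# Image with only CMD ["python3"], no run-time arguments:
effective_command "" "python3"
# Same image with arguments: the script name replaces python3:
effective_command "" "python3" process_fastq.py some_reads.fastq
# Image with ENTRYPOINT ["python3"]: arguments are appended:
effective_command "python3" "" process_fastq.py some_reads.fastq
```

The second call shows why `CMD` alone is not enough here: the container would try to execute `process_fastq.py` directly instead of handing it to the interpreter.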
If a container is intended to run files that exist on the host, the docker run command must also be supplied with a mount point so the container can access the files. In the example above, the WORKDIR is specified as /cwd, so we can bind the current working directory of the host to /cwd in the container so it can access the files process_fastq.py and some_reads.fastq in the current directory:
$ docker run --mount type=bind,source="$PWD",target=/cwd <image name> process_fastq.py some_reads.fastq
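Typing the full mount specification for every run is error-prone, so it can be wrapped in a small shell function. This is a sketch only: the function name `run_in_cwd` and the image name `fastq-tools` are hypothetical, the latter standing in for whatever tag was given to the image built from the Dockerfile above. The `--rm` flag removes the container once it exits.

```shell
# Hypothetical wrapper: run an image with the host's current directory
# bind-mounted at /cwd inside the container.
#   $1 = image name
#   remaining arguments are passed on to the container's ENTRYPOINT
run_in_cwd() {
    image=$1
    shift
    docker run --rm \
        --mount "type=bind,source=$PWD,target=/cwd" \
        "$image" "$@"
}

# Usage (requires a running docker daemon and an image built from the
# Dockerfile above, here assumed to be tagged fastq-tools):
#   run_in_cwd fastq-tools process_fastq.py some_reads.fastq
```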