Dockerfile

From NovaOrdis Knowledge Base

Overview

A Dockerfile is a plain text file that describes how a container image is built. It contains all the steps that are required to create the image.

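Each instruction in the Dockerfile is a build step, and the instructions that modify the filesystem (RUN, COPY, ADD) each add a new layer to the image. Multiple shell commands can be combined in a single RUN instruction, to reduce the number of layers.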

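The Dockerfile is consumed by the docker build command. By default, docker build looks for a file named Dockerfile in the root of the build context; a different file can be specified with -f.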

Examples

Syntax

<instruction> <arguments>

Example

Dockerfile Example

Variables

See:

docker build | Build-Time Variables

Instructions

Instructions are also known as "directives", and they are part of a DSL.

FROM

https://docs.docker.com/engine/reference/builder/#from

A valid Dockerfile must start with a FROM instruction. ARG is the only instruction that can precede FROM. FROM specifies the base image upon which other layers are built. The identifier of the base image will appear as "Parent:" in the metadata of the resulting image.

FROM node:0.10 [AS something]

If the only Dockerfile instruction is FROM, the build command simply downloads the base image into the local image store, listing the image's repository and tag. FROM can appear multiple times in a Dockerfile.

Optionally, a name can be given to a new build stage by adding AS <name> to the FROM instruction. The name can be used in subsequent FROM and COPY --from=<name> instructions to refer to the image built at this stage.
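A minimal sketch of a named build stage, assuming the build context contains a Go module that builds with a plain go build. The image tags, the stage name "builder" and the paths are illustrative assumptions, not taken from this page:

```dockerfile
# Stage 1: compile in a full toolchain image (tag is an assumption)
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: start from a small base and copy only the build result
# out of the stage named "builder"
FROM alpine:3.19
COPY --from=builder /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

Only the last stage ends up in the final image, so the toolchain from the first stage is left behind.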

Multi-arch Support

FROM supports an optional --platform=[...] flag which can be used to specify the platform of the base image. Some image building systems do not need it, as long as the image specified as the FROM argument is multi-arch.

FROM --platform=${TARGETPLATFORM} docker.example.com/some-image

scratch

https://hub.docker.com/_/scratch/#
FROM scratch

ARG

Defines a build-time variable. For more details on how build-time variables are defined and declared see:

Build-Time Variables
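A short sketch of ARG on both sides of FROM. The variable names BASE_TAG and APP_VERSION, their defaults and the output file are hypothetical:

```dockerfile
# An ARG declared before FROM is only visible to FROM instructions
ARG BASE_TAG=3.19
FROM alpine:${BASE_TAG}

# An ARG declared inside the stage is visible to the instructions that follow it
ARG APP_VERSION=0.0.0-dev
RUN echo "building version ${APP_VERSION}" > /build-info.txt
```

The defaults can be overridden at build time, for example with docker build --build-arg APP_VERSION=1.2.3 .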

COPY

https://docs.docker.com/engine/reference/builder/#copy
COPY <src> [src2 ...] <dest>

The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.

The source paths are relative to the build context, but source paths that try to reach outside the context (for example ../something) are considered invalid and will cause the build to fail.

If <dest> does not exist, it will be created. If the source is a file, and <dest> does not have a trailing slash, the content of the source file will be copied into a file named dest. If <dest> has a trailing slash, the source file will be copied under its original name into a newly created directory dest.
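The effect of the trailing slash can be sketched as follows, assuming a file app.conf exists in the build context and that neither destination exists in the base image (file and path names are illustrative):

```dockerfile
FROM alpine:3.19

# No trailing slash: the content of app.conf ends up in a file named /etc/app
COPY ./app.conf /etc/app

# Trailing slash: /opt/conf/ is created and the file keeps its name,
# resulting in /opt/conf/app.conf
COPY ./app.conf /opt/conf/
```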

Each occurrence of the COPY instruction creates a new layer in the final image.

This instruction is handled in a special manner relative to the build cache.

COPY should be preferred over ADD because it is more transparent.

To copy a file from the context, under the same name in the root of the image:

COPY ./some-dir/some-file /

⚠️ If more than one source is listed, then the last argument must be a directory and end with a trailing slash "/".
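A sketch of the multiple-source case, assuming a.txt and b.txt exist in the build context (hypothetical file names):

```dockerfile
FROM alpine:3.19

# Two sources, so the destination is a directory and ends with "/"
COPY ./a.txt ./b.txt /data/
```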

If the file or directory already exists in the image and it is overwritten, it will take the new ownership.

COPY and Directory Handling

If the <src> is a directory, its content will be recursively copied, but not the source directory itself. For example, if ./dir1 contains a file f.txt and a subdirectory ./dir2,

COPY ./dir1 /opt

will create /opt/f.txt and /opt/dir2 in the image.

A trailing slash does not make a difference.

To copy the directory and its content, specify the same name as target:

COPY ./dir1 /dir1

ADD

https://docs.docker.com/engine/reference/builder/#add

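Copies files from the local filesystem into the image. Unlike COPY, ADD also accepts remote URLs as sources and automatically unpacks local tar archives into the destination.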

ADD /something/something_else.conf $MY_PATH

This instruction is handled in a special manner relative to the build cache.

COPY should be preferred over ADD because it is more transparent.
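One reason COPY is considered more transparent is that ADD automatically unpacks local tar archives. A sketch of the difference, assuming an archive named rootfs.tar.gz exists in the build context (hypothetical name and paths):

```dockerfile
FROM alpine:3.19

# ADD unpacks a local tar archive into the destination directory
ADD ./rootfs.tar.gz /opt/rootfs/

# COPY places the archive itself at /opt/archive/rootfs.tar.gz
COPY ./rootfs.tar.gz /opt/archive/
```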

ENTRYPOINT and CMD

ENTRYPOINT

https://docs.docker.com/engine/reference/builder/#entrypoint

The ENTRYPOINT instruction specifies the command and optionally the arguments to be used when a container is created from the image with docker run. ENTRYPOINT should be the preferred way of specifying the command to run when the container is intended to be used as executable, over the CMD instruction. ENTRYPOINT and CMD can be combined. For more details on the interaction between ENTRYPOINT and CMD, see ENTRYPOINT and CMD Interaction. If ENTRYPOINT is specified multiple times in the Dockerfile, only the last occurrence will be considered.

ENTRYPOINT has two forms: the exec form and the shell form.

ENTRYPOINT "exec" Form

The "exec" form involves specifying the entry point in JSON array format, with double-quoted (") arguments:

ENTRYPOINT ["executable", "param1", "param2"]

The docker run command line arguments, if any, are appended to the argument list specified in the "exec" form, overriding the arguments specified by CMD in "exec" form, if any. When the "exec" form is used, the executable specified as the first element runs with PID 1 in the container, and signals are sent directly to it. On the other hand, no shell process is involved, unlike in the "shell" form, so shell pre-processing of the arguments does not happen, and arguments like "$HOME" are not expanded to the values available in the container environment.

ENTRYPOINT can be overridden on command line with:

docker run --entrypoint <other-entrypoint>

Example:

docker run --entrypoint bash -it novaordis/someimage:latest

The executable specification and its parameters propagate verbatim in the container metadata.

ENTRYPOINT ["ls", "-a", "-l"] 

will propagate as:

"Entrypoint": [
   "ls",
   "-a",
   "-l"
]

ENTRYPOINT "shell" Form

In "shell" form, the executable and the arguments are specified as plain strings, not as a JSON array:

ENTRYPOINT executable param1 param2

The "shell" form executes the arguments with "/bin/sh -c", and this becomes obvious inspecting the metadata representation:

ENTRYPOINT ls 

becomes

"Entrypoint": [
    "/bin/sh",
    "-c",
    "ls"
]

⚠️ The shell form prevents any CMD or docker run command line arguments from being used: docker run arguments are not appended. When the shell form is used, the executable will not have the PID 1, and signals will not be passed to it, which means that it won't be able to process SIGTERM sent by docker stop. However, the arguments are pre-processed by the shell, so arguments like $HOME are expanded with values present in the container's environment.

To ensure that signals will be passed to the executable, and the executable runs with PID 1, "exec" can be specified to precede the executable in the argument list:

ENTRYPOINT exec executable arg1 arg2 ...

CMD

https://docs.docker.com/engine/reference/builder/#cmd

The CMD instruction is intended to provide defaults for the executing container. The executable should be specified with ENTRYPOINT, and the default arguments with CMD. When ENTRYPOINT is missing, the executable can also be specified as part of the CMD instruction, as the first argument.

The CMD instruction has three forms: the "exec" form, the "default parameters to ENTRYPOINT" form and the "shell" form.

CMD exec Form

The "exec" form involves specifying the CMD arguments in double-quoted (") JSON array format:

CMD ["executable", "arg1", "arg2"]

The first argument should be an executable - if it is not, see "Default Parameters to ENTRYPOINT" below. The executable and the arguments are executed without any shell intervention. In consequence, shell pre-processing of the arguments does not happen, so arguments like "$HOME" are not expanded to the values available in the container environment. The executable passed as first argument gets PID 1, and signals are being sent directly to it.

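If docker run is invoked with extra command line arguments, they replace the CMD specification entirely, and the first argument is interpreted as the executable. If it does not name an executable available in the image, the execution will fail: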

docker run <image> some-arg
docker: Error response from daemon: OCI runtime create failed: 
             container_linux.go:296: starting container process caused "exec: \"some-arg\": 
             executable file not found in $PATH": unknown.
ERRO[0001] error waiting for container: context canceled

When CMD in "exec" format is used, the executable and arguments will propagate verbatim in the container metadata.

"Cmd": [
    "executable",
    "arg1",
    "arg2"
]

CMD Default Parameters to ENTRYPOINT Form

In this form, both CMD and ENTRYPOINT must be specified as JSON arrays:

ENTRYPOINT [ "ls" ]
CMD ["-a", "-l"]

In this situation, the CMD-specified arguments are overridden by command line arguments, if any. For the above configuration, executing the container without any argument will end up in the execution of:

ls -a -l

and running the container with "-d" will end up in:

ls -d

CMD shell Form

The "shell" form involves specifying arguments as unquoted strings:

CMD ls $HOME

The container will execute the CMD arguments with "/bin/sh -c", and this is obvious when inspecting the container metadata:

"Cmd": [
    "/bin/sh",
    "-c",
    "ls $HOME"
]

When using the "shell" form, the arguments are pre-processed by the shell, so arguments like $HOME are expanded with values present in the container's environment.

ENTRYPOINT and CMD Interaction

A Dockerfile should specify at least one of the CMD or ENTRYPOINT instructions. If there isn't one, the executable launched when the container is run is specified by the closest ancestor image that has an ENTRYPOINT or a CMD specification - the value is inherited. If none of the ancestors specify an ENTRYPOINT/CMD and the current image does not specify one either, the Docker runtime produces this error message:

docker run <image>
docker: Error response from daemon: No command specified.

Conventionally, ENTRYPOINT in "exec" format should be used to set the executable and the default options, and CMD in "exec" format should be used to set the command-line overridable options:

ENTRYPOINT [ "ls", "-a" ]
CMD ["-l"]

If there are no command-line overrides, ENTRYPOINT and CMD are combined. If there are command-line overrides, CMD options are dropped. When ENTRYPOINT and CMD are used together it's important that the "exec" form is used with both instructions.


                       | NO ENTRYPOINT          | ENTRYPOINT exec1 arg1  | ENTRYPOINT ["exec1", "arg1"]
NO CMD                 | inherited or error     | /bin/sh -c exec1 arg1  | exec1 arg1
CMD ["exec2", "arg2"]  | exec2 arg2             | /bin/sh -c exec1 arg1  | exec1 arg1 exec2 arg2
CMD exec2 arg2         | /bin/sh -c exec2 arg2  | /bin/sh -c exec1 arg1  | exec1 arg1 /bin/sh -c exec2 arg2

USER

The USER instruction sets the user name or UID, and optionally the user group or GID, to use when running the image and for any RUN, CMD and ENTRYPOINT instructions that follow in the Dockerfile.

USER <user>[:group]
USER <UID>[:GID]

The instruction results in a "User" entry under "Config" in the container metadata:

[
  {
    "Config": {
      "User": "<UID>:<GID>"
    }
  }
]

which will instruct the container runtime to execute the root process with the specified UID and GID. The UID and GID specified as such can be overridden in the docker run command as follows:

docker run ... -u|--user <username|uid>[:<group|gid>]

If not specified, processes run as user "root" (UID 0). When the user doesn’t have a primary group then the image (or the next instructions) will be run with the root group.

When specifying a group for the user, the user will have only the specified group membership. Any other configured group memberships will be ignored.

If the value of the USER directive is a user name rather than a numeric UID, the name must be resolvable in the container, otherwise the container won't start:

docker: Error response from daemon: linux spec user: unable to find user blah: no matching entries in passwd file.

The constraint does not apply to UIDs: a container will start with any UID.
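A minimal sketch of a resolvable user, using the BusyBox addgroup and adduser tools shipped with Alpine. The name "app" and the ID 1001 are illustrative assumptions:

```dockerfile
FROM alpine:3.19

# Create a group and a user with fixed IDs so that the name used by
# USER can be resolved inside the container
RUN addgroup -g 1001 app && adduser -D -u 1001 -G app app

# From here on, RUN, CMD and ENTRYPOINT execute as app:app
USER app:app
CMD ["id"]
```

Writing USER 1001:1001 instead would work even without the RUN line, since numeric IDs do not need to be resolvable.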

Also see:

Docker Security

ENV

Declares an environment variable accessible to the processes in the container:

ENV SOMETHING "something else"

Environment variables declared this way can be used, and will be expanded in RUN commands in shell form.

The default values declared in the Dockerfile for a specific environment variable can be overridden on command line with:

docker run -e SOMETHING="command line value"

Environment variables declared with the ENV statement can also be used in certain instructions as variables to be interpreted by the Dockerfile. Escapes are also handled for including variable-like syntax into a statement literally. The syntax that introduces variables in the Dockerfile is

${SOME_VAR}

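The instructions that can use variables are: ADD, COPY, ENV, EXPOSE, FROM, LABEL, STOPSIGNAL, USER, VOLUME, WORKDIR and ONBUILD (when combined with one of the preceding instructions).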

For more details see:

https://docs.docker.com/engine/reference/builder/#environment-replacement
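A sketch of Dockerfile-level variable substitution and escaping, assuming a file app.conf exists in the build context. The variable names and paths are hypothetical:

```dockerfile
FROM alpine:3.19
ENV APP_HOME=/opt/app

# ${APP_HOME} is substituted while the Dockerfile is processed, no shell involved
WORKDIR ${APP_HOME}
COPY ./app.conf ${APP_HOME}/conf/

# A backslash escapes the substitution: the variable receives the
# literal text ${APP_HOME}
ENV LITERAL_EXAMPLE=\${APP_HOME}
```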

RUN

https://docs.docker.com/engine/reference/builder/#run

The RUN instruction will execute any command in a new layer on top of the current image, and commit the results. The resulting committed image will be used for the next step in the Dockerfile. Layering RUN instructions and generating commits conforms to the core concepts of Docker where commits are cheap and containers can be created from any point in an image’s history, much like source control. However, in practice, it is a good idea to reduce the number of RUN commands - and the layers they generate.

It has two forms:

RUN Shell Form

RUN <command>

The shell form performs shell command line expansion, so environment variables, including those previously declared with ENV in the same Dockerfile, are expanded to their values and used as such. In shell form, a backslash (\) can be used to continue a single RUN instruction onto the next line. For example, consider these two lines:

RUN echo "something"; \
echo "something else"

A recommended pattern is to concatenate as many commands as possible on a single logical line, separated by '&&', as shown:

RUN command1 && \
    command2 && \
#
# Multi-line comments are allowed
#
    command3 && \
#
# Another multi-line comment
#
   command4
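A concrete instance of the pattern, as a sketch on a Debian-based image. The base image tag and the package names are illustrative assumptions:

```dockerfile
FROM debian:bookworm-slim

# One RUN instruction, one layer: install and clean up in the same step,
# so the package index is not preserved in the image
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
```

Because '&&' stops at the first failing command, a failed install also fails the build instead of being silently ignored.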

RUN Exec Form

RUN ["executable", "param1", "param2", ...]

The exec form makes it possible to run executables without relying on the existence of a shell binary in the image. It also does not perform shell command line expansion.
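The difference in expansion between the two RUN forms can be sketched as follows. The variable name TARGET and its value are hypothetical:

```dockerfile
FROM alpine:3.19
ENV TARGET=/tmp/marker

# Shell form: /bin/sh expands $TARGET, so /tmp/marker is created
RUN touch $TARGET

# Exec form: no shell is involved, echo receives the literal string $TARGET
RUN ["echo", "$TARGET"]

# Exec form with an explicit shell: expansion happens because sh is
# invoked directly
RUN ["/bin/sh", "-c", "echo $TARGET"]
```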

Running package update commands such as yum update in the Dockerfile is discouraged because it increases the time it takes for the build to finish. The alternative is to use base images that already have these updates applied.

WORKDIR

https://docs.docker.com/engine/reference/builder/#workdir

Changes the working directory within the context of the image being built, for the rest of the build instructions. If the WORKDIR doesn’t exist, it will be created. If a relative path is provided, it will be relative to the path of the previous WORKDIR instruction.

The WORKDIR instruction can resolve environment variables previously set using ENV.

...
ENV SOMEDIR /some/dir
WORKDIR $SOMEDIR
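Relative paths accumulate across successive WORKDIR instructions. A minimal sketch, with illustrative directory names:

```dockerfile
FROM alpine:3.19
WORKDIR /a
WORKDIR b
WORKDIR c

# The working directory is now /a/b/c
RUN pwd
```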

MAINTAINER

It sets the "Author" field in the image metadata. It is deprecated, use LABEL instead.

LABEL

Applies a label to the image.

LABEL "something"="something else" "other label"="some other content"

Recommended Format for Description Information

LABEL description="This text illustrates \
that label-values can span multiple lines."

VOLUME

https://docs.docker.com/engine/reference/builder/#volume

The VOLUME instruction creates a mount point inside the container and marks it as holding an externally mounted data volume from the native host. If there is no such directory in the image, it will be created and left empty. The Docker documentation says that the docker run command initializes the newly created volume with any data that exists at the specified location in the base image. However, this only happens if no external volume is mounted. If an external volume is mounted, the content of the external volume will prevail. docker run must provide the Docker host-level directory and associate it with the mount point. At runtime, the storage driver is bypassed when data is written into the volume, so the I/O is performed at native speeds.

VOLUME <absolute-path-inside-container>
VOLUME [ "<absolute-path-inside-container>" ]

The native host directory cannot be declared in the Dockerfile: it is by its nature host-dependent and its presence cannot be guaranteed, so to preserve portability, the native host mount point must be specified when creating the container with docker run --mount or docker run -v. The actual location of the volume on the native host is a directory whose path is returned by the corresponding "Source" entry in the output of:

docker inspect -f '{{json .Mounts}}' <container-id>
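A sketch of a volume seeded with initial content, assuming the path /data and the file name README purely for illustration:

```dockerfile
FROM alpine:3.19

# Put some content at the location before declaring it a volume
RUN mkdir -p /data && echo "seed" > /data/README

# Mark /data as a mount point for an externally mounted volume
VOLUME /data
```

When the container is started without mounting anything at /data, the newly created volume is initialized with README. When a host directory is mounted there, for example with docker run -v /some/host/dir:/data <image>, the content of the host directory prevails.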

EXPOSE

https://docs.docker.com/engine/reference/builder/#expose

The EXPOSE instruction serves as a hint (documentation) that the container listens on the specified ports at runtime, and that those ports may be published. The instruction does not actually publish the port. Publishing is done with the -p or -P flags of the docker run command.
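A minimal sketch, using the public nginx image as an assumed example of a process that listens on port 80:

```dockerfile
FROM nginx:1.25

# Documents that the process listens on 80/tcp; this line publishes nothing
EXPOSE 80/tcp
```

The port only becomes reachable from the host when it is published at run time, for example with docker run -p 8080:80 <image>, or with docker run -P, which publishes every exposed port on an ephemeral host port.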