Docker Concepts
Revision as of 20:34, 5 May 2017
Overview
Docker is at the same time a packaging format, a set of tools with server and client components, and a development and operations workflow. Because it defines a workflow, Docker can be seen as a tool that reduces the complexity of communication between the development and the operations teams.
Docker architecture centers on atomic, throwaway containers. When a new version of an application is deployed, the whole runtime environment of the old version is thrown away with it, including dependencies and configuration, everything up to, but excluding, the O/S kernel. This means the new version of the application won't accidentally use artifacts left behind by the previous release, and ephemeral debugging changes will not survive. This approach also makes the application portable between servers, which act as standardized places to dock containers.
A Docker release artifact is a single file, whose format is standardized. It consists of a set of layered images.
The ideal Docker application use cases are stateless applications or applications that externalize their state in databases or caches: web frontends, backend APIs and short running tasks.
Docker Components
Docker Engine
Docker Engine is a portable runtime and packaging tool.
Docker Hub
Docker Hub is a cloud service for sharing applications and automating workflows.
Docker Workflow
A Docker workflow represents the sequence of operations required to develop, test and deploy an application in production using Docker.
The Docker workflow largely consists of the following sequence:
- Developers build and test a Docker image and ship it to the registry.
- Operations engineers provide configuration details and provision resources.
- Developers trigger the deployment.
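The sequence above can be sketched with docker CLI commands (a minimal sketch assuming a running Docker daemon, a hypothetical image name myapp, and a hypothetical registry at registry.example.com):

```shell
# Developers: build the image from the Dockerfile in the current
# directory, tag it with a release identifier, and ship it to the registry.
docker build -t myapp:1.0 .
docker tag myapp:1.0 registry.example.com/myapp:1.0
docker push registry.example.com/myapp:1.0

# Operations: provision a Docker host and supply configuration,
# for example as environment variables at deployment time.

# Developers: trigger the deployment by running the image on the target host.
docker run -d --name myapp -e ENV=production registry.example.com/myapp:1.0
```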
Container
A container is a mechanism for isolating running processes.
A Docker container is a Linux container that has been instantiated from a Docker image.
Docker containers are content and infrastructure agnostic: they can handle any kind of content and are not tied to any particular infrastructure.
Physically, a container is a reference to a layered filesystem image and some metadata about configuration (environment variables, for example). A running container has a name; the image it was instantiated from has a tag, which is generally used to identify a particular release of that image.
A container name must be unique on a given Docker host: a specific named container can only exist once.
It is considered best practice to run a single process within a container: the idea is that a container should provide a single function, which makes it easy to scale horizontally.
Restart Policy
The container's restart policy specifies whether and how a container is restarted when it exits or when the Docker server is restarted. It is maintained in /var/lib/docker/containers/<container-id>/hostconfig.json as:
{ ... "RestartPolicy":{"Name":"no","MaximumRetryCount":0}, ... }
Note that the JSON file must not be edited directly. If the restart policy has to be changed, that must be done with docker update.
The policy for a specific container can be retrieved with docker inspect.
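For example (a sketch assuming a running Docker daemon and a hypothetical container named web):

```shell
# Set the restart policy when the container is created:
docker run -d --name web --restart=on-failure:3 nginx

# Change the policy of an existing container without recreating it:
docker update --restart=always web

# Retrieve the current policy:
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' web
```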
Also see
Images
Container Image
A container image is a read-only template consisting of one or more filesystem layers and metadata. Together they represent all the files required to run an application: the container image encapsulates all the dependencies of an application and its configuration, and it can be deployed in any environment that supports running containers. The same bundle can be assembled, tested and shipped to production without any change. The image is produced by the build command, as the sole artifact of the build process. When an image needs to be rebuilt, every layer after the first changed layer must be rebuilt. Each image has a tag.
Layered Image
Each set of new changes made during the container build process is laid on top of the previous changes. In general, each individual build step used to create the image corresponds to one filesystem layer. Each layer is identified by a unique hash. If a shared image is updated, all containers that use it must be re-created.
The layers are version controlled.
Base Image
A base image is the image a new image is built on top of. It is specified with the Dockerfile FROM instruction.
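A minimal sketch (the alpine base image and the added package are illustrative):

```dockerfile
# alpine:3.19 is the base image; all subsequent layers build on top of it.
FROM alpine:3.19
RUN apk add --no-cache curl
```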
Image Registry
A Docker registry is a service that stores Docker images and metadata about those images. Examples:
- Docker Registry https://docs.docker.com/registry/
Image Repository
A Docker repository is a collection of different Docker images with the same name that have different tags.
Tag
A tag is an alphanumeric identifier of an image within a repository. It is a form of Docker revision control.
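For example (a sketch; the image and registry names are hypothetical):

```shell
# The same image can carry several tags within the myapp repository:
docker tag myapp:1.0 myapp:latest
docker tag myapp:1.0 registry.example.com/myapp:1.0

# List all locally known tags of the myapp repository:
docker images myapp
```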
Union Filesystem
Docker uses a union filesystem to combine all layers within an image into a single coherent filesystem.
Labels
Labels represent metadata in the form of key/value pairs, and they can be specified with the Dockerfile LABEL instruction. Labels can be applied to containers and images, and they are useful in identifying and searching for Docker images and containers. Labels applied to an image can be retrieved with the docker inspect command.
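A sketch (the label keys and values are illustrative; LABEL is a standard Dockerfile instruction):

```dockerfile
# Attach key/value metadata to the image being built.
LABEL maintainer="ops@example.com" \
      release="1.0"
```

The labels can then be read back with, for example, docker inspect -f '{{.Config.Labels}}' myapp:1.0.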
Dockerfile
A Dockerfile defines how an image should be assembled at build time, and it contains all the steps that are required to create the image. Each instruction in the Dockerfile generates a new layer in the image. The Dockerfile is an argument (possibly implicit, if present in the directory the command is run from) of the build command. For more details, see:
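A minimal sketch of a complete Dockerfile (all names are illustrative; each instruction below corresponds to one layer or to image metadata):

```dockerfile
# Base image layer.
FROM python:3.12-slim
# Set the working directory (metadata).
WORKDIR /app
# Each COPY/RUN instruction produces a new filesystem layer.
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# Run as a non-privileged user, per the security best practice.
USER nobody
# Default command (metadata, no layer content).
CMD ["python", "app.py"]
```

Built with docker build -t myapp:1.0 . from the directory containing the Dockerfile.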
.dockerignore
Lists patterns for files and directories that won't be uploaded to the Docker host as part of the build context when the image is built.
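A typical sketch (the entries are illustrative):

```
.git
*.log
node_modules
```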
Docker and Virtualization
Containers implement virtualization above the O/S kernel level.
In the case of O/S virtualization, a virtual machine contains a complete guest operating system and runs its own kernel, on top of the host operating system. The hypervisor that manages the VMs, and the VMs themselves, use a percentage of the system's hardware resources, which are no longer available to the applications.
A container is just another process, with a lightweight wrapper around it, that interacts directly with the Linux kernel, and can utilize resources that would otherwise have gone to the hypervisor and the VM kernels. The container includes only the application and its dependencies. It runs as an isolated process in user space, on the host's operating system. The host and all containers share the same kernel.
A virtual machine is long lived in nature. Containers have usually shorter life spans.
The isolation among containers is much more limited than the isolation among virtual machines. A virtual machine has, by default, hard limits on the hardware resources it can use. Unless explicit limits are placed on the resources containers can use, they compete for resources.
Docker and State Management
Environment Variables
Containerized applications must avoid maintaining configuration in filesystem files; if they do, it limits the reusability of the container. A common pattern used to handle application configuration is to move configuration state into environment variables that can be passed to the application from the container. Docker supports environment variables natively: they are stored in the metadata that makes up a container configuration, and restarting the container ensures the same configuration is passed to the application each time.
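For example (a sketch assuming a running Docker daemon; the image and variable names are illustrative):

```shell
# Pass configuration to the application as environment variables
# instead of baking it into the container filesystem:
docker run -d --name web \
  -e DATABASE_URL="postgres://db.example.com/app" \
  -e LOG_LEVEL=info \
  myapp:1.0
```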
Storing Files
Storing state into the container's filesystem will not perform well. The space is extremely limited and the state will not be preserved across the container lifecycle.
Application State
The best use case for Docker is an application that can store state in a centralized location that could be accessed regardless of which host the container runs on.
Docker Revision Control
Docker provides two forms of revision control:
- tracking the filesystem layers that make up an image
- tagging images
Cloud Platform
Docker is not a cloud platform. It only handles containers on pre-existing Docker hosts. It does not allow one to create new hosts, object stores, block storage, or other resources that can be provisioned dynamically by a cloud platform.
Security
Production containers should almost always be run under the context of a non-privileged user. See Dockerfile USER.
Dependencies
The Docker workflow allows all dependencies to be discovered during the development and test cycles.
The Docker Client
The Docker client is an executable used to control most of the Docker workflow and communicate with remote servers. The Docker client runs directly on most major operating systems. The same Go executable acts as both client and server, depending on how it is invoked. The client uses the Remote API to communicate with the server.
The Docker Server
The Docker server (also referred to as the Docker daemon) is a process that runs as a daemon and manages the containers; the client tells the server what to do. The server uses Linux containers and the underlying Linux kernel mechanisms (cgroups, namespaces, iptables, etc.), so it can only run on Linux servers. The same Go executable acts as both client and server, depending on how it is invoked, and it will launch as a server only on supported Linux hosts. Each Docker host will normally have one Docker daemon that can manage a number of containers.
The server can talk directly to the image registries when instructed by the client.
The server listens on port 2375 for unencrypted traffic and on port 2376 for encrypted (TLS) traffic.
Client/Server Communication
The client and server communicate over network (TCP or Unix) sockets.
Remote API
cgroups
Namespaces
Container Networking
A Docker container behaves like a host on a private network. Each container has its own virtual Ethernet interface and its own IP address. All containers managed by the same server are on a default virtual network together and can talk to each other directly. In order to get to the host and the outside world, the traffic from the containers goes over an interface called docker0: the Docker server acts as a virtual bridge for outbound traffic. The Docker server also allows containers to "bind" to ports on the host, so outside traffic can reach them: the traffic passes over a proxy that is part of the Docker server before getting to containers.
The default mode can be changed; for example, --net=host configures the server to allow containers to use the host's own network device and address.
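For example (a sketch assuming a running Docker daemon):

```shell
# Bind container port 80 to port 8080 on the host, so outside
# traffic reaches the container through the Docker server's proxy:
docker run -d --name web -p 8080:80 nginx

# Use the host's own network device and address instead of the
# default virtual bridge:
docker run -d --net=host nginx
```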
Docker Projects
Boot2Docker
Boot2Docker is deprecated.
Docker Machine
"Docker Up and Running" Page 54.
Docker Compose
Docker Swarm
Atomic Host
An atomic host is a small, finely tuned operating system image like https://coreos.com or http://www.projectatomic.io, that supports container hosting and atomic OS upgrades.
Backends
Storage Backend
The Docker server's storage backend communicates with the underlying Linux filesystem to build and manage the multiple image layers that combine into a single image. The storage backend provides a fast CoW (copy-on-write) system for image management.
Backends:
Loopback Storage
The default loopback storage, while appropriate for proof-of-concept environments, is not appropriate for production.