Kubernetes Concepts

From NovaOrdis Knowledge Base




Kubernetes is an open source container orchestration platform (a container orchestrator), released under the Apache 2.0 license.

A high-level abstraction often used for Kubernetes is "data center OS". The primary use case for Kubernetes is running containerized cloud-native applications, which are applications made from a set of small autonomous services (microservices) that communicate with each other. Kubernetes helps deploy, scale up, scale down, update and roll back these services, which are materialized as sets of containers, and it does that at scale. In the process, it abstracts away details such as which specific compute nodes or physical storage volumes are allocated to applications.

Kubernetes instances are known as clusters. All management interactions with a Kubernetes cluster are performed by sending REST requests to an API Server. The API Server is responsible for managing and exposing the state of the cluster. The state of the cluster is stored internally by the cluster store, a control plane system service currently implemented by etcd. The control loops essential to the declarative model implemented by Kubernetes are driven by various specialized controllers under the supervision of the controller manager. The workloads are dispatched by the scheduler. All these components - the API Server, cluster store, controllers, scheduler, cloud controller manager - are collectively known as the control plane and are executed on master nodes. Externally, the state can be accessed and modified with specialized tools, of which the most common is a command line client named kubectl. The control plane, the API server and API server-related details, such as admission controllers, are discussed in Control Plane and Data Plane Concepts.
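As a minimal sketch of what such a REST request addresses, the hypothetical helper below builds the URL path under which the API Server exposes a resource collection or a single object. The path layout (/api/<version> for the core group, /apis/<group>/<version> for named groups) follows the Kubernetes API conventions; the function name and its parameters are illustrative assumptions, and the code only formats strings without contacting a cluster.

```python
from typing import Optional


def resource_path(resource: str, version: str = "v1", group: str = "",
                  namespace: Optional[str] = None,
                  name: Optional[str] = None) -> str:
    """Build the REST path for a resource collection or a single object.

    Hypothetical helper for illustration only: it formats a string and
    never talks to an API Server.
    """
    # The core group is served under /api, named groups under /apis.
    prefix = f"/apis/{group}/{version}" if group else f"/api/{version}"
    parts = [prefix]
    # Namespaced resources carry a namespaces/<ns> segment; cluster-level
    # resources (for example nodes) do not.
    if namespace is not None:
        parts.append(f"namespaces/{namespace}")
    parts.append(resource)
    # A trailing name addresses one object instead of the collection.
    if name is not None:
        parts.append(name)
    return "/".join(parts)
```

For example, resource_path("pods", namespace="default") yields /api/v1/namespaces/default/pods, the collection a client would list to see the pods of the default namespace.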

Application workloads are deployed as pods on a set of worker nodes that constitute the data plane. Each node runs a container runtime. Docker was historically the usual choice, but other runtimes, such as containerd and CRI-O, are supported via the Container Runtime Interface (CRI). Container runtime details are discussed in Container Runtime Concepts.

Worker nodes are used to run workloads, deployed as pods. Pods are scheduled to nodes and then closely monitored. A pod is a wrapper that allows one or more containers to run on Kubernetes, and it is the atomic unit of deployment in Kubernetes. Pods come and go - if a pod dies, it is not resurrected. Instead, its failure is detected by the lack of response from configured probes that test expected container behavior and, depending on configuration, another pod may be scheduled as a replacement. As a consequence, the IP address of an individual pod cannot be relied on. Pods, containers and probes are discussed in Pod and Container Concepts.

Because the pods and their IPs are ephemeral, Kubernetes introduces an additional mechanism aimed at providing a stable access point to a set of equivalent pods that belong to the same application: a service. A service can be thought of as stable networking access for a continuously changing set of pods. A service's IP address and port can be relied on not to change for the life of the service. All live pods represented by a service at a moment in time are known as service "endpoints" - in fact, there is a Kubernetes resource representing the live pods within the context of a service, and it is called Endpoints. There are several types of services, including ClusterIP, NodePort and LoadBalancer. The association between services and pods is loose - it is established logically by the service's selector, which is a label-based mechanism: a pod "belongs" to a service if the service's selector matches the pod's labels. Services are explained at length in the Service Concepts section and selectors in Selector Concepts. A layer 7 complement to services, named Ingress, is available. Ingresses are discussed in Ingress Concepts.
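The label-based association can be illustrated with a short sketch. It assumes the simplest form of selector, a set of key/value pairs that must all be present on the pod, and represents pods as plain dictionaries; the function names and the pod fields (ip, ready, labels) are hypothetical and chosen only for the example.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Return True if every key/value pair of the selector is in labels.

    Note: an empty selector matches every pod in this sketch; real
    Services defined without a selector behave differently and are not
    modeled here.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector: Dict[str, str], pods: List[dict]) -> List[str]:
    """Return the IPs of ready pods whose labels match the selector.

    Each pod is a hypothetical dict with "ip", "ready" and "labels" keys.
    """
    return [pod["ip"] for pod in pods
            if pod["ready"] and selector_matches(selector, pod["labels"])]
```

A pod may carry more labels than the selector names and still match; a pod that matches but is not ready is left out of the endpoint list, which is how the set of endpoints keeps changing while the service itself stays stable.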

A pod by itself has no built-in resilience: if it fails for any reason, it is gone. A higher-level primitive - the deployment - is used to manage a set of pods from a high availability perspective: the deployment ensures that a specific number of equivalent pods is always running, and if one or more pods fail, the deployment brings up replacement pods. The deployment relies on an intermediary concept - the ReplicaSet. Deployments are used to implement rolling updates and rollbacks. There are other pod controllers that manage sets of pods in different ways: DaemonSets and StatefulSets. Individual pods can be managed as Jobs or CronJobs. The pod controllers are discussed in Kubernetes Workload Resources.
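The replica-keeping behavior can be sketched as a small planning function. This is a simplification under stated assumptions, not how the ReplicaSet controller is implemented: pods are reduced to a name and a phase string, only "Running" and "Pending" pods count toward the desired number, and failed pods are simply ignored rather than cleaned up. The function name is hypothetical.

```python
from typing import Dict, List, Tuple


def plan_replicas(desired: int,
                  pod_phases: Dict[str, str]) -> Tuple[int, List[str]]:
    """Return (number of pods to create, names of surplus pods to delete).

    pod_phases maps a pod name to its phase. Pods in any phase other than
    "Running" or "Pending" do not count toward the desired number.
    """
    active = sorted(name for name, phase in pod_phases.items()
                    if phase in ("Running", "Pending"))
    # Too few active pods: replacements are needed.
    to_create = max(0, desired - len(active))
    # Too many active pods: everything past the desired count is surplus.
    to_delete = active[desired:]
    return to_create, to_delete
```

With three desired replicas and one of three pods failed, the plan is to create one replacement and delete nothing; scaling the desired count down instead produces a list of surplus pods.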

Most Kubernetes resources can be logically grouped in namespaces. Services, pods, secrets, and many others are all namespaced. New namespaces can be created administratively, and all Kubernetes clusters come with a "default" namespace. There are resource types that cannot be allocated to namespaces, and those are named "cluster-level" resources. More details about namespaces are available in Namespace Concepts.

All Kubernetes resources can carry labels and annotations, which make it easier to create loose associations between resources. For more details see Kubernetes Labels and Annotations.

A Kubernetes cluster exposes external storage to pods with three API resources: PersistentVolumes, PersistentVolumeClaims and StorageClasses, which are part of the persistent volume subsystem. The actual storage is made available to Kubernetes by storage plugins, also known as provisioners, which should abide by the Container Storage Interface (CSI). All these are explained at length in Storage Concepts.

Every pod in the Kubernetes cluster has its own IP address, which is directly routable to every other pod. The stable IP addresses provided by services are resolvable by the cluster's internal DNS service, as described in the DNS Concepts. This and other networking-related aspects are explained in Networking Concepts.
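As a small illustration of how a service name becomes resolvable, the sketch below builds the fully qualified DNS name conventionally assigned to a service, <service>.<namespace>.svc.<cluster domain>. The helper name is hypothetical, and the default cluster domain cluster.local is an assumption that holds for many clusters but is configurable.

```python
def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Return the fully qualified DNS name of a service.

    cluster_domain defaults to "cluster.local", which is an assumption;
    a cluster can be configured with a different domain.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"
```

A client in another namespace would typically use the full name, for example service_dns_name("web", "prod"), which gives web.prod.svc.cluster.local.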

Configuration can be exposed to pods through specialized resources such as ConfigMaps and Secrets. Kubernetes exposes pod information to the containers running inside the pod through files, which are projected into the container by a mechanism known as the Downward API. More details about configuration are available in Configuration Concepts.
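From the application's point of view, file-based configuration is just a directory to read. The sketch below assumes that a ConfigMap, Secret or Downward API volume has been mounted into the container so that each key appears as one file; the function name and any mount path passed to it (for example /etc/config) are hypothetical.

```python
import os
from typing import Dict


def load_projected_dir(path: str) -> Dict[str, str]:
    """Read every visible regular file in a directory into a dict.

    Keys are file names and values are file contents. Entries whose names
    start with a dot are skipped, since mounted volumes may contain hidden
    bookkeeping entries next to the actual keys.
    """
    values: Dict[str, str] = {}
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        # os.path.isfile follows symbolic links, so linked files count.
        if entry.startswith(".") or not os.path.isfile(full):
            continue
        with open(full, encoding="utf-8") as handle:
            values[entry] = handle.read()
    return values
```

An application could call load_projected_dir on its mount path at startup and again whenever it wants to pick up changed values.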

The Kubernetes security system ensures that the API server is accessed only by authenticated identities, and that access is limited to the resources each authenticated identity is supposed to reach. It also ensures that applications running in containers access only the node and network resources they are supposed to access, and nothing more. These aspects are discussed in Security Concepts.

Resources managed by Kubernetes are subject to policies: Limit Ranges, Resource Quotas and, specifically for pods, Pod Security Policies (removed in Kubernetes 1.25 in favor of Pod Security Admission). More details on resource management are available in Resource Management Concepts.

Application health monitoring, resource consumption monitoring and scaling decisions require metrics to be collected and analyzed. Kubernetes facilitates metrics collection from containers, pods, services and other resources via metric pipelines. Metrics and metric pipelines are discussed in Metrics in Kubernetes.

Pods can be automatically scaled up and down based on an interpretation of their performance characteristics or resource consumption. Kubernetes provides a built-in autoscaling mechanism. For more details see Autoscaling Concepts.
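To make the scaling decision concrete, the sketch below applies the ratio-based rule documented for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio between the observed metric and its target, rounded up. The function name is hypothetical, the default tolerance of 0.1 is an assumption about the cluster's configuration, and minimum/maximum bounds and stabilization windows are deliberately left out.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Return the replica count suggested by a ratio-based scaling rule.

    desired = ceil(current_replicas * current_metric / target_metric)
    If the ratio is within the tolerance band around 1.0, no scaling
    is suggested and the current count is returned unchanged.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, four replicas observing twice the target value lead to eight replicas, while a reading only five percent above target leaves the count unchanged.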

Kubernetes has built-in extension capabilities, allowing for custom resources and for registering multiple API servers via the aggregation layer. Operators are a specific kind of extension. More details are available in Extending Kubernetes.

Spec and status: the spec of a resource is the input (the desired state) and the status is the output (the observed state).

Declarative versus Imperative Approach

The preferred style while operating Kubernetes is to use a declarative model. Kubernetes likes to manage its resources declaratively: we describe how we want our application to look - the desired state - in a set of YAML files, named manifests, POST these files to the Kubernetes API Server with kubectl apply or other tools, and wait for the changes to be applied. The controller manager and specialized controllers check whether the current state matches the desired state, and if the states do not match, they act to reconcile them, which usually happens after a short delay.

This pattern is referred to as a control loop. A "control loop" is a design pattern for distributed software in which state is defined declaratively and a controller is employed to bring the current state to the desired state. The controller typically obtains the desired state, repeatedly observes the current state, determines the differences and, if differences exist, reconciles them. The term "control loop" is used interchangeably with "watch loop" and "reconciliation loop".
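The observe, diff and reconcile cycle can be sketched in a few lines. The example is a toy under stated assumptions: both desired and current state are in-memory dictionaries keyed by resource name, "acting" means copying or deleting an entry, and all function names are hypothetical. Real controllers watch the API Server and act on actual resources.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # (verb, resource name)


def diff(desired: Dict[str, dict], current: Dict[str, dict]) -> List[Action]:
    """Determine the actions needed to turn current into desired."""
    actions: List[Action] = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile_once(desired: Dict[str, dict],
                   current: Dict[str, dict]) -> List[Action]:
    """Observe the difference and apply it to current (mutated in place)."""
    actions = diff(desired, current)
    for verb, name in actions:
        if verb == "delete":
            del current[name]
        else:
            current[name] = dict(desired[name])
    return actions


def control_loop(desired: Dict[str, dict], current: Dict[str, dict],
                 max_iterations: int = 10) -> int:
    """Reconcile until nothing differs; return iterations that did work."""
    for iteration in range(max_iterations):
        if not reconcile_once(desired, current):
            return iteration
    return max_iterations
```

Running the loop a second time against an already converged state performs no work, which is the property that makes re-applying the same manifest safe.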

Step-by-step, the declarative model works as follows:

  1. The desired state of the application is declared in the manifest file.
  2. The manifest file is POSTed to the API Server, usually with the kubectl command.
  3. The API server authenticates and authorizes the request and then validates the manifest.
  4. The API server stores the state - as desired state - in the cluster store.
  5. The API server identifies the controller responsible for enforcing and monitoring the state.
  6. The controller in charge implements the desired state, by adjusting the current state to match it.
  7. The controller manager's control loops monitor the current state and make sure that it does not diverge from the desired state. The current state of a resource can be obtained from the cluster with CLI commands.
  8. If the current state of the cluster diverges from the desired state, the cluster control plane will perform whatever tasks are necessary to bring those states in sync.
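Steps 3 and 4 above, the request checks followed by persisting the desired state, can be sketched as a single function. Everything here is a stand-in chosen for illustration: users and permissions are plain sets and dictionaries, the cluster store is a dictionary, validation only checks that kind, metadata.name and spec are present, and the names ApiError and handle_apply are hypothetical.

```python
from typing import Dict, Set, Tuple


class ApiError(Exception):
    """Raised when a request is rejected by one of the checks."""


def handle_apply(user: str, manifest: dict, known_users: Set[str],
                 permissions: Dict[str, Set[str]],
                 store: Dict[Tuple[str, str], dict]) -> Tuple[str, str]:
    """Authenticate, authorize and validate a manifest, then store it.

    Returns the (kind, name) key under which the desired state was saved.
    """
    # Step 3: authenticate the caller.
    if user not in known_users:
        raise ApiError("unauthenticated")
    # Step 3: authorize the caller for this kind of resource.
    kind = manifest.get("kind")
    if kind not in permissions.get(user, set()):
        raise ApiError("forbidden")
    # Step 3: validate the manifest (only a minimal shape check here).
    name = manifest.get("metadata", {}).get("name")
    if not name or "spec" not in manifest:
        raise ApiError("invalid")
    # Step 4: persist the desired state in the (toy) cluster store.
    key = (kind, name)
    store[key] = manifest["spec"]
    return key
```

A rejected request leaves the store untouched, so controllers only ever reconcile toward state that passed all three checks.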

This model is the opposite of the traditional imperative model, where a precise sequence of specific commands is issued to explicitly adjust the state. In other words, in the declarative model we tell the cluster how things should look, as opposed to telling it how to adjust the state.

Also see:
Declarative Infrastructure Languages


Kubernetes Flavors