Kubernetes Container Probes
Revision as of 05:41, 22 September 2020
External
- https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes
- https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes
- https://medium.com/swlh/fantastic-probes-and-how-to-configure-them-fef7e030bd2f
Internal
TODO
- Merge and deplete OpenShift Container Probes.
Overview
A probe is a diagnostic performed periodically by the kubelet on a container. To perform the diagnostic, the kubelet calls a handler that must be declared and implemented by the container. Each probe has one of these results:
- success - the container passed the diagnostic
- failure - the container failed the diagnostic
- unknown - the diagnostic itself failed, so no action should be taken.
There are three kinds of probes: startup, liveness and readiness.
Handlers
A handler is a piece of logic declared and implemented by the container, which is used by the Kubernetes control mechanism to draw conclusions about the state the container is in. There are three types of handlers, described below. Any of these handlers can be used to perform startup, liveness and readiness checks.
ExecAction
The exec action (declared as "exec:") executes a specified command inside the container. The diagnostic is considered successful if the command exits with a status code of 0.
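As an illustration, an exec handler could be declared as in the sketch below. The marker file /tmp/healthy and the timing value are assumptions made for the example, not something this page prescribes.

```yaml
# Sketch only: /tmp/healthy is a hypothetical marker file maintained by the application.
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy      # exit status 0 (the file can be read) counts as success
  periodSeconds: 10     # assumed value
```

Any command available inside the container image can be used, as long as its exit status reflects the state being checked.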
HTTPGetAction
Performs an HTTP GET request against the container’s IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 180
  timeoutSeconds: 30
  periodSeconds: 25
TCPSocketAction
Performs a TCP check against the container’s IP address on a specified port. The diagnostic is considered successful if the connection is successfully established.
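A TCP handler could be declared as in the following sketch. The port number and timing value are assumed for illustration only.

```yaml
# Sketch only: 9042 is an assumed port the container listens on.
readinessProbe:
  tcpSocket:
    port: 9042          # success means the TCP connection could be established
  periodSeconds: 10     # assumed value
```

This form only proves that something accepts connections on the port. It says nothing about whether the application behind it can serve requests.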
Container and Pod Startup Check
The startup check is performed by a startup probe. Startup probes were introduced in Kubernetes 1.16. The probe indicates whether the application within the container has started. If a startup probe is not provided, the default result is "success". If a startup probe is provided, all other probes are disabled until the startup probe succeeds. If the startup probe fails, the container is killed and is then subject to its restart policy. TODO: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-startup-probe
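A startup probe for a slow-starting application might look like the sketch below. The /healthz path, the port and the thresholds are assumptions chosen for the example.

```yaml
# Sketch only: endpoint, port and thresholds are assumed values.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30  # up to 30 x 10 = 300 s allowed for startup before the container is killed
```

With these values the application gets about five minutes to start. Liveness and readiness probes declared on the same container only begin running once the startup probe has succeeded.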
Container and Pod Liveness Check
The liveness check is performed by a liveness probe. The probe indicates whether the container is running. If a liveness probe is not provided, the default is "success". If a liveness probe is provided and it fails, the container will be killed and then subjected to its restart policy. (Not the pod? How about atomicity?)
Relationship between killing pods and containers - research needed. TODO: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-liveness-probe
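The sketch below shows where a liveness probe and the restart policy sit in a pod manifest. The pod name, image, endpoint and timing values are all hypothetical.

```yaml
# Sketch only: name, image, endpoint and timings are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-example
spec:
  restartPolicy: Always        # pod-level policy applied after a container is killed
  containers:
  - name: app
    image: example/app:1.0
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 3      # killed after 3 consecutive failed attempts
```

Restarts triggered this way show up in the RESTARTS column of the kubectl get pod output.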
Container and Pod Readiness Check
A container's IP address and port pair is added to a service's Endpoints list, and traffic is forwarded to it, if the service selector matches the pod labels and the pod is "ready", meaning that all of the pod's containers that expose ports through the service are ready. Usually a pod exposes just one container, so "ready pod" and "ready container" are equivalent in this case. The situation where there are multiple exposed containers per pod is addressed in detail in the Multiple Containers per Pod section.
A container is considered ready when its readiness probe executes successfully. Once a container endpoint is added to the Endpoints instance, the corresponding readiness probe is invoked periodically and the endpoint stays in the Endpoints list as long as the probe executes successfully. The endpoint is removed from the list on probe failure, but it can be added again if the probe starts succeeding again.
The notion of being "ready" is specific to each container. For example, in the initialization phase of the pod, its traffic-serving container may need time to load configuration or data, or it may need to perform a warm-up procedure to prevent the first user request from taking too long and affecting user experience. The readiness probe should be designed so that it starts succeeding only after initialization.
Containers that serve load in production should always define a readiness probe.
Readiness probes should not be used for taking pods out of load in an orderly fashion. That should be done either by deleting the pod, or by defining a label such as "enabled=true" that can be switched on or off.
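The label-based approach can be sketched with a service whose selector includes the switch label. The service name, the app label and the ports are hypothetical. Only the "enabled" label comes from the text above.

```yaml
# Sketch only: service name, app label and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: example
spec:
  selector:
    app: example
    enabled: "true"     # pods without this label value are not selected
  ports:
  - port: 80
    targetPort: 8080
```

Changing the label on a pod, for example with kubectl label pod <pod-name> enabled=false --overwrite, makes the selector stop matching, so the pod is dropped from the Endpoints list without being killed.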
Multiple Containers per Pod
If a pod defines multiple containers, each container can declare its own readiness probe. The pod is considered ready when all of its containers are ready. If at least one container is not ready, even if all others are ready, the pod will not count as "ready" and it will not be added to a service's Endpoints list.
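A pod with two containers, each declaring its own readiness probe, could look like the sketch below. All names, images, ports and paths are hypothetical.

```yaml
# Sketch only: names, images, ports and paths are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: two-containers
  labels:
    app: example
spec:
  containers:
  - name: web
    image: example/web:1.0
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /health
        port: 8080
  - name: sidecar
    image: example/sidecar:1.0
    readinessProbe:
      tcpSocket:
        port: 9000
```

While only one of the two probes succeeds, kubectl get pod reports READY as 1/2 and the pod stays out of the Endpoints list. It is added once both containers are ready.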
Readiness Probe Operations
If the container does not provide a readiness probe, the default diagnostic result is "success".
If a probe is declared, the default state of readiness before the initial delay is "failure". The initial delay and probe timing arithmetic is explained in the Probe Template section.
The pod's readiness state is displayed in the output of the kubectl get pod command:
NAME          READY   STATUS    RESTARTS   AGE
cassandra-0   0/1     Running   0          23s
If the readiness probe fails, the pod is removed from the Endpoints list. If the pod then becomes ready again, it is re-added.
Unlike a liveness probe, if a container fails the readiness check, it will not be killed or restarted.
Note that on deletion the pod puts itself into an unready state regardless of whether a readiness probe exists. The pod remains in the unready state while it waits for its containers to stop.
Manual Readiness Probe Example
readinessProbe:
  exec:
    command:
    - ls
    - /tmp/ready
  initialDelaySeconds: 1
  periodSeconds: 1
  successThreshold: 1
  failureThreshold: 1
  timeoutSeconds: 1
Probe Template
The probe templates are sub-trees in the pod manifest.
kind: Pod
spec:
  containers:
  - name: ...
    readinessProbe|livenessProbe:
      exec:
Example:
readinessProbe|livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - nodetool status | grep -E "^UN\s+${POD_IP}"
  initialDelaySeconds: 90
  periodSeconds: 30
  successThreshold: 1
  failureThreshold: 3
  timeoutSeconds: 5
Elements
Also see Readiness Probe Operations above.
initialDelaySeconds
Specifies the number of seconds after the container has started before the probe is executed for the first time. After the initial delay, the probe is invoked periodically, with a periodicity of periodSeconds seconds.
periodSeconds
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1. If the probe executes successfully, the next invocation will be executed in periodSeconds seconds.
timeoutSeconds
Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. A probe attempt that times out is counted as a failure.
failureThreshold
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
successThreshold
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup probes. Minimum value is 1.
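To make the timing arithmetic concrete, the sketch below annotates the values from the Probe Template example. The times in the comments are approximate, since the kubelet does not schedule probes with exact precision, and the choice of readinessProbe here is only for illustration.

```yaml
# Timing sketch with approximate figures, using the values from the Probe Template example.
readinessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - nodetool status | grep -E "^UN\s+${POD_IP}"
  initialDelaySeconds: 90   # no probe runs before about 90 s after the container starts
  periodSeconds: 30         # after that, probes repeat about every 30 s
  timeoutSeconds: 5         # each attempt is given up to 5 s to complete
  successThreshold: 1       # a single success marks the container ready
  failureThreshold: 3       # 3 consecutive failures mark a ready container unready,
                            # roughly 2 x 30 = 60 s after the first failed attempt
```

With these values a healthy container becomes ready no sooner than about 90 seconds after it starts, and a container that stops responding is taken out of the Endpoints list after about one to one and a half minutes.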