OpenShift Container Probes
Latest revision as of 00:39, 1 November 2019
Container Probe
Users can configure container probes for liveness or readiness. They are sometimes referred to as "pod probes", but they are configured at container level, not pod level. Each container can have its own set of probes, which are executed and return results independently. They are specified in the pod template.
Kubernetes executes a probe periodically. A probe is a diagnostic performed on the container, with one of the following results: Success, which means the container passed the diagnostic; Failure, which means the container failed the diagnostic; and Unknown, which means the diagnostic execution itself failed and no action should be taken.
Liveness Probe
A liveness probe indicates whether the container is running. If the liveness probe fails, Kubernetes kills the container, and the container is subjected to its restart policy, as described in Liveness Probe Failure. If a container does not provide a liveness probe, the liveness diagnostic is considered successful by default.
The following sequence should go in the container declaration from the pod template, at the same level as "name":
 livenessProbe:
   initialDelaySeconds: 30
   timeoutSeconds: 1
   successThreshold: 1
   failureThreshold: 3
   periodSeconds: 10
   tcpSocket:
     port: 5432
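The tcpSocket probe above only verifies that the port accepts connections. For a container that serves HTTP, an httpGet probe is a common alternative. The fragment below is a minimal sketch, not taken from the original configuration: the path /healthz and the port 8080 are assumptions chosen for illustration, and the timing values are simply copied from the example above. An httpGet probe is considered successful when the response status code is at least 200 and lower than 400.

```yaml
livenessProbe:
  initialDelaySeconds: 30
  timeoutSeconds: 1
  successThreshold: 1      # must be 1 for liveness probes
  failureThreshold: 3
  periodSeconds: 10
  httpGet:
    path: /healthz         # hypothetical health endpoint exposed by the application
    port: 8080             # hypothetical container port
    scheme: HTTP
```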
Readiness Probe
A readiness probe is deployed in a container to expose whether the container is ready to service requests. If a container does not provide a readiness probe, its readiness state after creation is "Success" by default. On readiness probe failure, Kubernetes stops sending traffic to that specific pod by removing the corresponding endpoint from the service, as described in the Readiness Probe Failure section. (Open question: what about the router?)
A readiness probe is useful when we want to automatically stop sending traffic to a pod that enters an unstable state, and resume sending traffic if and when it recovers. It can also be used to implement a mechanism for taking the container down for maintenance. Note that if you just want to drain requests when the pod is deleted, you do not necessarily need a readiness probe: on deletion, the pod automatically puts itself into an unready state regardless of whether a readiness probe exists, and it remains in that state while it waits for its containers to stop.
The following sequence should go in the container declaration from the pod template, at the same level as "name":
 readinessProbe:
   initialDelaySeconds: 5
   timeoutSeconds: 1
   successThreshold: 1
   failureThreshold: 3
   periodSeconds: 10
   exec:
     command:
     - /bin/sh
     - -i
     - -c
     - psql -h 127.0.0.1 -U $POSTGRESQL_USER -q -d $POSTGRESQL_DATABASE -c 'SELECT 1'
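To show where the two fragments sit, the sketch below places both probes in a complete pod manifest, at the same level as the container "name". It is an illustration only: the pod name, the image reference and the environment variable values are hypothetical placeholders, and only the probe definitions come from the examples above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgresql-example                          # hypothetical pod name
spec:
  restartPolicy: Always                             # applied when the liveness probe fails
  containers:
  - name: postgresql
    image: registry.example.com/postgresql:latest   # hypothetical image reference
    env:
    - name: POSTGRESQL_USER
      value: example-user                           # hypothetical value
    - name: POSTGRESQL_DATABASE
      value: example-db                             # hypothetical value
    ports:
    - containerPort: 5432
    livenessProbe:                                  # same level as "name"
      initialDelaySeconds: 30
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 3
      periodSeconds: 10
      tcpSocket:
        port: 5432
    readinessProbe:                                 # same level as "name"
      initialDelaySeconds: 5
      timeoutSeconds: 1
      successThreshold: 1
      failureThreshold: 3
      periodSeconds: 10
      exec:
        command:
        - /bin/sh
        - -i
        - -c
        - psql -h 127.0.0.1 -U $POSTGRESQL_USER -q -d $POSTGRESQL_DATABASE -c 'SELECT 1'
```

Each container in the pod would carry its own livenessProbe and readinessProbe entries, since probes are evaluated per container.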
Probe Operations
After the container is started, Kubernetes waits for initialDelaySeconds, specified in seconds, then it triggers the execution of the probe specified by "exec", "httpGet", "tcpSocket", etc. Once the probe execution is started, Kubernetes waits for timeoutSeconds (default 1 second) for the probe execution to complete.
If the probe execution is successful, the success counts towards the successThreshold. After a failure, the number of consecutive successful executions specified in successThreshold must be counted for the container to be considered as passing the probe. The default value is 1, and for liveness probes the value must be 1.
If the probe does not complete within timeoutSeconds seconds or it explicitly fails, the failure counts towards the failureThreshold. The number of consecutive failed executions specified in failureThreshold must be counted before the container is considered as failing the probe.
The probe is executed periodically, every periodSeconds seconds.
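As a worked example of how these parameters interact, the fragment below annotates the readiness probe values used earlier on this page. The timings in the comments are approximations derived from the description above; they are not measured values, and the actual schedule kept by Kubernetes may differ slightly.

```yaml
readinessProbe:
  initialDelaySeconds: 5   # first probe runs roughly 5 s after the container starts
  periodSeconds: 10        # subsequent probes run roughly every 10 s
  timeoutSeconds: 1        # an execution that takes longer than 1 s counts as a failure
  failureThreshold: 3      # 3 consecutive failures mark the container as not ready;
                           # counting from the first failed execution, this takes
                           # about 2 x periodSeconds = 20 s, plus the probe timeout
  successThreshold: 1      # after a failure, 1 successful execution marks it ready again
  tcpSocket:
    port: 5432
```

Lowering periodSeconds or failureThreshold shortens the detection time, at the cost of more frequent probe executions and a higher chance of reacting to a transient failure.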
Liveness Probe Failure
If the liveness probe fails, Kubernetes kills the container and the container is subjected to its restart policy. A liveness probe that fails occasionally is indicated by the number of restarts:
 NAME                   READY   STATUS    RESTARTS   AGE
 rest-service-1-9p9hj   1/1     Running   3          1m
Note that a pod will maintain its name after a restart.
If the liveness probe fails consistently, the pod enters a CrashLoopBackOff state (open question: what exactly is the condition that makes it go from "Running" to "CrashLoopBackOff"?):
 NAME                   READY   STATUS             RESTARTS   AGE
 rest-service-1-9p9hj   0/1     CrashLoopBackOff   5          3m
Readiness Probe Failure
If the readiness probe fails, the EndpointsController removes the Pod’s IP address from the endpoints of all Services that match the Pod. The service will still exist, but it will list fewer endpoints. If the service is backed by a single-replica pod, it will have zero endpoints.
The container will still show in a Running phase (status), but it will not be "READY".
 NAME                      READY   STATUS    RESTARTS   AGE
 po/rest-service-3-bm1t9   0/1     Running   0          2m
Note that if the pod "heals", meaning that the readiness probe passes successThreshold consecutive times, the endpoint is automatically re-attached to the service.