Configure Liveness, Readiness and Startup Probes (2024)

This page shows how to configure liveness, readiness and startup probes for containers.

The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.

A common pattern for liveness probes is to use the same low-cost HTTP endpoint as for readiness probes, but with a higher failureThreshold. This ensures that the Pod is observed as not-ready for some period of time before it is hard killed.
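As a minimal sketch of this pattern, the fragment below points both probes at the same endpoint. The path /healthz, the port 8080, and the threshold values are illustrative assumptions, not recommendations.

```yaml
# Fragment of a container spec. Path, port and thresholds are assumed values.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5
  failureThreshold: 1    # marked not-ready after a single failed check
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5
  failureThreshold: 6    # restarted only after 6 consecutive failed checks
```

With these values, the container would be reported as not-ready after the first failed check, and restarted only after about 30 seconds (6 × 5s) of consecutive failures.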

The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.

The kubelet uses startup probes to know when a container application has started. If such a probe is configured, liveness and readiness probes do not start until it succeeds, making sure those probes don't interfere with the application startup. This can be used to adopt liveness checks on slow starting containers, avoiding them getting killed by the kubelet before they are up and running.

Caution: Liveness probes can be a powerful way to recover from application failures, but they should be used with caution. Liveness probes must be configured carefully to ensure that they truly indicate unrecoverable application failure, for example a deadlock.

Note: Incorrect implementation of liveness probes can lead to cascading failures: containers restart under high load, client requests fail as your application becomes less scalable, and the workload on the remaining Pods increases because some Pods have failed. Understand the difference between readiness and liveness probes and when to apply them for your app.

Before you begin

You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube, or you can use an online Kubernetes playground.

Define a liveness command

Many applications running for long periods of time eventually transition to broken states, and cannot recover except by being restarted. Kubernetes provides liveness probes to detect and remedy such situations.

In this exercise, you create a Pod that runs a container based on the registry.k8s.io/busybox image. Here is the configuration file for the Pod:

pods/probe/exec-liveness.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5

In the configuration file, you can see that the Pod has a single Container. The periodSeconds field specifies that the kubelet should perform a liveness probe every 5 seconds. The initialDelaySeconds field tells the kubelet that it should wait 5 seconds before performing the first probe. To perform a probe, the kubelet executes the command cat /tmp/healthy in the target container. If the command succeeds, it returns 0, and the kubelet considers the container to be alive and healthy. If the command returns a non-zero value, the kubelet kills the container and restarts it.

When the container starts, it executes this command:

/bin/sh -c "touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600"

For the first 30 seconds of the container's life, there is a /tmp/healthy file. So during the first 30 seconds, the command cat /tmp/healthy returns a success code. After 30 seconds, cat /tmp/healthy returns a failure code.

Create the Pod:

kubectl apply -f https://k8s.io/examples/pods/probe/exec-liveness.yaml

Within 30 seconds, view the Pod events:

kubectl describe pod liveness-exec

The output indicates that no liveness probes have failed yet:

Type    Reason     Age   From               Message
----    ------     ----  ----               -------
Normal  Scheduled  11s   default-scheduler  Successfully assigned default/liveness-exec to node01
Normal  Pulling    9s    kubelet, node01    Pulling image "registry.k8s.io/busybox"
Normal  Pulled     7s    kubelet, node01    Successfully pulled image "registry.k8s.io/busybox"
Normal  Created    7s    kubelet, node01    Created container liveness
Normal  Started    7s    kubelet, node01    Started container liveness

After 35 seconds, view the Pod events again:

kubectl describe pod liveness-exec

At the bottom of the output, there are messages indicating that the liveness probes have failed, and the failed containers have been killed and recreated.

Type     Reason     Age                From               Message
----     ------     ----               ----               -------
Normal   Scheduled  57s                default-scheduler  Successfully assigned default/liveness-exec to node01
Normal   Pulling    55s                kubelet, node01    Pulling image "registry.k8s.io/busybox"
Normal   Pulled     53s                kubelet, node01    Successfully pulled image "registry.k8s.io/busybox"
Normal   Created    53s                kubelet, node01    Created container liveness
Normal   Started    53s                kubelet, node01    Started container liveness
Warning  Unhealthy  10s (x3 over 20s)  kubelet, node01    Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal   Killing    10s                kubelet, node01    Container liveness failed liveness probe, will be restarted

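Wait another 30 seconds, and verify that the container has been restarted:

kubectl get pod liveness-exec

The RESTARTS column in the output shows how many times the container has been restarted.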

Define a liveness HTTP request

Another kind of liveness probe uses an HTTP GET request. Here is the configuration file for a Pod that runs a container based on the registry.k8s.io/e2e-test-images/agnhost image.

pods/probe/http-liveness.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/e2e-test-images/agnhost:2.40
    args:
    - liveness
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

In the configuration file, you can see that the Pod has a single container. The periodSeconds field specifies that the kubelet should perform a liveness probe every 3 seconds. The initialDelaySeconds field tells the kubelet that it should wait 3 seconds before performing the first probe. To perform a probe, the kubelet sends an HTTP GET request to the server that is running in the container and listening on port 8080. If the handler for the server's /healthz path returns a success code, the kubelet considers the container to be alive and healthy. If the handler returns a failure code, the kubelet kills the container and restarts it.

Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure.

You can see the source code for the server in server.go.

For the first 10 seconds that the container is alive, the /healthz handler returns a status of 200. After that, the handler returns a status of 500.

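http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
    // started holds the time at which the server began running.
    duration := time.Now().Sub(started)
    if duration.Seconds() > 10 {
        // More than 10 seconds after startup: report failure.
        w.WriteHeader(500)
        w.Write([]byte(fmt.Sprintf("error: %v", duration.Seconds())))
    } else {
        // Within the first 10 seconds: report success.
        w.WriteHeader(200)
        w.Write([]byte("ok"))
    }
})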

The kubelet starts performing health checks 3 seconds after the container starts. So the first couple of health checks will succeed. But after 10 seconds, the health checks will fail, and the kubelet will kill and restart the container.

To try the HTTP liveness check, create a Pod:

kubectl apply -f https://k8s.io/examples/pods/probe/http-liveness.yaml

After 10 seconds, view Pod events to verify that liveness probes have failed and the container has been restarted:

kubectl describe pod liveness-http

In releases after v1.13, local HTTP proxy environment variable settings do not affect the HTTP liveness probe.

Define a TCP liveness probe

A third type of liveness probe uses a TCP socket. With this configuration, the kubelet will attempt to open a socket to your container on the specified port. If it can establish a connection, the container is considered healthy; if it can't, it is considered a failure.

pods/probe/tcp-liveness-readiness.yaml

apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: registry.k8s.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10

As you can see, configuration for a TCP check is quite similar to an HTTP check. This example uses both readiness and liveness probes. The kubelet will send the first readiness probe 15 seconds after the container starts. This will attempt to connect to the goproxy container on port 8080. If the probe succeeds, the Pod will be marked as ready. The kubelet will continue to run this check every 10 seconds.

In addition to the readiness probe, this configuration includes a liveness probe. The kubelet will run the first liveness probe 15 seconds after the container starts. Similar to the readiness probe, this will attempt to connect to the goproxy container on port 8080. If the liveness probe fails, the container will be restarted.

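To try the TCP liveness check, create a Pod:

kubectl apply -f https://k8s.io/examples/pods/probe/tcp-liveness-readiness.yaml

After 15 seconds, view Pod events to verify the liveness probes:

kubectl describe pod goproxy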

Define a gRPC liveness probe

FEATURE STATE: Kubernetes v1.27 [stable]

If your application implements the gRPC Health Checking Protocol, this example shows how to configure Kubernetes to use it for application liveness checks. Similarly, you can configure readiness and startup probes.

Here is an example manifest:

pods/probe/grpc-liveness.yaml

apiVersion: v1
kind: Pod
metadata:
  name: etcd-with-grpc
spec:
  containers:
  - name: etcd
    image: registry.k8s.io/etcd:3.5.1-0
    command: [ "/usr/local/bin/etcd", "--data-dir", "/var/lib/etcd", "--listen-client-urls", "http://0.0.0.0:2379", "--advertise-client-urls", "http://127.0.0.1:2379", "--log-level", "debug"]
    ports:
    - containerPort: 2379
    livenessProbe:
      grpc:
        port: 2379
      initialDelaySeconds: 10

To use a gRPC probe, port must be configured. If you want to distinguish probes of different types and probes for different features, you can use the service field. You can set service to the value liveness and make your gRPC Health Checking endpoint respond to this request differently than when you set service to readiness. This lets you use the same endpoint for different kinds of container health check rather than listening on two different ports. If you want to specify your own custom service name and also specify a probe type, the Kubernetes project recommends that you use a name that concatenates those. For example: myservice-liveness (using - as a separator).
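The fragment below is a sketch of how the service field might be used to separate liveness and readiness checks on one port. The name myservice-readiness is a hypothetical counterpart to the myservice-liveness example above, and the port is borrowed from the etcd manifest; your gRPC health server would need to report a status for each of these service names.

```yaml
# Fragment of a container spec. Service names and port are assumed values.
livenessProbe:
  grpc:
    port: 2379
    service: myservice-liveness     # custom service name plus probe type
readinessProbe:
  grpc:
    port: 2379                      # same port as the liveness probe
    service: myservice-readiness    # hypothetical name, same convention
```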

Note: Unlike HTTP or TCP probes, you cannot specify the health check port by name, and you cannot configure a custom hostname.

Configuration problems (for example: incorrect port or service, unimplemented health checking protocol) are considered a probe failure, similar to HTTP and TCP probes.

To try the gRPC liveness check, create a Pod using the command below. In the example below, the etcd pod is configured to use gRPC liveness probe.

kubectl apply -f https://k8s.io/examples/pods/probe/grpc-liveness.yaml

After 15 seconds, view Pod events to verify that the liveness check has not failed:

kubectl describe pod etcd-with-grpc

When using a gRPC probe, there are some technical details to be aware of:

The probes run against the pod IP address or its hostname. Be sure to configure your gRPC endpoint to listen on the Pod's IP address.
The probes do not support any authentication parameters (like -tls).
There are no error codes for built-in probes. All errors are considered as probe failures.
If the ExecProbeTimeout feature gate is set to false, grpc-health-probe does not respect the timeoutSeconds setting (which defaults to 1s), while the built-in probe would fail on timeout.

Use a named port

You can use a named port for HTTP and TCP probes. gRPC probes do not support named ports.

For example:

ports:
- name: liveness-port
  containerPort: 8080

livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port

Protect slow starting containers with startup probes

Sometimes, you have to deal with legacy applications that might require an additional startup time on their first initialization. In such cases, it can be tricky to set up liveness probe parameters without compromising the fast response to deadlocks that motivated such a probe. The trick is to set up a startup probe with the same command, HTTP or TCP check, with a failureThreshold * periodSeconds long enough to cover the worst case startup time.

So, the previous example would become:

ports:
- name: liveness-port
  containerPort: 8080

livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 1
  periodSeconds: 10

startupProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10

Thanks to the startup probe, the application will have a maximum of 5 minutes (30 * 10 = 300s) to finish its startup. Once the startup probe has succeeded, the liveness probe takes over to provide a fast response to container deadlocks. If the startup probe never succeeds, the container is killed after 300s and subject to the pod's restartPolicy.

Define readiness probes

Sometimes, applications are temporarily unable to serve traffic. For example, an application might need to load large data or configuration files during startup, or depend on external services after startup. In such cases, you don't want to kill the application, but you don't want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.

Note: Readiness probes run on the container during its whole lifecycle.

Caution: The readiness and liveness probes do not depend on each other to succeed. If you want to wait before executing a readiness probe, you should use initialDelaySeconds or a startupProbe.

Readiness probes are configured similarly to liveness probes. The only difference is that you use the readinessProbe field instead of the livenessProbe field.

readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5

Configuration for HTTP and TCP readiness probes also remains identical to liveness probes.
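For illustration, an HTTP readiness probe might look like the sketch below. It reuses the path, port, and timing values from the liveness-http manifest shown earlier; they are assumptions for this example rather than recommended settings.

```yaml
# Fragment of a container spec. Only the field name differs from the
# equivalent livenessProbe; path, port and timings are assumed values.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
```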

Readiness and liveness probes can be used in parallel for the same container. Using both can ensure that traffic does not reach a container that is not ready for it, and that containers are restarted when they fail.

Configure Probes

Probes have a number of fields that you can use to more precisely control the behavior of startup, liveness and readiness checks:

initialDelaySeconds: Number of seconds after the container has started before startup, liveness or readiness probes are initiated. If a startup probe is defined, liveness and readiness probe delays do not begin until the startup probe has succeeded. If the value of periodSeconds is greater than initialDelaySeconds then the initialDelaySeconds would be ignored. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. The minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup Probes. Minimum value is 1.
failureThreshold: After a probe fails failureThreshold times in a row, Kubernetes considers that the overall check has failed: the container is not ready/healthy/live. For the case of a startup or liveness probe, if at least failureThreshold probes have failed, Kubernetes treats the container as unhealthy and triggers a restart for that specific container. The kubelet honors the setting of terminationGracePeriodSeconds for that container. For a failed readiness probe, the kubelet continues running the container that failed checks, and also continues to run more probes; because the check failed, the kubelet sets the Ready condition on the Pod to false.
terminationGracePeriodSeconds: Configures a grace period for the kubelet to wait between triggering a shut down of the failed container, and then forcing the container runtime to stop that container. The default is to inherit the Pod-level value for terminationGracePeriodSeconds (30 seconds if not specified), and the minimum value is 1. See probe-level terminationGracePeriodSeconds for more detail.
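To show where these fields sit, here is a sketch of a liveness probe with each timing field written out at its default value. The path and port are assumed values, and the failureThreshold default of 3 comes from the Kubernetes API reference rather than from the list above.

```yaml
# Fragment of a container spec. Path and port are assumed values;
# the timing fields are set explicitly to their defaults.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 0   # default
  periodSeconds: 10        # default
  timeoutSeconds: 1        # default
  successThreshold: 1      # default; must be 1 for liveness and startup probes
  failureThreshold: 3      # default per the API reference
```

Omitting all five fields would give the same behavior; writing them out is only useful when you intend to change one.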

Caution: Incorrect implementation of readiness probes may result in an ever growing number of processes in the container, and resource starvation if this is left unchecked.

HTTP probes

HTTP probes have additional fields that can be set on httpGet:

host: Host name to connect to, defaults to the pod IP. You probably want to set "Host" in httpHeaders instead.
scheme: Scheme to use for connecting to the host (HTTP or HTTPS). Defaults to "HTTP".
path: Path to access on the HTTP server. Defaults to "/".
httpHeaders: Custom headers to set in the request. HTTP allows repeated headers.
port: Name or number of the port to access on the container. Number must be in the range 1 to 65535.

For an HTTP probe, the kubelet sends an HTTP request to the specified port and path to perform the check. The kubelet sends the probe to the Pod's IP address, unless the address is overridden by the optional host field in httpGet. If the scheme field is set to HTTPS, the kubelet sends an HTTPS request and skips certificate verification. In most scenarios, you do not want to set the host field. Here's one scenario where you would set it. Suppose the container listens on 127.0.0.1 and the Pod's hostNetwork field is true. Then host, under httpGet, should be set to 127.0.0.1. If your pod relies on virtual hosts, which is probably the more common case, you should not use host, but rather set the Host header in httpHeaders.
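A sketch of the virtual-host case follows. The hostname app.example.com, the port 8443, and the path are hypothetical values chosen for illustration.

```yaml
# Fragment of a container spec. Hostname, port and path are assumed values.
livenessProbe:
  httpGet:
    scheme: HTTPS              # certificate verification is skipped
    path: /healthz
    port: 8443
    httpHeaders:
    - name: Host               # set the virtual host here, not in httpGet.host
      value: app.example.com
```

Because host is left unset, the request still goes to the Pod's IP address; only the Host header changes.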

For an HTTP probe, the kubelet sends two request headers in addition to the mandatory Host header:

User-Agent: The default value is kube-probe/1.30, where 1.30 is the version of the kubelet.
Accept: The default value is */*.

You can override the default headers by defining httpHeaders for the probe. For example:

livenessProbe:
  httpGet:
    httpHeaders:
      - name: Accept
        value: application/json

startupProbe:
  httpGet:
    httpHeaders:
      - name: User-Agent
        value: MyUserAgent

You can also remove these two headers by defining them with an empty value.

livenessProbe:
  httpGet:
    httpHeaders:
      - name: Accept
        value: ""

startupProbe:
  httpGet:
    httpHeaders:
      - name: User-Agent
        value: ""

Note: When the kubelet probes a Pod using HTTP, it only follows redirects if the redirect is to the same host. If the kubelet receives 11 or more redirects during probing, the probe is considered successful and a related Event is created:

Events:
  Type     Reason        Age                     From               Message
  ----     ------        ----                    ----               -------
  Normal   Scheduled     29m                     default-scheduler  Successfully assigned default/httpbin-7b8bc9cb85-bjzwn to daocloud
  Normal   Pulling       29m                     kubelet            Pulling image "docker.io/kennethreitz/httpbin"
  Normal   Pulled        24m                     kubelet            Successfully pulled image "docker.io/kennethreitz/httpbin" in 5m12.402735213s
  Normal   Created       24m                     kubelet            Created container httpbin
  Normal   Started       24m                     kubelet            Started container httpbin
  Warning  ProbeWarning  4m11s (x1197 over 24m)  kubelet            Readiness probe warning: Probe terminated redirects

If the kubelet receives a redirect where the hostname is different from the request, the outcome of the probe is treated as successful and the kubelet creates an event to report the redirect failure.

TCP probes

For a TCP probe, the kubelet makes the probe connection at the node, not in the Pod, which means that you cannot use a service name in the host parameter since the kubelet is unable to resolve it.

Probe-level `terminationGracePeriodSeconds`

FEATURE STATE: Kubernetes v1.28 [stable]

In 1.25 and above, users can specify a probe-level terminationGracePeriodSeconds as part of the probe specification. When both a pod- and probe-level terminationGracePeriodSeconds are set, the kubelet will use the probe-level value.

When setting the terminationGracePeriodSeconds, please note the following:

The kubelet always honors the probe-level terminationGracePeriodSeconds field if it is present on a Pod.
If you have existing Pods where the terminationGracePeriodSeconds field is set and you no longer wish to use per-probe termination grace periods, you must delete those existing Pods.
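As a sketch of how the two levels combine, the manifest below sets both a Pod-level and a probe-level grace period. The Pod name probe-grace-demo and the specific durations are hypothetical; the image and args are borrowed from the HTTP liveness example earlier on this page.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-grace-demo               # hypothetical name
spec:
  terminationGracePeriodSeconds: 3600  # Pod-level value
  containers:
  - name: liveness
    image: registry.k8s.io/e2e-test-images/agnhost:2.40
    args:
    - liveness
    ports:
    - name: liveness-port
      containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: liveness-port
      failureThreshold: 1
      periodSeconds: 60
      # Probe-level value; takes precedence over the Pod-level
      # setting when this liveness probe fails.
      terminationGracePeriodSeconds: 60
```

With these assumed values, a container restarted because of a failed liveness probe would get 60 seconds to shut down, while the Pod-level 3600 seconds would still apply to other terminations such as deleting the Pod.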