FoxuTech

How to Troubleshoot Kubernetes Service 503 Error (Service Unavailable)

The HTTP 503 Service Unavailable error means that a website cannot be reached right now because the server is not ready to handle the request. The server may be too busy, under maintenance, or affected by another problem that requires deeper analysis.

In Kubernetes, it means a Service tried to route a request to a pod, but something went wrong along the way:

Running into errors on your site can be intimidating. However, most errors give you some hint about their cause, which helps you troubleshoot them the right way. The 503 error is not as polite, unfortunately, and does not give you much information to go on.

In this article, we list the possible issues that cause a 503 error and how you can troubleshoot them.

What Is an HTTP Error 503?

The Internet Engineering Task Force (IETF) defines the 503 Service Unavailable as:

The 503 (Service Unavailable) status code indicates that the server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay. The server MAY send a Retry-After header field to suggest an appropriate amount of time for the client to wait before retrying the request.

Troubleshooting Kubernetes Service 503 Errors

Fix 1: Check if the Pod Label Matches the Service Selector

A possible cause of 503 errors is that a Kubernetes pod does not have the expected label, and the Service selector does not identify it. If the Service does not find any matching pod, requests will return a 503 error.

Run the following command to see the current selector:

# kubectl describe service [service-name] -n [namespace-name]

Example output:

Name: [service-name]
Namespace: [namespace-name]
Labels: <none>
Annotations: <none>
Selector: [label]
…

Note: Replace [service-name] with your Service name and [namespace-name] with the Service's namespace.

The Selector section shows which label or labels are used to match the Service with pods.

Check if there are pods with this label:

# kubectl get pods -n [namespace-name] -l "[label]"
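
The two commands above can be chained so the label does not have to be copied by hand. The following is a minimal sketch, not a definitive tool: the function names selector_to_label_query and pods_for_service are made up for this example, and it assumes a recent kubectl that prints the selector map as JSON when queried with -o jsonpath.

```shell
#!/usr/bin/env bash

# Convert the JSON selector map printed by
#   kubectl get service NAME -o jsonpath='{.spec.selector}'
# (assumed to look like {"app":"web"}) into the key=value,key=value
# form that `kubectl get pods -l` accepts.
selector_to_label_query() {
  printf '%s' "$1" | tr -d '{}"' | tr ':' '='
}

# Hypothetical helper: list the pods matched by a Service's selector.
# Needs access to a cluster. Usage: pods_for_service SERVICE NAMESPACE
pods_for_service() {
  local svc="$1" ns="$2" selector
  selector="$(kubectl get service "$svc" -n "$ns" -o jsonpath='{.spec.selector}')"
  if [ -z "$selector" ]; then
    echo "Service $svc has no selector" >&2
    return 1
  fi
  kubectl get pods -n "$ns" -l "$(selector_to_label_query "$selector")" --show-labels
}
```

If pods_for_service prints no pods, the selector and the pod labels do not match. An empty ENDPOINTS column in the output of kubectl get endpoints [service-name] -n [namespace-name] points to the same problem.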

Fix 2: Verify that Pods Defined for the Service are Running

In Fix 1 we checked the label used by the Service selector. Run the following command to make sure the pods matched by the selector are in the Running state:

# kubectl -n [namespace-name] get pods -l "[label]"
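
With many pods it can be easier to filter the output than to read it. The sketch below assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...) and uses a made-up helper name, not_ready_pods. It prints only the pods that are not Running or whose READY count is incomplete.

```shell
#!/usr/bin/env bash

# Read `kubectl get pods --no-headers` output on stdin and print the
# name, READY and STATUS columns of every pod that is either not in the
# Running state or does not have all of its containers ready.
not_ready_pods() {
  awk '{
    split($2, ready, "/")            # READY looks like "1/1"
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Usage (needs cluster access):
#   kubectl -n [namespace-name] get pods -l "[label]" --no-headers | not_ready_pods
```

No output means every matched pod is Running and ready. A pod that is Running but shows 0/1 in the READY column is usually failing its readiness probe.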

Fix 3: Check Pods Pass the Readiness Probe for the Deployment

Next, we can check if a readiness probe is configured for the pod:

# kubectl describe pod [pod-name] -n [namespace-name] | grep -i readiness

The probe can only succeed if the application is listening on the path and port the probe is configured with. Request that path with the curl -Ivk command and make sure it returns a valid response. For example, an HTTP 200 status is a good response.
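
The curl check can be scripted as follows. This is a sketch under assumptions: the function names are made up for this example, and the URL in the usage comment (local port 8080, path /healthz) is hypothetical and has to be replaced with the port and path from your own probe. An HTTP probe treats any status from 200 up to, but not including, 400 as success, so the helper uses the same range.

```shell
#!/usr/bin/env bash

# Succeed for the status codes an HTTP readiness probe counts as
# healthy (200-399), fail for everything else.
is_ready_status() {
  [ "$1" -ge 200 ] 2>/dev/null && [ "$1" -lt 400 ]
}

# Request a URL and report whether its status code would pass a probe.
# curl prints 000 when it cannot connect, which is reported as a failure.
check_readiness_url() {
  local url="$1" code
  code="$(curl -sk -o /dev/null -m 5 -w '%{http_code}' "$url")"
  echo "$url -> HTTP $code"
  is_ready_status "$code"
}

# Usage (hypothetical port and path):
#   kubectl port-forward pod/[pod-name] 8080:8080 -n [namespace-name] &
#   check_readiness_url http://localhost:8080/healthz
```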

Readiness probe failed:

Fix 4: Verify that Instances are Registered with Load Balancer

If none of the steps above revealed an issue, another common cause of 503 errors is that no instances are registered with the load balancer. Check the following:

These steps may help you discover the basic issues that can result in a Service 503 error. If you did not manage to quickly identify the root cause, you will need a more in-depth investigation across multiple components in the Kubernetes deployment. If more than one component has malfunctioned, it will be hard to identify the exact cause. Another solution to consider is graceful shutdown.

Avoiding 503 with Graceful Shutdown

Another common cause of 503 errors is that when Kubernetes terminates a pod, containers on the pod drop existing connections. Clients then receive a 503 response. This can be resolved by implementing graceful shutdown.

To understand the concept of graceful shutdown, let’s quickly review how Kubernetes shuts down containers. When a user or the Kubernetes scheduler requests deletion of a pod, the kubelet running on the node first sends a SIGTERM signal to the containers in the pod.

The container can register a handler for SIGTERM and perform some clean-up activity before shutting down. Then, after a configurable grace period (terminationGracePeriodSeconds, 30 seconds by default), Kubernetes sends a SIGKILL signal, and the container is forced to shut down.
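
As an illustration of such a handler, here is a minimal sketch of a container entrypoint written in shell. Everything in it is an assumption made for the example: the function name serve_until_sigterm, the SHUTTING_DOWN flag, and the sleep loop that stands in for real request handling. A real application would register the equivalent handler in its own language.

```shell
#!/usr/bin/env bash

# Flag set by the SIGTERM handler; the main loop checks it between
# units of work instead of being killed in the middle of one.
SHUTTING_DOWN=0

serve_until_sigterm() {
  trap 'SHUTTING_DOWN=1' TERM        # register the SIGTERM handler
  while [ "$SHUTTING_DOWN" -eq 0 ]; do
    sleep 1                          # stand-in for serving requests
  done
  echo "SIGTERM received, draining in-flight work"
  # Finish open requests and close connections here. This must complete
  # before the grace period ends, or SIGKILL stops the process anyway.
  echo "clean exit"
  trap - TERM                        # restore default handling
}
```

The shell runs the handler only after the current sleep returns, so the loop leaves at most one second after the signal arrives. That is the point of the pattern: the work in progress is completed first, and only then does the process exit.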

Here are two ways to implement graceful shutdown to avoid a 503 error:
