
In my Kubernetes cluster, I have a single pod (i.e. one replica) with two containers: server and cache.

I also have a Kubernetes Service that matches my pod.

If cache is crashing, when I try to send an HTTP request to server via my Service, I get a "503 Service Temporarily Unavailable".

The HTTP request is going into the cluster via Nginx Ingress, and I suspect that the problem is that when cache is crashing, Kubernetes removes my one pod from the Service load balancers, as promised in the Kubernetes documentation:

The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.

I don’t want this behavior, since I still want server to be able to respond to requests even if cache has failed. Is there any way to get the desired behavior?

2 Answers


  1. Chosen as BEST ANSWER

    The behavior I am looking for is configurable on the Service itself via the publishNotReadyAddresses option:

    https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.21/#servicespec-v1-core
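
    As a minimal sketch of what that could look like, assuming a Service named my-service, a pod label app: my-app, and a server container listening on port 8080 (all three are placeholders, not taken from my actual setup):

        apiVersion: v1
        kind: Service
        metadata:
          name: my-service                # hypothetical name
        spec:
          selector:
            app: my-app                   # hypothetical label; must match the pod's labels
          # Keep the pod's address published even while the pod is not ready
          publishNotReadyAddresses: true
          ports:
            - port: 80
              targetPort: 8080            # assumed port of the server container

    With this set, the Service keeps listing the pod's address even when one of its containers is not ready, so requests may also reach the pod while server itself is still starting up.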


  2. A Pod is brought to the "Failed" state if one of the following conditions occurs:

    • One of its containers exits with a non-zero status
    • Kubernetes terminates a container because its health check is failing

    So, if you need one of your containers to keep responding when another one fails:

    1. Make sure your liveness probe points to the container that must keep running. The health check will then keep succeeding, and the Pod will not be marked as "Failed".

    2. Make sure the readiness probe points to the container that must keep running, so that the load balancer keeps sending traffic to your Pod.

    3. Make sure that you handle container errors gracefully and make the containers exit with a zero status code.

    In the following example readiness and liveness probes, make sure that port 8080 is handled by the server container and that it has the /healthz and /ready routes active.

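        # Readiness: decides whether the pod receives traffic from the Service
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5   # wait 5s after container start before the first check
          periodSeconds: 5         # then check every 5s
        # Liveness: decides whether the kubelet restarts the container
        livenessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5   # wait 5s after container start before the first check
          timeoutSeconds: 1        # a check fails if it takes longer than 1s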
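
    To show where these probes would sit, here is a rough sketch of a pod spec with both containers. It assumes the container names server and cache from the question; the pod name, label, and image names are placeholders, and the probe settings are simply reused from the example above.

        apiVersion: v1
        kind: Pod
        metadata:
          name: my-app                        # hypothetical name
          labels:
            app: my-app                       # hypothetical label
        spec:
          containers:
            - name: server
              image: example.com/server:1.0   # hypothetical image
              ports:
                - containerPort: 8080
              readinessProbe:                 # probes are attached to server only
                httpGet:
                  path: /healthz
                  port: 8080
                initialDelaySeconds: 5
                periodSeconds: 5
              livenessProbe:
                httpGet:
                  path: /ready
                  port: 8080
                initialDelaySeconds: 5
                timeoutSeconds: 1
            - name: cache
              image: example.com/cache:1.0    # hypothetical image
              # no probes defined on this container

    Probes are configured per container, so these settings only control how server is checked. On their own they do not change how a crashing cache container is reported.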
    