
We have a stateful Redis deployment on a multi-node Kubernetes cluster (v1.27.15).

There are two Services, named "redis" and "redis-headless".

There are 3 nodes in the cluster. When we shut down one of the nodes, the Redis pod on that node goes into the Terminating state:

kubectl get pods -A -o wide | grep redis
mynamespace redis-node-0  3/3     Running     0   8m  10.244.248.4    ha3-node2
mynamespace redis-node-1  3/3     Terminating 0  68m  10.244.230.119  ha3-node1
mynamespace redis-node-2  3/3     Running     0  67m  10.244.192.208  ha3-node3

But for the redis-headless Service, 10.244.230.119 is still listed in the endpoints:

kubectl describe endpoints  -n mynamespace redis-headless
Name:         redis-headless
Namespace:    mynamespace
Subsets:
  Addresses:          10.244.192.208,10.244.230.119,10.244.248.4

For the redis Service (ClusterIP), the endpoints are correct: 10.244.230.119 has been removed.

Is this behaviour normal for a headless Service? If not, what is the solution?

Regards,

Yavuz

2 Answers


  1. This is working as intended; it is how Kubernetes works. Pod deletion and the EndpointSlice update run in parallel, and there is no guarantee that one finishes before the other. In addition, any ingress or load balancer backends must also be updated with the new EndpointSlice information, and that is not guaranteed to happen before the pod is stopped either. This is the reason for the recommendation to use a sleep in the preStop hook, which should resolve the issue.

    If the endpoint is removed before the containers receive the TERM signal, no new requests arrive while the containers are terminating. If the containers start terminating before the endpoint is removed, the pod continues to receive requests, and those requests fail with “Connection timeout” or “Connection refused” errors. Because the endpoint removal must propagate to every node in the cluster before it is complete, there is a high probability that the pod eviction process completes first.

    The learnk8s article on graceful shutdown in Kubernetes by Daniele Polencic covers this in detail.

    If you use Services of type Headless, CoreDNS will have to subscribe to changes to the endpoints and reconfigure itself every time an endpoint is added or removed.
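
The preStop sleep recommended in this answer could be sketched as a patch for the workload's pod template. The snippet below is only an illustration under assumptions: the container name `redis`, the 15-second sleep, and the 45-second grace period are hypothetical values not given in the question, and the container image is assumed to ship a `sleep` binary. It builds the patch as a Python dict and prints it as JSON.

```python
import json


def prestop_sleep_patch(container_name: str, sleep_seconds: int, grace_seconds: int) -> dict:
    """Build a patch that adds a preStop sleep to one container of a pod template."""
    # The grace period has to outlast the sleep, otherwise the container is
    # killed before it gets a chance to shut down cleanly after the hook.
    if grace_seconds <= sleep_seconds:
        raise ValueError("grace period must be longer than the preStop sleep")
    return {
        "spec": {
            "template": {
                "spec": {
                    "terminationGracePeriodSeconds": grace_seconds,
                    "containers": [
                        {
                            # Hypothetical container name; use the real one.
                            "name": container_name,
                            "lifecycle": {
                                "preStop": {
                                    "exec": {"command": ["sleep", str(sleep_seconds)]}
                                }
                            },
                        }
                    ],
                }
            }
        }
    }


if __name__ == "__main__":
    # Assumed values for illustration only.
    print(json.dumps(prestop_sleep_patch("redis", 15, 45), indent=2))
```

Saved to a file, the output could be applied with something like `kubectl patch statefulset redis-node -n mynamespace --patch-file patch.json`, where the StatefulSet name `redis-node` is a guess based on the pod names in the question. A preStop hook only runs while the kubelet on the node can still terminate the pod gracefully, so it helps with ordinary deletions and rollouts rather than with a node that is already powered off.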

  2. The headless Service likely has publishNotReadyAddresses set to true in its manifest. If so, the IP of the terminating pod can remain in the Endpoints resource until the pod is fully terminated. Once the controller recreates the pod, the new IP will appear.

    After all, a headless Service is not handled by kube-proxy. Clients connect directly to the Pods via cluster DNS; that is the whole point.

    So, based on the above, I think this behaviour is normal.
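
One way to check the guess in this answer is to inspect the Service manifest itself. The sketch below assumes the manifest is available as parsed JSON; the function name `describe_service` and the sample manifest are made up for illustration, while `clusterIP` and `publishNotReadyAddresses` are the actual Service spec fields.

```python
def describe_service(service: dict) -> dict:
    """Report whether a Service manifest is headless and publishes not-ready addresses."""
    spec = service.get("spec", {})
    return {
        # A headless Service is declared with clusterIP set to the string "None".
        "headless": spec.get("clusterIP") == "None",
        # The field defaults to false when it is absent from the manifest.
        "publish_not_ready": bool(spec.get("publishNotReadyAddresses", False)),
    }


# Hypothetical manifest shaped like the redis-headless Service in the question.
sample = {
    "metadata": {"name": "redis-headless", "namespace": "mynamespace"},
    "spec": {"clusterIP": "None", "publishNotReadyAddresses": True},
}
print(describe_service(sample))
```

Against a real cluster, the input would come from `kubectl get service redis-headless -n mynamespace -o json`, parsed with `json.loads`. If `publish_not_ready` comes back true, the terminating pod's IP staying in the endpoints matches what this answer describes.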
