
I installed the NGINX ingress controller on GKE following the official documentation:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.46.0/deploy/static/provider/cloud/deploy.yaml

The ingress controller runs fine:

ingress-nginx-admission-create-dvkgp        0/1     Completed   0          5h29m
ingress-nginx-admission-patch-58l4z         0/1     Completed   1          5h29m
ingress-nginx-controller-65d7564f46-2rtjs   1/1     Running     0          5h29m

It automatically creates a TCP load balancer, a health check, and firewall rules. My Kubernetes cluster has 3 nodes. Interestingly, the health check fails for 2 instances; it passes only for the instance where the ingress controller is running. I debugged it but didn't find any clue. Could someone help me with this?
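For reference, the per-instance health state can be inspected from the gcloud side like this (the target pool name and region are placeholders, since GKE generates the name):

# The TCP load balancer created for the Service is backed by a target pool
gcloud compute target-pools list

# Show which instances pass or fail the health check
gcloud compute target-pools get-health <TARGET_POOL_NAME> --region=<REGION>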

2 Answers


  1. One possible reason is firewall rules. Google documents the IP ranges and ports its health check probers use. You have to configure an ingress allow rule so that the health check probes can reach your backend; a sketch with gcloud follows below.

    For additional debugging details, see this Google Cloud Platform blog post: Debugging Health Checks in Load Balancing on Google Compute Engine.
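    For example, a minimal sketch of such a rule with gcloud (the rule name and network are placeholders; 130.211.0.0/22 and 35.191.0.0/16 are the documented prober ranges):

    gcloud compute firewall-rules create allow-gcp-health-checks \
        --network=default \
        --direction=INGRESS \
        --action=ALLOW \
        --rules=tcp \
        --source-ranges=130.211.0.0/22,35.191.0.0/16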

  2. If you were to look into the deploy.yaml you applied, you would see:

    apiVersion: v1
    kind: Service
    metadata:
      name: ingress-nginx-controller
      namespace: ingress-nginx
    spec:
      type: LoadBalancer
      externalTrafficPolicy: Local
    

    Notice the externalTrafficPolicy: Local. It is used to preserve the client source IP. You can confirm what your Service uses with the query shown below.
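    A quick check (a sketch; the namespace and Service name are the ones from the manifest above):

    kubectl -n ingress-nginx get svc ingress-nginx-controller \
      -o jsonpath='{.spec.externalTrafficPolicy}'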

    It’s even better explained here: Source IP for Services with Type=LoadBalancer

    From k8s docs:

    However, if you’re running on Google Kubernetes Engine/GCE, setting the same service.spec.externalTrafficPolicy field to Local forces nodes without Service endpoints to remove themselves from the list of nodes eligible for loadbalanced traffic by deliberately failing health checks.

    These health checks are designed to fail on every node that has no local ingress-nginx pod. It works that way so that client IPs can be preserved.
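    You can see it happen directly (a sketch; <NODE_IP> and <PORT> are placeholders for a node's IP and the port returned by the first command):

    # Find the port GCP's health check actually probes; it is allocated
    # automatically when externalTrafficPolicy is Local
    kubectl -n ingress-nginx get svc ingress-nginx-controller \
      -o jsonpath='{.spec.healthCheckNodePort}'

    # kube-proxy answers on that port: nodes without a local
    # ingress-nginx pod return HTTP 503 with "localEndpoints": 0,
    # while the node running the pod returns HTTP 200
    curl http://<NODE_IP>:<PORT>/healthz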

    Notice that the one node listed as healthy is the one where the ingress-nginx-controller pod runs. Delete this pod and wait for it to be rescheduled on a different node: that node should now become the healthy one. Then run 3 pod replicas, one on every node, and all nodes will be healthy (see the sketch below).
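    A minimal sketch of scaling out (note: scaling alone does not guarantee one pod per node; verify placement with kubectl get pods -o wide, or use a DaemonSet if you need exactly one pod on every node):

    kubectl -n ingress-nginx scale deployment ingress-nginx-controller --replicas=3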
