We upgraded our Kubernetes cluster from v1.21 to v1.22. After this operation we discovered that the pods of our nginx-ingress-controller deployment are failing to start with the following error message:
```
pkg/mod/k8s.io/client-go@.../tools/cache/reflector.go:125: Failed to list *v1beta1.Ingress: the server could not find the requested resource
```
We found out that this issue is tracked here: https://github.com/bitnami/charts/issues/7264
Because Azure doesn't allow downgrading the cluster back to 1.21, could you please help us fix the nginx-ingress-controller deployment? Could you please be specific about what should be done and from where (local machine, Azure CLI, etc.), as we are not very familiar with Helm.
This is our deployment's current YAML:
```yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-ingress-controller
  namespace: ingress
  uid: 575c7699-1fd5-413e-a81d-b183f8822324
  resourceVersion: '166482672'
  generation: 16
  creationTimestamp: '2020-10-10T10:20:07Z'
  labels:
    app: nginx-ingress
    app.kubernetes.io/component: controller
    app.kubernetes.io/managed-by: Helm
    chart: nginx-ingress-1.41.1
    heritage: Helm
    release: nginx-ingress
  annotations:
    deployment.kubernetes.io/revision: '2'
    meta.helm.sh/release-name: nginx-ingress
    meta.helm.sh/release-namespace: ingress
  managedFields:
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:replicas: {}
      subresource: scale
    - manager: Go-http-client
      operation: Update
      apiVersion: apps/v1
      time: '2020-10-10T10:20:07Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app: {}
            f:app.kubernetes.io/component: {}
            f:app.kubernetes.io/managed-by: {}
            f:chart: {}
            f:heritage: {}
            f:release: {}
        f:spec:
          f:progressDeadlineSeconds: {}
          f:revisionHistoryLimit: {}
          f:selector: {}
          f:strategy:
            f:rollingUpdate:
              .: {}
              f:maxSurge: {}
              f:maxUnavailable: {}
            f:type: {}
          f:template:
            f:metadata:
              f:labels:
                .: {}
                f:app: {}
                f:app.kubernetes.io/component: {}
                f:component: {}
                f:release: {}
            f:spec:
              f:containers:
                k:{"name":"nginx-ingress-controller"}:
                  .: {}
                  f:args: {}
                  f:env:
                    .: {}
                    k:{"name":"POD_NAME"}:
                      .: {}
                      f:name: {}
                      f:valueFrom:
                        .: {}
                        f:fieldRef: {}
                    k:{"name":"POD_NAMESPACE"}:
                      .: {}
                      f:name: {}
                      f:valueFrom:
                        .: {}
                        f:fieldRef: {}
                  f:image: {}
                  f:imagePullPolicy: {}
                  f:livenessProbe:
                    .: {}
                    f:failureThreshold: {}
                    f:httpGet:
                      .: {}
                      f:path: {}
                      f:port: {}
                      f:scheme: {}
                    f:initialDelaySeconds: {}
                    f:periodSeconds: {}
                    f:successThreshold: {}
                    f:timeoutSeconds: {}
                  f:name: {}
                  f:ports:
                    .: {}
                    k:{"containerPort":80,"protocol":"TCP"}:
                      .: {}
                      f:containerPort: {}
                      f:name: {}
                      f:protocol: {}
                    k:{"containerPort":443,"protocol":"TCP"}:
                      .: {}
                      f:containerPort: {}
                      f:name: {}
                      f:protocol: {}
                  f:readinessProbe:
                    .: {}
                    f:failureThreshold: {}
                    f:httpGet:
                      .: {}
                      f:path: {}
                      f:port: {}
                      f:scheme: {}
                    f:initialDelaySeconds: {}
                    f:periodSeconds: {}
                    f:successThreshold: {}
                    f:timeoutSeconds: {}
                  f:resources:
                    .: {}
                    f:limits: {}
                    f:requests: {}
                  f:securityContext:
                    .: {}
                    f:allowPrivilegeEscalation: {}
                    f:capabilities:
                      .: {}
                      f:add: {}
                      f:drop: {}
                    f:runAsUser: {}
                  f:terminationMessagePath: {}
                  f:terminationMessagePolicy: {}
              f:dnsPolicy: {}
              f:restartPolicy: {}
              f:schedulerName: {}
              f:securityContext: {}
              f:serviceAccount: {}
              f:serviceAccountName: {}
              f:terminationGracePeriodSeconds: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2022-01-24T01:23:22Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:conditions:
            .: {}
            k:{"type":"Available"}:
              .: {}
              f:type: {}
            k:{"type":"Progressing"}:
              .: {}
              f:type: {}
    - manager: Mozilla
      operation: Update
      apiVersion: apps/v1
      time: '2022-01-28T23:18:41Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:template:
            f:spec:
              f:containers:
                k:{"name":"nginx-ingress-controller"}:
                  f:resources:
                    f:limits:
                      f:cpu: {}
                      f:memory: {}
                    f:requests:
                      f:cpu: {}
                      f:memory: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2022-01-28T23:29:49Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:deployment.kubernetes.io/revision: {}
        f:status:
          f:conditions:
            k:{"type":"Available"}:
              f:lastTransitionTime: {}
              f:lastUpdateTime: {}
              f:message: {}
              f:reason: {}
              f:status: {}
            k:{"type":"Progressing"}:
              f:lastTransitionTime: {}
              f:lastUpdateTime: {}
              f:message: {}
              f:reason: {}
              f:status: {}
          f:observedGeneration: {}
          f:replicas: {}
          f:unavailableReplicas: {}
          f:updatedReplicas: {}
      subresource: status
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-ingress
      app.kubernetes.io/component: controller
      release: nginx-ingress
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx-ingress
        app.kubernetes.io/component: controller
        component: controller
        release: nginx-ingress
    spec:
      containers:
        - name: nginx-ingress-controller
          image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
          args:
            - /nginx-ingress-controller
            - '--default-backend-service=ingress/nginx-ingress-default-backend'
            - '--election-id=ingress-controller-leader'
            - '--ingress-class=nginx'
            - '--configmap=ingress/nginx-ingress-controller'
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 300m
              memory: 512Mi
            requests:
              cpu: 200m
              memory: 256Mi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            timeoutSeconds: 1
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            timeoutSeconds: 1
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            runAsUser: 101
            allowPrivilegeEscalation: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 60
      dnsPolicy: ClusterFirst
      serviceAccountName: nginx-ingress
      serviceAccount: nginx-ingress
      securityContext: {}
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 16
  replicas: 3
  updatedReplicas: 2
  unavailableReplicas: 3
  conditions:
    - type: Available
      status: 'False'
      lastUpdateTime: '2022-01-28T22:58:07Z'
      lastTransitionTime: '2022-01-28T22:58:07Z'
      reason: MinimumReplicasUnavailable
      message: Deployment does not have minimum availability.
    - type: Progressing
      status: 'False'
      lastUpdateTime: '2022-01-28T23:29:49Z'
      lastTransitionTime: '2022-01-28T23:29:49Z'
      reason: ProgressDeadlineExceeded
      message: >-
        ReplicaSet "nginx-ingress-controller-59d9f94677" has timed out
        progressing.
```
2 Answers
@Philip Welz's answer is the correct one, of course. It was necessary to upgrade the ingress controller because of the `v1beta1` Ingress API version that was removed in Kubernetes v1.22. But that's not the only problem we faced, so I've decided to make a "very very short" guide of how we finally ended up with a healthy running cluster (5 days later), so it may save someone else the struggle.

1. Upgrading the nginx-ingress-controller version in the YAML file
Here we simply changed the controller image version in the deployment YAML, as sketched below.
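In the deployment above that is this single line; the upstream image path for v1.1.1 below is an assumption based on the official ingress-nginx registry:

```yaml
# from:
image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1
# to (assumed upstream path for the v1.1.1 controller):
image: k8s.gcr.io/ingress-nginx/controller:v1.1.1
```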
After this operation, a new pod running v1.1.1 was spawned. It started nicely and was running healthy. Unfortunately, that didn't bring our microservices back online. Now I know it was probably because of changes that had to be made to the existing ingress YAML files to make them compatible with the new version of the ingress controller. So go directly to step 2 now (two headers below).
Don't do this step for now; do it only if step 2 fails for you: Reinstall the nginx-ingress-controller
We decided that in this situation we would reinstall the controller from scratch, following Microsoft's official documentation: https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli. Be aware that this will probably change the external IP address of your ingress controller. The easiest way in our case was to just remove the whole `ingress` namespace. Unfortunately, that doesn't remove the ingress class, so an additional command is required.
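A minimal sketch of that cleanup, assuming the names from the deployment above (`ingress` namespace, `nginx` ingress class):

```bash
# Deleting the namespace removes the controller deployment, service, etc.
kubectl delete namespace ingress

# The IngressClass is cluster-scoped, so it survives the namespace deletion
# and has to be deleted separately:
kubectl delete ingressclass nginx
```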
Then install the new controller:
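The installation itself follows the Microsoft tutorial; a sketch, assuming you want to keep the release name and namespace used above (the replica count mirrors the old deployment):

```bash
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

helm install nginx-ingress ingress-nginx/ingress-nginx \
  --create-namespace \
  --namespace ingress \
  --set controller.replicaCount=2
```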
If you reinstalled the nginx-ingress-controller or the IP address changed after the upgrade in step 1: Update your network security groups, load balancers and domain DNS
In your AKS resource group there should be a resource of type `Network security group`. It contains inbound and outbound security rules (I understand it works as a firewall). There should be a default network security group that is automatically managed by Kubernetes, and the IP address should be automatically refreshed there. Unfortunately, we also had an additional custom one, and we had to update its rules manually.
In the same resource group there should be a resource of type `Load balancer`. In the `Frontend IP configuration` tab, double-check that the IP address reflects your new IP address. As a bonus, you can verify in the `Backend pools` tab that the addresses there match your internal node IPs. Lastly, don't forget to adjust your domain's DNS records.
2. Upgrade your ingress YAML configuration files to match syntax changes
It took us a while to determine a working template, but installing the helloworld application from the Microsoft tutorial mentioned above helped us a lot. We started from this:
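Our original manifest isn't reproduced here, but it was an old-style ingress roughly along these lines; the host, names, and paths are hypothetical stand-ins, and the point is the deprecated `networking.k8s.io/v1beta1` API with the old `serviceName`/`servicePort` backend syntax:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-app                 # hypothetical name
  namespace: default
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  rules:
    - host: example.com        # hypothetical host
      http:
        paths:
          - path: /(.*)
            backend:
              serviceName: my-app   # old v1beta1 backend syntax
              servicePort: 80
```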
After introducing changes incrementally, we finally made it to the version shown below. But I'm pretty sure the issue was that we were missing the `nginx.ingress.kubernetes.io/use-regex: 'true'` entry. Just in case someone would like to install the helloworld app for testing purposes, its YAMLs are included after the ingress example below.
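A hedged sketch of the final shape, using the same hypothetical names; note the `networking.k8s.io/v1` API, the `pathType` field, the new `service.name`/`service.port` backend syntax, and the `use-regex` annotation:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$1
    nginx.ingress.kubernetes.io/use-regex: 'true'
spec:
  ingressClassName: nginx
  rules:
    - host: example.com
      http:
        paths:
          - path: /(.*)
            pathType: ImplementationSpecific   # recommended for regex paths
            backend:
              service:
                name: my-app
                port:
                  number: 80
```

And, for the helloworld test app, manifests modelled on the Microsoft tutorial (the `aks-helloworld` image is the one that tutorial uses; treat the rest as an assumption):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld
  template:
    metadata:
      labels:
        app: aks-helloworld
    spec:
      containers:
        - name: aks-helloworld
          image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
          ports:
            - containerPort: 80
          env:
            - name: TITLE
              value: Welcome to Azure Kubernetes Service (AKS)
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld
spec:
  type: ClusterIP
  ports:
    - port: 80
  selector:
    app: aks-helloworld
```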
3. Deal with other crashing applications ...
Another application that was crashing in our cluster was `cert-manager`. It was at version 1.0.1, so, first, we upgraded it to version 1.1.1 (the upgrade command is sketched below). That created a brand-new healthy pod. We were happy and decided to stay with v1.1 because we were a bit scared of the additional measures that have to be taken when upgrading to higher versions (check the bottom of this page: https://cert-manager.io/docs/installation/upgrading/).
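A sketch of the upgrade, assuming a default installation with the release named `cert-manager` in the `cert-manager` namespace (depending on how it was installed, you may also need to apply the matching CRDs):

```bash
helm repo add jetstack https://charts.jetstack.io
helm repo update

helm upgrade cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.1.1
```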
The cluster is now finally fixed. It is, right?
4. ... but be sure to check the compatibility charts!
Well... now we know that cert-manager is compatible with Kubernetes v1.22 only starting from version 1.5. We were so unlucky that exactly that night our SSL certificate crossed the 30-day threshold before its expiration date, so cert-manager decided to renew the cert! The operation failed and cert-manager crashed. Kubernetes fell back to the "Kubernetes Fake Certificate", and the web page went down again because browsers killed the traffic over the invalid certificate. The fix was to upgrade to 1.5 and upgrade the CRDs as well:
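A sketch of that upgrade; the exact v1.5.x patch version is an assumption:

```bash
# Upgrade the CRDs first, then the Helm release itself:
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.crds.yaml

helm upgrade cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.5.4
```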
After this, the new instance of cert-manager refreshed our certificate successfully. Cluster saved again.
In case you need to force the renewal, you can take a look at this issue: https://github.com/jetstack/cert-manager/issues/2641

@ajcann suggests adding a `renewBefore` property to the certificates, waiting for the certificates to renew, and then removing the property again.
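A sketch of forcing the renewal this way, with a hypothetical certificate name; `renewBefore` has to stay below `spec.duration` (2160h by default) but exceed the time remaining until expiry:

```bash
# Trigger an early renewal by making the renewal window cover "now":
kubectl patch certificate my-cert --type merge \
  -p '{"spec":{"renewBefore":"2159h"}}'

# After the certificate has been renewed, remove the property again:
kubectl patch certificate my-cert --type json \
  -p '[{"op": "remove", "path": "/spec/renewBefore"}]'
```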
Kubernetes 1.22 is supported only by NGINX Ingress Controller 1.0.0 and higher; see https://github.com/kubernetes/ingress-nginx#support-versions-table.

You need to upgrade your `nginx-ingress-controller` Bitnami Helm Chart to version 9.0.0 in `Chart.yaml`, then run `helm upgrade nginx-ingress-controller bitnami/nginx-ingress-controller`.

You should also update your ingress controller regularly, especially since version v0.34.1 is very, very old, because the ingress is normally the only entry point into your cluster from outside.
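A sketch of that upgrade, assuming the chart was installed directly as a release named `nginx-ingress-controller` in the `ingress` namespace (if you instead pin it as a dependency in your own `Chart.yaml`, bump the dependency's `version` to 9.0.0 there first):

```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

helm upgrade nginx-ingress-controller bitnami/nginx-ingress-controller \
  --namespace ingress \
  --version 9.0.0
```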