I’m running Apache OpenWhisk on k3s, installed using Helm.

Below are the invoker logs, taken several hours after a fresh install, with several functions set to run periodically. The following message appears every few seconds once the problem starts.

[2020-03-17T13:27:12.691Z] [ERROR] [#tid_sid_invokerHealth] [ContainerPool]
Rescheduling Run message, too many message in the pool, freePoolSize: 0 containers and 0 MB,
busyPoolSize: 8 containers and 4096 MB, maxContainersMemory 4096 MB, userNamespace: whisk.system,
action: ExecutableWhiskAction/whisk.system/[email protected], needed memory: 128 MB,
waiting messages: 24

Here are the running pods. Notice all the function pods have an age of 11+ hours.

NAME                                                              READY   STATUS      RESTARTS   AGE
openwhisk-gen-certs-n965b                                         0/1     Completed   0          14h
openwhisk-init-couchdb-4s9rh                                      0/1     Completed   0          14h
openwhisk-install-packages-pnvmq                                  0/1     Completed   0          14h
openwhisk-apigateway-78c64dd7c9-2gsw6                             1/1     Running     2          14h
openwhisk-couchdb-844c6df68f-qrxq6                                1/1     Running     2          14h
openwhisk-wskadmin                                                1/1     Running     2          14h
openwhisk-redis-77494b8d44-gkmlt                                  1/1     Running     2          14h
openwhisk-zookeeper-0                                             1/1     Running     2          14h
openwhisk-kafka-0                                                 1/1     Running     2          14h
openwhisk-controller-0                                            1/1     Running     2          14h
openwhisk-nginx-5f795dd747-c228s                                  1/1     Running     4          14h
openwhisk-cloudantprovider-69fd94b6f6-x88f4                       1/1     Running     2          14h
openwhisk-kafkaprovider-544fbfdcc7-kn29p                          1/1     Running     2          14h
openwhisk-alarmprovider-58c5454cc8-q4wbw                          1/1     Running     2          14h
openwhisk-invoker-0                                               1/1     Running     2          14h
wskopenwhisk-invoker-00-1-prewarm-nodejs10                        1/1     Running     0          14h
wskopenwhisk-invoker-00-6-prewarm-nodejs10                        1/1     Running     0          13h
wskopenwhisk-invoker-00-15-whisksystem-checkuserload              1/1     Running     0          13h
wskopenwhisk-invoker-00-31-whisksystem-guacscaleup                1/1     Running     0          12h
wskopenwhisk-invoker-00-30-whisksystem-guacscaledown              1/1     Running     0          12h
wskopenwhisk-invoker-00-37-whisksystem-functionelastalertcheckd   1/1     Running     0          11h
wskopenwhisk-invoker-00-39-whisksystem-checkuserload              1/1     Running     0          11h
wskopenwhisk-invoker-00-40-whisksystem-functionelastalertcheckd   1/1     Running     0          11h
wskopenwhisk-invoker-00-42-whisksystem-guacscaleup                1/1     Running     0          11h
wskopenwhisk-invoker-00-43-whisksystem-functionelastalertcheckd   1/1     Running     0          11h

Shouldn’t OpenWhisk be killing these pods after they reach the timeout? The functions all have a timeout of either 3 or 5 minutes, but OpenWhisk doesn’t seem to enforce it.
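For reference, these per-action timeouts are set in milliseconds via the wsk CLI; a generic sketch with a placeholder action name:

$ wsk action update <action> --timeout 300000    # 5 minutes, in milliseconds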

One other thing I noticed was the “timeout” annotation being set to false on the activations.

$ wsk activation get ...
{
    "annotations": [
        ...
        {
            "key": "timeout",
            "value": false
        },
        ...
    ],
    ...
}
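One way to pull out just this annotation from an activation, assuming jq is available (the first line of wsk activation get output is a status line, so it is stripped before parsing):

$ wsk activation list --limit 5
$ wsk activation get <activationId> | tail -n +2 | jq '.annotations[] | select(.key == "timeout")'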

2 Answers


  1. Chosen as BEST ANSWER

    OK, I fixed this by changing the invoker container factory implementation to docker. I'm not sure why the kubernetes implementation fails to kill pods (and release their memory), but we are using Docker as the container runtime for k3s.

    To set this, change invoker.containerFactory.impl to docker in the Helm chart values: https://github.com/apache/openwhisk-deploy-kube/blob/master/helm/openwhisk/values.yaml#L261

    I also increased the invoker memory (invoker.jvmHeapMB) to 1024: https://github.com/apache/openwhisk-deploy-kube/blob/master/helm/openwhisk/values.yaml#L257
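    For example, a sketch of the upgrade command; the release name (owdev), namespace, and values file here are assumptions based on the openwhisk-deploy-kube docs, so adjust them to your install:

    $ helm upgrade owdev ./helm/openwhisk -n openwhisk -f mycluster.yaml \
        --set invoker.containerFactory.impl=docker \
        --set invoker.jvmHeapMB=1024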

    Here is a link that explains the container factory setting: https://github.com/apache/openwhisk-deploy-kube/blob/master/docs/configurationChoices.md#invoker-container-factory


  2. The timeout annotation is specific to a particular activation. If the value is true, it means that activation of the corresponding function exceeded its configured maximum duration, which by default can range from 100 ms to 5 minutes (per the docs) unless changed for the system deployment as a whole.

    The pods are used to execute the functions; they stick around for some time while idle to facilitate future warm starts. The OpenWhisk invoker will eventually terminate these warm pods after an idle timeout, or when resources are required to run other pods.
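    One way to observe this lifecycle is to watch the action pods come and go; the namespace below is an assumption, so adjust it to your install:

    $ kubectl get pods -n openwhisk --watch | grep invoker-00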
