I've got an issue with two pods that serve a React SPA. Both of them are stuck in a constant crash loop, and since I'm not the person who designed this setup, I'm somewhat lost.
The failing pods just run a simple yarn start command, and its log looks like this:
kubectl logs XXX-frontend-dev-697347347-2lscc -n XXX-dev -p
yarn start v0.24.4
$ touch .env && node ./index.js
Warning: Accessing PropTypes via the main React package is deprecated, and will be removed in React v16.0. Use the latest available v15.* prop-types package from npm instead. For info on usage, compatibility, migration and more, see ...
Server started on port 3000
Here is the output of kubectl describe for one of the pods:
kubectl describe pods XXX-frontend-dev-697347347-2lscc -n XXX-dev
Name: XXX-frontend-dev-697347347-2lscc
Namespace: XXX-dev
Node: ip-172-18-111-111.ec2.internal/172.18.111.111
Start Time: Tue, 20 Feb 2018 21:12:51 -0300
Labels: app=XXX-frontend-dev
pod-template-hash=697347347
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"XXX-dev","name":"XXX-frontend-dev-697347347","uid":"9ba076a2-1689-11e8-...
Status: Running
IP: XXX
Controllers: ReplicaSet/XXX-frontend-dev-697347347
Containers:
XXX-frontend-dev:
Container ID: docker://686f3bc4f856572af1acc7e2b883c43815965dd45c264076e144ccb7a55f7902
Image: XXX.dkr.ecr.us-east-1.amazonaws.com/XXX-frontend:67b1be2753c03a187297c63879d2502ab2f59979
Image ID: docker-pullable://XXX.dkr.ecr.us-east-1.amazonaws.com/XXX-frontend@sha256:ae2da58fe09a874a7cf18f85d2ac27b5a26fdd664d9c536fdf7d1a802755cbc6
Port: 3000/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Tue, 20 Feb 2018 21:33:41 -0300
Ready: False
Restart Count: 9
Limits:
cpu: 1
memory: 500Mi
Requests:
cpu: 600m
memory: 228Mi
Liveness: http-get http://:3000/healthz delay=30s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:3000/healthz delay=30s timeout=1s period=10s #success=1 #failure=3
Environment:
NODE_ENV: development
API_BASE_URI: https://XXX
APP_ORIGIN: https://XXX
CMS_BASE_URI: https://XXX
FACEBOOK_API_KEY: <set to the key 'api-key' in secret 'facebook-credentials'> Optional: false
FACEBOOK_API_SECRET: <set to the key 'api-secret' in secret 'facebook-credentials'> Optional: false
TWITTER_API_KEY: <set to the key 'api-key' in secret 'twitter-credentials'> Optional: false
TWITTER_API_SECRET: <set to the key 'api-secret' in secret 'twitter-credentials'> Optional: false
GOOGLE_API_KEY: <set to the key 'api-key' in secret 'google-credentials'> Optional: false
GOOGLE_API_SECRET: <set to the key 'api-secret' in secret 'google-credentials'> Optional: false
STRIPE_PUBLIC_KEY: <set to the key 'public-key' in secret 'stripe-credentials'> Optional: false
SHOPIFY_APP_KEY: <set to the key 'app-key' in secret 'shopify-credentials'> Optional: false
SHOPIFY_APP_SECRET: <set to the key 'app-secret' in secret 'shopify-credentials'> Optional: false
REDIS_HOST: <set to the key 'host' in secret 'redis-credentials'> Optional: false
SUMOLOGIC_ENDPOINT: <set to the key 'endpoint' in secret 'sumologic-credentials'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-cr2kn (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-cr2kn:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-cr2kn
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
25m 25m 1 default-scheduler Normal Scheduled Successfully assigned XXX-frontend-dev-697347347-2lscc to ip-172-18-111-111.ec2.internal
25m 25m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Pulling pulling image "XXX.dkr.ecr.us-east-1.amazonaws.com/XXX-frontend:67b1be2753c03a187297c63879d2502ab2f59979"
25m 25m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Pulled Successfully pulled image "XXX.dkr.ecr.us-east-1.amazonaws.com/XXX-frontend:67b1be2753c03a187297c63879d2502ab2f59979"
25m 25m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id 388eda82061c5b53f62aab9c00d2a10cc19f64d6bf0365649c61a4f414c47005
25m 25m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id 388eda82061c5b53f62aab9c00d2a10cc19f64d6bf0365649c61a4f414c47005
23m 23m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://388eda82061c5b53f62aab9c00d2a10cc19f64d6bf0365649c61a4f414c47005:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
23m 23m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id f31c4e29097e2467a6e7e4d18610556781586df35fa0452ad262cf28f79aa06f
23m 23m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id f31c4e29097e2467a6e7e4d18610556781586df35fa0452ad262cf28f79aa06f
22m 22m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://f31c4e29097e2467a6e7e4d18610556781586df35fa0452ad262cf28f79aa06f:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
22m 22m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id f6ef675027fdb9668464c8177437f06a444e6a0aec1f5456681a0d34b3c4a246
22m 22m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id f6ef675027fdb9668464c8177437f06a444e6a0aec1f5456681a0d34b3c4a246
21m 21m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://f6ef675027fdb9668464c8177437f06a444e6a0aec1f5456681a0d34b3c4a246:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
21m 21m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id 2e2c8e96af12ad119c4484f86565047ec19e9bd9fe7b2965486e016a9117eb68
21m 21m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id 2e2c8e96af12ad119c4484f86565047ec19e9bd9fe7b2965486e016a9117eb68
20m 20m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://2e2c8e96af12ad119c4484f86565047ec19e9bd9fe7b2965486e016a9117eb68:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
20m 20m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id 8e05762ab7ff150bb2ce120ad3ae9c42ab3ceb43edcdf41eda1cb21a06243713
20m 20m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id 8e05762ab7ff150bb2ce120ad3ae9c42ab3ceb43edcdf41eda1cb21a06243713
19m 19m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id d9d37841cddabd5e50b62b7cce50c07fc5adaa1d713e611a11dda713009d9e29
19m 19m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://8e05762ab7ff150bb2ce120ad3ae9c42ab3ceb43edcdf41eda1cb21a06243713:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
19m 19m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id d9d37841cddabd5e50b62b7cce50c07fc5adaa1d713e611a11dda713009d9e29
18m 18m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://d9d37841cddabd5e50b62b7cce50c07fc5adaa1d713e611a11dda713009d9e29:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
18m 17m 7 kubelet, ip-172-18-111-111.ec2.internal Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "XXX-frontend-dev" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=XXX-frontend-dev pod=XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)"
17m 17m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id a12ed013b3a94e756496dfb440a46b2c0a534554abe1f7516812c0ebc53c3c74
16m 16m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id a12ed013b3a94e756496dfb440a46b2c0a534554abe1f7516812c0ebc53c3c74
16m 16m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://a12ed013b3a94e756496dfb440a46b2c0a534554abe1f7516812c0ebc53c3c74:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
16m 13m 14 kubelet, ip-172-18-111-111.ec2.internal Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "XXX-frontend-dev" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=XXX-frontend-dev pod=XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)"
13m 13m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Created Created container with id e3abfac0eb9b5059901d660107ef58568108f5cdb7136855da1c04bb675f4c56
13m 13m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Started Started container with id e3abfac0eb9b5059901d660107ef58568108f5cdb7136855da1c04bb675f4c56
12m 12m 1 kubelet, ip-172-18-111-111.ec2.internal spec.containers{XXX-frontend-dev} Normal Killing Killing container with id docker://e3abfac0eb9b5059901d660107ef58568108f5cdb7136855da1c04bb675f4c56:pod "XXX-frontend-dev-697347347-2lscc_XXX-dev(f485e721-169b-11e8-8888-0a5e020b5188)" container "XXX-frontend-dev" is unhealthy, it will be killed and re-created.
Basically, I discovered that exit code 137 means the process was killed by a signal (137 = 128 + 9, i.e. SIGKILL), which can happen when the container runs low on memory. But I don't know how to adjust that with kops or Helm. How should I be thinking about this? I've been stuck on this problem for a few days and really don't know where to go from here.
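From what I can tell, the memory limit would live in the Deployment's container spec (or in whatever Helm values feed it), roughly like the snippet below (the raised limit is just a made-up number), but I'm not sure this is the right place to change it:

resources:
  requests:
    cpu: 600m
    memory: 228Mi
  limits:
    cpu: 1
    memory: 1Gi   # raised from 500Mi as an experiment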
2 Answers
This tells you that the process running inside the container exited with an error. This is hardly something to debug at the Kubernetes level; rather, it looks like the container itself is missing something or is misconfigured. Check the logs of the failing container for hints, and enable debug logging if possible. Otherwise you can change the default command/entrypoint to something like
sleep 1d
so the container starts and stays up, and then kubectl exec into the pod to run the actual software in an interactive session and debug why it fails (a sketch of this follows below).
Side note: if the pod is terminated due to its memory limit, you should see OOMKilled in the describe output, unless the OOM kill happens not because the pod reached its own limit but because of general node exhaustion (the pod limit was not reached, but the node's capacity was).
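A minimal sketch of that approach, assuming the Deployment uses the container shown in the describe output above (adjust names to your manifests). While the command is overridden you may also want to remove or relax the liveness probe, otherwise the kubelet will keep killing the idle container:

# In the Deployment's container spec, temporarily replace the entrypoint
# so the container stays up without starting the app:
containers:
  - name: XXX-frontend-dev
    image: XXX.dkr.ecr.us-east-1.amazonaws.com/XXX-frontend:67b1be2753c03a187297c63879d2502ab2f59979
    command: ["sleep", "1d"]

# A new pod will be created; open a shell in it and start the app by hand:
kubectl exec -it <new-pod-name> -n XXX-dev -- sh
# inside the container:
yarn start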
The describe output tells you the pod is unhealthy, so probe failures are a likely cause.
Make sure the probe definitions are correct (does the app actually serve http://:3000/healthz?), and make sure initialDelaySeconds and timeoutSeconds are long enough (1 second can be a bit short for a timeout, and sometimes 30 seconds is not enough for startup).
See Configure Liveness and Readiness Probes in the Kubernetes documentation.
Possible scenarios: the pod took more than 30 seconds to start, so Kubernetes decided it failed to start and killed/restarted it; the health URL is incorrect; or the 1 second timeout is too short, and Kubernetes again considers the probe failed and kills/restarts the pod.
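For example, more forgiving probes might look something like this in the container spec (the numbers are only a starting point; measure how long the app really needs to boot):

livenessProbe:
  httpGet:
    path: /healthz
    port: 3000
  initialDelaySeconds: 60   # give node/yarn more time to start
  timeoutSeconds: 5         # 1s is often too tight
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /healthz
    port: 3000
  initialDelaySeconds: 30
  timeoutSeconds: 5
  periodSeconds: 10

You can also verify the endpoint responds at all from inside the pod, assuming wget is available in the image: kubectl exec <pod> -n XXX-dev -- wget -qO- http://localhost:3000/healthz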
Please check the kubelet log as well. It also tells a lot if there are errors such as volume mount failures, etc.
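For example, on the node (ip-172-18-111-111.ec2.internal here), assuming the kubelet runs as a systemd unit:

# on the node, after SSHing in:
journalctl -u kubelet --since "1 hour ago" | grep XXX-frontend-dev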