Issue
I have a Redis K8s deployment that links to a separate service, with a heavily reduced manifest as follows (if more info is needed that’s missing let me know):
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cache
      environment: dev
  template:
    metadata:
      labels:
        app: cache
        environment: dev
    spec:
      containers:
      - name: cache
        image: marketplace.gcr.io/google/redis5
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          timeoutSeconds: 5
      volumes:
      - name: data
        nfs:
          server: "nfs-server.recs-api.svc.cluster.local"
          path: "/data"
I want to regularly redeploy Redis with a new dataset, instead of updating the existing cache.
When doing a kubectl rollout restart deployment/cache, old Redis pods are Terminated before new Redis pods are ready to accept traffic. These new Redis pods are marked READY and, as expected, the old ones are then Terminated; however, redis-cli ping on the new Redis pods returns (error) LOADING Redis is loading the dataset in memory.
It currently takes 5-10 minutes for Redis to stop loading the dataset and be ready to accept connections, but by this point they’ve been READY for the same amount of time, with active traffic directed to them as old pods have been Terminated.
My suspicion is that the exit status for this response is 0, so the readinessProbe reports READY 1/1 and the old pods are killed; however, I have not been able to find a suitable exec: command: that avoids this issue.
redis-cli info has a loading:0|1 line, and so I tested:
readinessProbe:
  exec:
    command: ["redis-cli", "info", "|", "grep loading:", "|", "grep 0"]
in the hope that for non-0 loading values, grep would return a non-zero status code and fail the readinessProbe, but this didn’t seem to work and had the same behavior as redis-cli ping, with the old pods terminated prematurely and loss of service until loading had completed.
What I want
- When deploying new Redis cache pods, I want there to be a pod ready to accept connections throughout, while the new Redis cache pods are loading dataset to memory
- Ideally in the form of a tidy readinessProbe check, but fully open to any suggestions!
- It’s also possible I’ve misunderstood the purpose of a readinessProbe so please let me know
- If possible, better understand why redis-cli ping or other readinessProbes were still triggering a READY state for the new pods, despite non-zero status codes on exec: command:
Thanks!
2 Answers
I investigated the bitnami/redis chart to find out how it implements its liveness/readiness probes.
The chart creates a health-configmap containing a shell script that uses redis-cli ping to health-check the Redis server and handles the responses.
Here is the configmap defined:
And in deployment/statefulset, just set the probe to execute this shell script:
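A minimal sketch of that idea, not the chart's actual script: the probe looks at the text of the reply rather than at the exit status of redis-cli, since (as the question suspects) redis-cli ping can exit 0 while replying LOADING. The function names is_pong and check_ping are hypothetical.

```shell
#!/bin/sh
# Sketch of a probe script; function names are hypothetical, not the chart's.

# Returns 0 only when the reply text is exactly PONG.
is_pong() {
    [ "$1" = "PONG" ]
}

# Runs the given ping command and maps its reply to an exit status.
# The reply text is inspected because the command itself may exit 0
# even when Redis answers with an error such as LOADING.
check_ping() {
    reply=$("$@" 2>&1)
    if is_pong "$reply"; then
        return 0
    fi
    echo "not ready: $reply" >&2
    return 1
}

# In the probe script itself, the last line would be:
#   check_ping redis-cli ping
```

With the script mounted into the container from the ConfigMap, the readinessProbe's exec command would run it through a shell (for example sh followed by the script path, where the mount path is whatever you chose), so the pod only becomes READY once Redis actually answers PONG.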
The following should work just fine
The key is
The whole snippet
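As a hedged sketch of a loading-based check (the helper name redis_loaded is hypothetical): an exec probe runs its command directly, without a shell, so the "|" entries in the question's command array were passed to redis-cli as literal arguments instead of creating a pipeline. Wrapping the pipeline in sh -c lets grep decide the exit status.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the INFO text on stdin reports loading:0.
# INFO lines end in \r\n, so the pattern is anchored at the start of the line only;
# the start anchor also avoids matching other fields whose names merely contain "loading:".
redis_loaded() {
    grep -q '^loading:0'
}

# Inside the pod this would be used as:
#   redis-cli info | redis_loaded
# or, inlined for an exec probe:
#   sh -c "redis-cli info | grep -q '^loading:0'"
```

In the manifest, the probe command would then be the three-element array sh, -c, and the quoted pipeline, so that a pod still loading its dataset fails the readinessProbe and the rollout keeps the old pods serving traffic.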