I would like to have a Python script that simply creates a file run at intervals in Kubernetes. First, I used a Docker image to run the Python script on a one-time basis, but I got an error.
`create_txt.py`:

```python
import numpy as np
import datetime

res = np.random.rand(1)[0]
res = np.round(res, 3) * 1000
with open(f'/home/sjw/kube/{str(int(res))}.txt', 'w') as f:
    txt = datetime.datetime.now().strftime("%H:%M:%S")
    f.write(txt)
```
Dockerfile:

```dockerfile
FROM python:3
WORKDIR /home/sjw/kube
COPY create_txt.py ./
RUN pip install numpy
CMD ["python","./create_txt.py"]
```
First, I uploaded the image to Docker Hub. Below is the manifest:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: createinterval
spec:
  containers:
  - name: createinterval
    image: idioluck/kube_create:v01
    command: ["/bin/sh"]
    args: ["python create_txt.py"]
    volumeMounts:
    - mountPath: /home/sjw/kube
      name: testvol
  volumes:
  - name: testvol
    hostPath:
      path: /home/sjw/kube
      type: DirectoryOrCreate
```
The pod status is:

```
NAME             READY   STATUS             RESTARTS      AGE
createinterval   0/1     CrashLoopBackOff   7 (90s ago)   12m
```
Ultimately, I want to use a CronJob to run a Python script that creates a file at regular intervals and saves the result to attached local storage.
Attached is the output of `kubectl describe pod createinterval`:

```
sjw@DESKTOP-O6E7MND:~/kube/docker_sample/kube_create_txt_interval$ kubectl describe pod createinterval
Name:         createinterval
Namespace:    default
Priority:     0
Node:         minikube/192.168.49.2
Start Time:   Mon, 05 Sep 2022 16:42:24 +0900
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           172.17.0.4
IPs:
  IP:  172.17.0.4
Containers:
  createinterval:
    Container ID:   docker://89d2fd4597e445bfd11dace1e06ab325572d2e3072d14df9892b31ebbc7fa7d1
    Image:          idioluck/kube_create:v02
    Image ID:       docker-pullable://idioluck/kube_create@sha256:0868e3dc569c88641a3db05adbf2be9387609f9a0d184869ac939e80b93af5bb
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 05 Sep 2022 17:08:35 +0900
      Finished:     Mon, 05 Sep 2022 17:08:35 +0900
    Ready:          False
    Restart Count:  10
    Environment:    <none>
    Mounts:
      /home/sjw/kube from testvol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5n28q (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  testvol:
    Type:          HostPath (bare host directory volume)
    Path:          /home/sjw/kube
    HostPathType:  DirectoryOrCreate
  kube-api-access-5n28q:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  27m                    default-scheduler  Successfully assigned default/createinterval to minikube
  Normal   Pulling    27m                    kubelet            Pulling image "idioluck/kube_create:v02"
  Normal   Pulled     26m                    kubelet            Successfully pulled image "idioluck/kube_create:v02" in 4.633892322s
  Normal   Created    25m (x5 over 26m)      kubelet            Created container createinterval
  Normal   Started    25m (x5 over 26m)      kubelet            Started container createinterval
  Normal   Pulled     25m (x4 over 26m)      kubelet            Container image "idioluck/kube_create:v02" already present on machine
  Warning  BackOff    109s (x117 over 26m)   kubelet            Back-off restarting failed container
```
2 Answers
You don't need to pass `command` and `args` in your Pod's manifest; you've already defined these in your Dockerfile. For running CronJobs in K8s, check out this official doc.

Please share the logs of your Pod by running `kubectl logs createinterval`.
There are several significant problems in your Kubernetes setup; I'll walk through them. You ultimately want to end up with a CronJob-based setup rather than a bare Pod.
In your original file, you declare `kind: Pod`. You almost never want to create a bare Pod. There are some operational problems with them (you can't change one at all once it's created, for example), and if a node becomes overcommitted, a Pod can move into Evicted status and not be replaced. For your use case you want a CronJob; for a more typical long-running server application you'd typically want a Deployment.
The `command:` and `args:` lines in your pod spec overwrite the `ENTRYPOINT` and `CMD` in your Dockerfile, respectively, and I'd just delete them. This particular construction is broken and probably the actual cause of your error (though as always, double-check the `kubectl logs` of your pod): it looks for a file named `python create_txt.py`, including the space in the filename, then tries to execute that file as a shell script. If you had to override it, then `command: [python, create_txt.py]` would be the simplest thing that worked.

The mount path `/home/sjw/kube` is also the `WORKDIR` of your image, which means the volume mount is hiding all of the code in the image. You may be used to a feature of Docker named volumes where image content is copied into a volume on first use; this does not happen on Kubernetes (or, for that matter, in Docker if the image is updated, or with Docker bind mounts, or …), and I'd avoid relying on this capability. You should delete this mount.

A `hostPath` volume picks up the named directory on whatever node the pod happens to be running on. If the pod is recreated on a different node, the `hostPath` mount will get a different directory, and the original volume content will be … maybe not "lost" per se, but "misplaced". Again, you almost never want to use `hostPath` volumes.

You may want to reconsider this setup. "Files" turn out to be surprisingly hard to manage in Kubernetes. If you look at the list of types of volumes, and more specifically the table in the PersistentVolume Volume Mode documentation, you'll notice that many of the volume types that are easier to get only support ReadWriteOnce access; this generally means you can't use the same volume with multiple replicas of your application pod, or with this cron job and your application at the same time. (Technically it can work by requiring all of the replicas to be scheduled on the same node, but they might not fit there, and you often want to protect against single-node failure.)
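To make the access-mode point concrete, here is a minimal PersistentVolumeClaim sketch; the claim name and storage size are made up for this example:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kube-create-output    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce           # only one node can mount this volume at a time
  resources:
    requests:
      storage: 1Gi            # placeholder size
```

A pod that mounts this claim avoids the node-dependence of `hostPath`, but the `ReadWriteOnce` mode still ties every consumer of the volume to a single node.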
Restructuring this workflow to "create a record in a database" or "make an HTTP call to a special backend endpoint" won’t have this problem.
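If you do stay with files, a CronJob sketch putting the above fixes together might look like the following; the schedule, mount path, and claim name are placeholders, and the container relies on the image's own `CMD` to run the script:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: createinterval
spec:
  schedule: "*/5 * * * *"            # placeholder: every five minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: createinterval
            image: idioluck/kube_create:v01
            # no command:/args: -- the image's CMD runs create_txt.py
            volumeMounts:
            - mountPath: /home/sjw/out   # not the image WORKDIR, so code isn't hidden
              name: outvol
          volumes:
          - name: outvol
            persistentVolumeClaim:
              claimName: kube-create-output   # hypothetical claim
```

Note that the script would then need to write into the new mount path (`/home/sjw/out` here) rather than into the `WORKDIR`.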