I’m trying to deploy an ephemeral Postgres database for our development team to bring up and down without persisting any data. Currently, during testing, I’m deploying the following items with kubectl apply -f postgres.
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-secret
  namespace: dev-cs
  labels:
    app: postgres
data:
  POSTGRES_DB: postgres
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: postgres
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: dev-cs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: 'postgres:latest'
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-secret
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgresdata
      volumes:
        - name: postgresdata
          persistentVolumeClaim:
            claimName: postgres-volume-claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-volume-claim
  namespace: dev-cs
  labels:
    app: postgres
spec:
  storageClassName: gp2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-volume
  namespace: dev-cs
  labels:
    type: local
    app: postgres
spec:
  storageClassName: gp2
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/postgresql
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  #namespace: dev-cs
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
  selector:
    app: postgres
I’m getting this error, and from googling it looks like the PV is not getting blown away after deleting the deployment.
2024-07-10 16:19:43.989 UTC [1] LOG: starting PostgreSQL 14.12 (Debian 14.12-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-07-10 16:19:43.989 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-07-10 16:19:43.989 UTC [1] LOG: listening on IPv6 address "::", port 5432
2024-07-10 16:19:43.992 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-07-10 16:19:43.995 UTC [26] LOG: database system was shut down at 2024-07-09 16:33:32 UTC
2024-07-10 16:19:43.996 UTC [26] LOG: invalid record length at 0/16FE650: wanted 24, got 0
2024-07-10 16:19:43.996 UTC [26] LOG: invalid primary checkpoint record
2024-07-10 16:19:43.996 UTC [26] PANIC: could not locate a valid checkpoint record
I have tried kubectl delete -f postgres and redeploying a few times with postgres:14 and postgres:latest.
My thought is that the PV is re-using the same portion of disk on the EC2 node itself, since when I create the PV/PVC, AWS is not provisioning a new EBS volume but seemingly using the same root EBS volume already attached to the EC2 node.
2 Answers
The behavior you describe is exactly what a hostPath: type volume does. On whichever node the Pod happens to be scheduled, it uses that directory on the host. If a Pod gets deleted and recreated on another node, the old data will be, well, misplaced at least; if a second Pod gets created on the original node, it will find the older data directory. Unless you’re writing something like a log aggregator that runs as a DaemonSet on every node, you should almost never use a hostPath: volume.
In most Kubernetes installations, you also don’t need to manually create PersistentVolumes. In EKS, if you create a PersistentVolumeClaim, the cluster will automatically create a PersistentVolume backed by an EBS volume. The cluster itself is capable of detaching the EBS volume from one node and reattaching it to another, so if the Pod is deleted and recreated somewhere else, the data will follow it.
In your setup, it might be enough to delete the manual PersistentVolume. (You may also have to delete and recreate the PersistentVolumeClaim, and you may have to delete any extant Pods as well.) You don’t seem to explicitly refer to the manually-created PersistentVolume.
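For illustration, a dynamically provisioned claim on EKS can be as small as the sketch below (assuming an EBS-backed StorageClass named gp2 exists in your cluster, as in your manifest); note that no PersistentVolume object is declared, the provisioner creates one for you:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-volume-claim
  namespace: dev-cs
spec:
  storageClassName: gp2   # or omit to use the cluster's default StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi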
Also consider using a StatefulSet here, and including the PVC setup in its volumeClaimTemplates: field. Since multiple replicas of a PostgreSQL database can’t share a data store, if you ever do wind up with more than one replica, you’ll need a separate PVC for each, which a StatefulSet can do automatically; a sketch of that shape follows.
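A minimal sketch of that shape, reusing the names from your manifests (postgres-secret, gp2, dev-cs) and relying on dynamic provisioning; treat it as an outline rather than a drop-in replacement:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: dev-cs
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: 'postgres:14'
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-secret
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgresdata
  volumeClaimTemplates:          # one PVC per replica, created automatically
    - metadata:
        name: postgresdata
      spec:
        storageClassName: gp2
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi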
You are using EBS as your database storage in your spec.
You can use emptyDir instead. hostPath is a more advanced use of storage that involves the file system on the host. Just changing your volumes block will do (see the sketch below): no other PVC/PV is required. You get a fresh volume every time you start a new pod, and the volume is automatically removed every time you delete the pod.
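As a rough sketch, assuming the rest of the Deployment stays as posted, the volumes block in the pod spec would become:

      volumes:
        - name: postgresdata
          emptyDir: {}   # created empty when the pod starts, removed when the pod is deleted

The PersistentVolumeClaim and PersistentVolume manifests can then be dropped entirely, which matches the ephemeral, nothing-persisted behaviour you described.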