I have a Postgres 10.3 Docker container both in production and on localhost.
In a previous question, I had to restore a dump that was archived with 10.5. Thanks to the answer, I used the plain format to do so, but that is only a temporary solution.
I'd like to know if there is an easy way to upgrade the Postgres version of my Docker container on localhost and in production.
On localhost, I have many databases and schemas for development and exploration purposes.
In production, there are far fewer, but of course the data is far more important.
I'd like to upgrade to a new version of Postgres without jeopardizing the data and schemas.
On localhost, my host OS is macOS 10.15 Catalina. In production, the host OS is Ubuntu Server 18.04 (headless).
Both production and localhost use the same Dockerfile config:
FROM postgres:10.3

COPY ./maintenance /usr/local/bin/maintenance
RUN chmod +x /usr/local/bin/maintenance/*
RUN mv /usr/local/bin/maintenance/* /usr/local/bin \
    && rmdir /usr/local/bin/maintenance
I did find this https://github.com/docker-library/postgres/issues/37#issuecomment-431317584, but I don't have the conceptual understanding of what that comment is suggesting.
I also found this repository: https://github.com/bwbroersma/docker-postgres-upgrade
I'm not sure how these two approaches are the same or different.
I hope to get advice here from someone experienced with both Docker and Postgres.
What I have tried
This is my original local.yml for Docker Compose ("local" because it is for the local development environment):
version: "3.7"
volumes:
postgres_data_local: {}
postgres_backup_local: {}
services:
django:
build:
context: .
dockerfile: ./compose/local/django/Dockerfile
image: eno-a3-django_local_django
depends_on:
- postgres
- mailhog
- redis
volumes:
- .:/app
env_file:
- ./.envs/.local/.django
- ./.envs/.local/.postgres
ports:
- "8000:8000"
command: /start
postgres:
build: ./compose/production/postgres/
image: eno-a3-django_production_postgres
volumes:
- postgres_data_local:/var/lib/postgresql/data
- postgres_backup_local:/backups
env_file:
- ./.envs/.local/.postgres
ports:
- "5432:5432"
mailhog:
image: mailhog/mailhog:v1.0.0
ports:
- "8025:8025"
redis:
build: ./compose/production/redis/
container_name: redis
restart: always
I then thought I would create a new Docker container, so I changed it to this:
version: "3.7"
volumes:
postgres_data_local: {}
postgres_backup_local: {}
services:
django:
build:
context: .
dockerfile: ./compose/local/django/Dockerfile
image: eno-a3-django_local_django
depends_on:
- postgres
- mailhog
- redis
- postgres_new
volumes:
- .:/app
env_file:
- ./.envs/.local/.django
- ./.envs/.local/.postgres
ports:
- "8000:8000"
command: /start
postgres:
build: ./compose/production/postgres/
image: eno-a3-django_production_postgres
volumes:
- postgres_data_local:/var/lib/postgresql/data
- postgres_backup_local:/backups
env_file:
- ./.envs/.local/.postgres
ports:
- "5432:5432"
postgres_new:
build: ./compose/production/postgres_new/
image: eno-a3-django_production_postgres_new
volumes:
- postgres_data_local:/var/lib/postgresql/data
- postgres_backup_local:/backups
env_file:
- ./.envs/.local/.postgres_new
ports:
- "15432:5432"
mailhog:
image: mailhog/mailhog:v1.0.0
ports:
- "8025:8025"
redis:
build: ./compose/production/redis/
container_name: redis
restart: always
Notice how I use the same volumes for the postgres_new container.
The Dockerfile for postgres_new is:
FROM postgres:10.13

COPY ./maintenance /usr/local/bin/maintenance
RUN chmod +x /usr/local/bin/maintenance/*
RUN mv /usr/local/bin/maintenance/* /usr/local/bin \
    && rmdir /usr/local/bin/maintenance
When I ran my Docker build and logged in via port 15432, I could see my old database schema, etc.
It appears that both containers can share the same data via the same volume.
I then restored a 10.5 archive file into this Docker container, and it succeeded.
The commands I used to restore, run from my host OS, are:
docker cp ~/path/to/10.5.dump eno-a3-django_postgres_new_1:/backups
docker exec eno-a3-django_postgres_new_1 pg_restore -U debug -d 1013-replicatelive /backups/10.5.dump
Because I thought both containers were talking to the same volume, I connected to the old Postgres container via 5432 and noticed that the new database I had created via the new 10.13 container was not there.
Still, everything appears to work. Can I simply destroy the older container without accidentally destroying my existing data?
However...
When I changed a value in a database (one that I thought was shared with the new Postgres container) via port 5432, i.e. via the old Postgres container, the change was not visible in the corresponding database in the new Postgres container.
After I commented out the old Postgres container in local.yml, I brought up only the new Postgres container and made it use host port 5432. I can now see both the new schema (restored via host port 15432) and the changes to the common database schema. So I guess this method works.
But why does it work? Is it because the volume is reused?
3 Answers
DISCLAIMER: I'm not a Postgres expert; consider this answer as coming from a general Docker background.
Containers and volumes (and images, for that matter) are separate entities in Docker. You can share one volume between many containers, but since that pretty much amounts to sharing a filesystem, you should avoid having two different applications access one set of files concurrently. You can also delete containers without affecting your volumes or images (there are options to prune everything; there is plenty of information on SO about how to do that).
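For example, removing a container does not remove the named volume behind it. A quick way to convince yourself of that (container and volume names below are illustrative; Compose prefixes them with your project name):

    docker volume ls | grep postgres_data_local    # the volume exists
    docker stop eno-a3-django_postgres_1           # stop the old postgres container
    docker rm eno-a3-django_postgres_1             # remove the container only
    docker volume ls | grep postgres_data_local    # the volume (and its data) is still there
    docker volume inspect eno-a3-django_postgres_data_local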
Why it worked
I am assuming that Postgres loads its list of databases from /var/lib/postgresql/data on startup, so your changes to the new database would likely not have propagated to the other container straight away, but ended up being visible after you restarted it. It appears that your example ended up working fine because you restored your backup into a different database, so no corruption has taken place. To me this seems like an accident.
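If you want to see what each container actually serves, you can list the databases each one knows about (reusing the -U debug user from your restore commands; container names are from your compose project and may differ):

    docker exec eno-a3-django_postgres_1 psql -U debug -l        # databases seen by the old 10.3 container
    docker exec eno-a3-django_postgres_new_1 psql -U debug -l    # databases seen by the new 10.13 container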
The backup-restore
From what I can see in the GitHub links you pointed at, both use separate volumes for /var/lib/postgresql/data (exactly to avoid concurrent modification) but share a volume for backups. They then dump the old DB onto the shared volume and pipe it through to the new DB.
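A minimal sketch of that dump-and-restore flow, assuming the new container uses its own (initially empty) data volume and only /backups is shared; container names, user, and file name are illustrative:

    # dump everything from the old 10.3 container into the shared /backups volume
    docker exec eno-a3-django_postgres_1 pg_dumpall -U debug -f /backups/all.sql

    # restore it into the new 10.13 container, which sees the same /backups volume
    docker exec eno-a3-django_postgres_new_1 psql -U debug -d postgres -f /backups/all.sql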
Custom pg_upgrade image
This is where you build a container with both source and target versions and run the newer pg_upgrade as per the official guide. That should perform the upgrade and write the binary DB files into a location of your choice. You can then mount this data volume onto a fresh postgres container.
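For reference, a pg_upgrade invocation roughly follows this shape (only relevant for major-version jumps; the paths below are the Debian-style layout used by the official images and are illustrative):

    # the new data directory must already be initialised with the new version's initdb
    pg_upgrade \
      --old-bindir=/usr/lib/postgresql/10/bin \
      --new-bindir=/usr/lib/postgresql/11/bin \
      --old-datadir=/var/lib/postgresql/10/data \
      --new-datadir=/var/lib/postgresql/11/data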
In place upgrade (minor versions)
Since the pg_upgrade documentation claims it's not needed for minor releases, it's probably safe to assume that the file layout does not change between those. Then you might not even need to spin up another container: just upgrade the postgres image in your docker-compose file and keep using the old volume. This would save you some hassle. Having said that, this is probably your last choice, with a lot of testing required.
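For a 10.3 to 10.13 bump specifically, that could be as simple as changing the image tag and rebuilding, ideally after taking a backup first; a sketch against your local.yml (commands are illustrative and not verified against your setup):

    # back up first, just in case
    docker exec eno-a3-django_postgres_1 pg_dumpall -U debug -f /backups/pre-upgrade.sql

    # change FROM postgres:10.3 to FROM postgres:10.13 in compose/production/postgres/Dockerfile, then:
    docker-compose -f local.yml build postgres
    docker-compose -f local.yml up -d postgres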
Disclaimer
Like timur, I am not a Postgres expert.
When possible, get your DB managed by somebody else, like a cloud provider.
In particular for production, I would strongly consider the advice of a database administrator for both the setup and your current operation. Cloud environments often offer managed databases with backup and support. This is more expensive, but you don't need to spend time trying to make things work and you don't risk your prod data. This is something important to keep in mind.
I have also heard many times that containers are not quite there yet performance-wise for databases and volumes, in particular for DBs that were not designed for that.
To fix your issue locally
From what I understand, in test this is easy: you can simply lock access to the DB, back it up, run another container with the version you want, restore the backup, check that everything is fine, allow access to the DB through that new container, and then delete the old container.
To fix the issue in production
You should be able to do the same; the problem is the downtime. You could put the DB in read-only mode if that's acceptable. But overall, more than being a Docker issue, this is a DB issue.
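If a read-only window is acceptable, one way (among others) to enforce it at the Postgres level is per database; the database name below is a placeholder:

    docker exec eno-a3-django_postgres_1 psql -U debug -d postgres \
      -c "ALTER DATABASE mydb SET default_transaction_read_only = on;"
    # takes effect for new sessions; revert with "... = off;" once the upgrade is done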
Check the Postgres documentation on how to prevent downtime, perform upgrades on the fly, etc. Maybe you should have a cluster and update it node by node, or something like that. But in any case this isn't really Docker-related; it would work the same with 3 containers or 3 physical hosts.
Advice
Look at the Postgres documentation on how to achieve what you want to achieve, and mostly do it as if you were not using containers. This should give you what you want.
The great thing is that, on top of this, you can always easily keep the old container, backup, and volume available, and do the whole thing without much risk.
On top of all of that, once you are sure of what you want to do, you should do it in a sandbox first: locally, create an environment similar to prod, potentially simulate incoming traffic, try your strategy like that, and check that all goes well. Once you are really confident that things will work fine, then you can proceed.
During the upgrade process you need access to the binaries of both versions. There are special Docker images available that have those, e.g. https://github.com/tianon/docker-postgres-upgrade
So you would run the image that has both versions to do the upgrade, and after that you can switch back to the normal image of the new version.
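From memory, that image is run along these lines, with the old and new data directories mounted at version-specific paths; double-check the tag names and mount points against the repo's README before relying on this:

    docker run --rm \
      -v old_postgres_data:/var/lib/postgresql/10/data \
      -v new_postgres_data:/var/lib/postgresql/11/data \
      tianon/postgres-upgrade:10-to-11
    # afterwards, run your normal postgres container of the new version against new_postgres_data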