OK, I am probably being very stupid, but anyway:
How can I install additional pip packages via Airflow's docker-compose file?
I am assuming there should be standard functionality to pick up a requirements.txt
or something similar. When inspecting their repo, I do see some ENV variables like ADDITIONAL_PYTHON_DEPS
that suggest this should be possible, but setting these in the docker-compose file doesn't actually install the libraries.
version: '3'
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'
    ADDITIONAL_PYTHON_DEPS: python-bitvavo-api
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ./requirements.txt:/requirements.txt
Obviously my Docker experience is very limited, but what am I missing?
2 Answers
There is a pretty detailed guide on how to achieve what you are looking for in the Airflow docs here. Depending on your requirements, it may be as simple as extending the original image with a FROM directive in a new Dockerfile, or you may need to fully customize the image to suit your needs. If you go with the "Extending the image" approach, your new Dockerfile will be something like this:
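A minimal sketch, keeping the image tag from the question and installing from the requirements.txt you are already mounting:

FROM apache/airflow:2.0.1
# Install extra Python packages on top of the stock image.
# pip runs as the image's default non-root "airflow" user, as the docs recommend.
COPY requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt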
Then you could just add something like this to the docker-compose file:
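For example, replacing the image: line with a build: directive so Compose builds your extended image (a sketch; the commented line shows what it replaces):

x-airflow-common:
  &airflow-common
  build: .
  # image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}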
Finally, build your image and bring everything back up with Compose. See the docs for the full explanation. Hope that works for you!
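Assuming the Dockerfile sits next to your docker-compose.yaml, that would be something like:

docker-compose down
docker-compose build
docker-compose up -d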
Extending the image is one way. Another way is adding the package in the docker-compose file itself.
For example, say you want to pip install apache-airflow-providers-apache-hdfs.
Then you go to the docker-compose file and add the package there, as sketched below.
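Presumably this refers to the _PIP_ADDITIONAL_REQUIREMENTS environment variable, which the official image's entrypoint honors from Airflow 2.1.1 onward; the docs describe it as a quick-test option rather than a production solution, since the packages are reinstalled on every container start. A minimal sketch:

x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.1}
  environment:
    &airflow-common-env
    # installed by the entrypoint each time a container starts (2.1.1+ only)
    _PIP_ADDITIONAL_REQUIREMENTS: apache-airflow-providers-apache-hdfs

For anything beyond quick experiments, the docs still recommend building a custom image as in the first answer.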