OK, I am probably being very stupid, but anyway:
How can I install additional pip packages via the docker-compose file of Airflow?

I am assuming that there should be some standard functionality to pick up a requirements.txt or something similar. When inspecting their repo, I do see some ENV variables like ADDITIONAL_PYTHON_DEPS that suggest this should be possible, but setting these in the docker-compose file doesn't actually install the libraries.

version: '3'
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'
    ADDITIONAL_PYTHON_DEPS: python-bitvavo-api

  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ./requirements.txt:/requirements.txt

Obviously my Docker experience is very limited, but what am I missing?

2 Answers


  1. The Airflow docs have a pretty detailed guide on how to achieve what you are looking for. Depending on your requirements, this may be as easy as extending the original image with a FROM instruction in a new Dockerfile, or you may need to customize the image to suit your needs. Note that ADDITIONAL_PYTHON_DEPS is a build argument that is only used when the image is built, which is why setting it as an environment variable in the docker-compose file does not install anything.

    If you go with the "extending the image" approach, your new Dockerfile will look something like this:

    FROM apache/airflow:2.0.1
    USER root
    RUN apt-get update \
      && apt-get install -y --no-install-recommends \
             build-essential my-awesome-apt-dependency-to-add \
      && apt-get autoremove -yqq --purge \
      && apt-get clean \
      && rm -rf /var/lib/apt/lists/*
    USER airflow
    RUN pip install --no-cache-dir --user my-awesome-pip-dependency-to-add
    

    Then add something like this to the docker-compose file:

    ...
    version: "3"
    x-airflow-common: &airflow-common
      build: . # this is optional
      image: ${AIRFLOW_IMAGE_NAME:-the_name_of_your_extended_image}
      ...
    ...
    

    Finally, build your image and bring everything back up with compose. See the docs for details and a full explanation. Hope that works for you!
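
    If you would rather keep your packages in a requirements.txt, as in the question, the same approach can install from that file. A minimal sketch, assuming requirements.txt sits next to the Dockerfile (the /requirements.txt target path is only an illustration):

    ```dockerfile
    FROM apache/airflow:2.0.1
    # Assumption: requirements.txt lives in the build context, next to this Dockerfile.
    COPY requirements.txt /requirements.txt
    # Installed as the airflow user, mirroring the --user install in the Dockerfile above.
    RUN pip install --no-cache-dir --user -r /requirements.txt
    ```

    With this in place, the ./requirements.txt volume mount from the question should no longer be needed, because the packages are baked into the image at build time.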

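  2. Extending the image is one way. Another is to list the package in the docker-compose file itself.

    For example, say you want to pip install apache-airflow-providers-apache-hdfs.
    In the docker-compose file, add it to _PIP_ADDITIONAL_REQUIREMENTS in the common environment. As far as I know, this variable is only picked up by newer images (Airflow 2.1.1 and later), and the packages are reinstalled every time a container starts, so it is better suited to quick tests than to production: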

    x-airflow-common:
      &airflow-common
      image: airflow_melodie1:test
      environment:
        &airflow-common-env
        ......
        _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-apache-airflow-providers-apache-hdfs other_packages}
    