
I am trying to get DockerOperator to work with Airflow on my Mac. I am running Airflow based on the Puckel image with small modifications.

Dockerfile, built as puckel-airflow-with-docker-inside:

FROM puckel/docker-airflow:latest

USER root
RUN groupadd --gid 999 docker \
 && usermod -aG docker airflow
USER airflow

docker-compose-CeleryExecutor.yml:

version: '2.1'

services:
    redis:
        image: 'redis:5.0.5'

    postgres:
        image: postgres:9.6
        environment:
            - POSTGRES_USER=airflow
            - POSTGRES_PASSWORD=airflow
            - POSTGRES_DB=airflow
    webserver:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - postgres
            - redis
        environment:
            - LOAD_EX=n
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
        volumes:
            - ./requirements.txt:/requirements.txt
            - ./dags:/usr/local/airflow/dags
        ports:
            - "8080:8080"
        command: webserver
        healthcheck:
            test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
            interval: 30s
            timeout: 30s
            retries: 3

    flower:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - redis
        environment:
            - EXECUTOR=Celery
        ports:
            - "5555:5555"
        command: flower

    scheduler:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - webserver
        volumes:
            - ./dags:/usr/local/airflow/dags
            - ./requirements.txt:/requirements.txt
        environment:
            - LOAD_EX=n
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery

        command: scheduler

    worker:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - scheduler
        volumes:
          - ./dags:/usr/local/airflow/dags
          - ./requirements.txt:/requirements.txt
        environment:
          - DOCKER_HOST=tcp://socat:2375
          - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
          - EXECUTOR=Celery
        command: worker
    socat:
        image: bpack/socat
        command: TCP4-LISTEN:2375,fork,reuseaddr UNIX-CONNECT:/var/run/docker.sock
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
        expose:
          - "2375"

Task/operator definition in the DAG:

DockerOperator(
    task_id='docker_command',
    image='centos:latest',
    api_version='auto',
    auto_remove=True,
    command="/bin/sleep 30",
    docker_url="unix://var/run/docker.sock",
    network_mode="bridge",
    dag=dag
)

Full error log for the Docker task after triggering the DAG:

*** Log file does not exist: /usr/local/airflow/logs/tutorial/docker_command/2020-04-13T11:20:41.323461+00:00/1.log
*** Fetching from: http://6f57f4c44662:8793/log/tutorial/docker_command/2020-04-13T11:20:41.323461+00:00/1.log

[2020-04-13 11:20:47,627] {{taskinstance.py:655}} INFO - Dependencies all met for <TaskInstance: tutorial.docker_command 2020-04-13T11:20:41.323461+00:00 [queued]>
[2020-04-13 11:20:47,648] {{taskinstance.py:655}} INFO - Dependencies all met for <TaskInstance: tutorial.docker_command 2020-04-13T11:20:41.323461+00:00 [queued]>
[2020-04-13 11:20:47,648] {{taskinstance.py:866}} INFO - 
--------------------------------------------------------------------------------
[2020-04-13 11:20:47,648] {{taskinstance.py:867}} INFO - Starting attempt 1 of 2
[2020-04-13 11:20:47,648] {{taskinstance.py:868}} INFO - 
--------------------------------------------------------------------------------
[2020-04-13 11:20:47,660] {{taskinstance.py:887}} INFO - Executing <Task(DockerOperator): docker_command> on 2020-04-13T11:20:41.323461+00:00
[2020-04-13 11:20:47,663] {{standard_task_runner.py:53}} INFO - Started process 53 to run task
[2020-04-13 11:20:47,729] {{logging_mixin.py:112}} INFO - Running %s on host %s <TaskInstance: tutorial.docker_command 2020-04-13T11:20:41.323461+00:00 [running]> 6f57f4c44662
[2020-04-13 11:20:47,758] {{taskinstance.py:1128}} ERROR - Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/transport/unixconn.py", line 42, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line 368, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/transport/unixconn.py", line 42, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 202, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 225, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 966, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/docker_operator.py", line 262, in execute
    tls=tls_config
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 185, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 210, in _retrieve_server_version
    'Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
[2020-04-13 11:20:47,765] {{taskinstance.py:1151}} INFO - Marking task as UP_FOR_RETRY
[2020-04-13 11:20:57,585] {{logging_mixin.py:112}} INFO - [2020-04-13 11:20:57,584] {{local_task_job.py:103}} INFO - Task exited with return code 1

I can’t get it to work. Am I maybe mounting /var/run/docker.sock:/var/run/docker.sock the wrong way?

Thank you!

6 Answers


  1. I had the same problem on Linux, and thanks to How to mount docker socket as volume in docker container with correct group I solved it. Maybe my solution will help you.

    I have the following permissions on docker.sock:

    srw-rw---- 1 root docker docker.sock
    

    Dockerfile:

    FROM puckel/docker-airflow:latest
    USER root
    ARG DOCKER_GROUP_ID
    # Install the Docker Python client
    RUN pip install 'docker==4.2.0'
    # Let the airflow user access docker.sock via the docker group
    RUN groupadd -g $DOCKER_GROUP_ID docker && gpasswd -a airflow docker
    USER airflow
    

    Build Image with command:

    docker build --rm --build-arg DOCKER_GROUP_ID=`getent group docker | cut -d: -f3` -t docker-airflow .
    

    And run Container with:

    docker run -d -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock -v /path/to/dags/on/your/local/machine/:/usr/local/airflow/dags docker-airflow webserver
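    The build-arg pipeline above just extracts the third colon-separated field from the `getent group docker` output. As a minimal illustration, the same extraction in Python (the helper name and the sample group line are made up for this sketch):

    ```python
    # Sketch: pull a group's GID out of a getent-style line, i.e. the value the
    # shell pipeline `getent group docker | cut -d: -f3` feeds to DOCKER_GROUP_ID.
    def group_id(getent_line: str) -> int:
        # getent format is name:password:GID:member-list, e.g. "docker:x:999:airflow"
        _name, _passwd, gid, *_members = getent_line.strip().split(":")
        return int(gid)

    print(group_id("docker:x:999:airflow"))  # 999
    ```

    Passing that GID at build time is what makes the docker group inside the container match the group that owns docker.sock on the host.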
    
  2. In my case the problem was that the docker service was not started.

    I’m on openSUSE, and systemctl status docker.service showed it was inactive.

    So after I ran sudo systemctl start docker.service and the status command showed it active and running, docker-compose ran successfully.

  3. Found an elegant solution at the following link:

    https://onedevblog.com/how-to-fix-a-permission-denied-when-using-dockeroperator-in-airflow/

    From the link:

    There is a more elegant approach which consists of “wrapping” the file around a service (accessible via TCP).
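    This is exactly what the socat sidecar in the compose file above does; the remaining gap in the question is that the operator still points at the unix socket instead of the TCP endpoint. A rough stdlib-only sketch of the difference (`docker_transport` is a hypothetical helper, loosely mirroring how docker-py picks a transport from the URL scheme):

    ```python
    # Sketch (hypothetical helper): classify a docker_url by the transport a
    # Docker client would use. In the compose file above, the worker only has
    # DOCKER_HOST=tcp://socat:2375 and no socket file mounted into it, so a
    # DockerOperator configured with unix://var/run/docker.sock cannot connect.
    from urllib.parse import urlparse

    def docker_transport(docker_url: str) -> str:
        scheme = urlparse(docker_url).scheme
        if scheme in ("unix", "http+unix", "npipe"):
            return "socket"   # needs the socket file present inside the container
        if scheme in ("tcp", "http", "https"):
            return "network"  # needs network reachability, e.g. the socat service
        raise ValueError(f"unsupported docker_url scheme: {scheme!r}")

    print(docker_transport("unix://var/run/docker.sock"))  # socket
    print(docker_transport("tcp://socat:2375"))            # network
    ```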

  4. I faced the same problem. My setup was Airflow (airflow:2.2.2-python3.8) with docker-compose on Ubuntu 20.04.

    I was using the Docker task decorator. Below are the steps I took to resolve the error.

    Enabling TCP port 2375 for external connection to Docker

    1. Create a daemon.json file in /etc/docker:
     {"hosts": ["tcp://0.0.0.0:2375", "unix:///var/run/docker.sock"]}
    
    2. Add /etc/systemd/system/docker.service.d/override.conf:
     [Service]
     ExecStart=
     ExecStart=/usr/bin/dockerd
    
    3. Reload the systemd daemon:
     systemctl daemon-reload
    
    4. Restart docker:
     systemctl restart docker.service
    
    5. Add the docker_url parameter in the decorator:
    @task.docker(
        image="custom-image",
        multiple_outputs=True,
        do_xcom_push=False,
        docker_url="tcp://host-ip:2375",
        mount_tmp_dir=False,
        mounts=[
            Mount(source="host/path/directory", target="container/path/directory", type="bind")
        ],
    )
    def download_data():
        ...
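    Before wiring the decorator to tcp://host-ip:2375, it can help to confirm the port is actually reachable from where the task will run. A minimal stdlib sketch (the helper name is made up; host-ip remains a placeholder as above):

    ```python
    # Sketch: quick TCP reachability probe for the daemon endpoint enabled in the
    # steps above. This only checks that the socket opens; it does not speak the
    # Docker API.
    import socket

    def daemon_reachable(host: str, port: int = 2375, timeout: float = 2.0) -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    ```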
    

    Note:
    This setup provides unencrypted and unauthenticated direct access to the Docker daemon, and should be secured either by using the built-in HTTPS-encrypted socket or by putting a secure web proxy in front of it.

  5. For me, the following approach worked to get it running on my local machine.
    I took the official docker-compose.yaml from here:
    https://github.com/apache/airflow/blob/main/docs/apache-airflow/start/docker-compose.yaml

    In x-airflow-common:/volumes, I added:

    - /var/run/docker.sock:/var/run/docker.sock
    

    In x-airflow-common:/user, I changed the value to

    user: root
    

    Starting Airflow with

    docker-compose up airflow-init
    docker-compose up
    

    and the DAG with a DockerOperator runs through.

  6. Maybe my problem was a bit more generic "Run a docker command inside a docker container", but I was getting the same error as in the subject line. What worked for me was this answer: https://forums.docker.com/t/how-can-i-run-docker-command-inside-a-docker-container/337/2
