
I am trying to add a running instance of MinIO to the Airflow connections. I thought it would be as easy as this setup in the GUI (never mind the exposed credentials, this is a blocked-off environment and they will be changed afterwards):
[screenshot of the connection setup in the Airflow UI]

Airflow and MinIO are both running in Docker containers on the same Docker network. Pressing the test button results in the following error:

‘ClientError’ error occurred while testing connection: An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid.

I am curious what I am missing. The idea was to set up this connection and then use a bucket for data-aware scheduling, i.e. I want to trigger a DAG as soon as someone uploads a file to the bucket.
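
A rough sketch of the kind of DAG I have in mind is below; the connection id, bucket name, key pattern and schedule are placeholders I made up, and a sensor-based DAG that simply polls the bucket would also be fine:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor

    with DAG(
        dag_id="minio_new_file",
        start_date=datetime(2023, 1, 1),
        schedule="*/15 * * * *",  # run regularly and wait for a new file
        max_active_runs=1,
        catchup=False,
    ) as dag:
        # Wait until an object matching the pattern appears in the bucket,
        # using the MinIO-backed AWS connection configured above.
        wait_for_file = S3KeySensor(
            task_id="wait_for_file",
            bucket_name="my-bucket",
            bucket_key="incoming/*",
            wildcard_match=True,
            aws_conn_id="minio_conn",
            mode="reschedule",
            poke_interval=30,
        )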

3 Answers


  1. I was also facing the problem that the endpoint URL refused the connection. Since MinIO is actually running in a Docker container, you should give the Docker host URL in the connection extras:

    {
    "aws_access_key_id":"your_minio_access_key",
    "aws_secret_access_key": "your_minio_secret_key",
    "host": "http://host.docker.internal:9000"
    }
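
    As a quick sanity check (the connection id "minio_conn" and the bucket name are placeholders), the provider's S3Hook should then be able to reach MinIO through that connection from inside the Airflow container; depending on your amazon provider version the extra key may need to be "endpoint_url" instead of "host":

    from airflow.providers.amazon.aws.hooks.s3 import S3Hook

    # Uses the connection configured above.
    hook = S3Hook(aws_conn_id="minio_conn")
    print(hook.check_for_bucket("test-bucket"))  # True if MinIO is reachable and the bucket exists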


  2. I was also facing this error in Airflow 2.5.0.
    I found a workaround using the boto3 library, which is already built in.

    First I created a connection with these parameters:

    Connection Id: any label (Minio in my case)

    Connection Type: Generic

    Host: MinIO server IP and port

    Login: MinIO access key

    Password: MinIO secret key

    And here’s my code:

    import boto3
    from airflow.hooks.base import BaseHook
    
    conn = BaseHook.get_connection('Minio')
    
    # Note: boto3 expects the complete URL as endpoint_url, so the Host field
    # should include the scheme, e.g. http://minio:9000
    s3 = boto3.resource(
        's3',
        endpoint_url=conn.host,
        aws_access_key_id=conn.login,
        aws_secret_access_key=conn.password,
    )
    s3client = s3.meta.client

    # and then you can use boto3 methods for manipulating buckets and files,
    # for example:
    
    bucket = s3.Bucket('test-bucket')
    # Iterates through all the objects, doing the pagination for you. Each obj
    # is an ObjectSummary, so it doesn't contain the body. You'll need to call
    # get to get the whole body.
    for obj in bucket.objects.all():
        key = obj.key
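
    As a small extension of the same idea (bucket and key names are placeholders), the client can also write and read individual objects:

    # Upload a local file and read it back.
    s3client.upload_file('/tmp/example.csv', 'test-bucket', 'incoming/example.csv')

    obj = s3client.get_object(Bucket='test-bucket', Key='incoming/example.csv')
    data = obj['Body'].read()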
    
  3. I was also facing the same issue and was unable to test the connection after entering the details in the create-connection page. The connection seems to work during a DAG run but fails when testing it in the web UI. I also found the same thing mentioned in the Airflow Amazon provider's wiki page:

    Breaking changes Warning

    In this version of provider Amazon S3 Connection (conn_type="s3")
    removed due to the fact that it was always an alias to AWS connection
    conn_type="aws" In practice the only impact is you won’t be able to
    test the connection in the web UI / API. In order to restore ability
    to test connection you need to change connection type from Amazon S3
    (conn_type="s3") to Amazon Web Services (conn_type="aws") manually.
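
    If you would rather flip the stored type in code than edit it manually in the UI, a rough sketch using Airflow's ORM could look like this (the connection id "minio_conn" is a placeholder; run it inside the Airflow environment):

    from airflow import settings
    from airflow.models import Connection

    # Change the existing connection from the removed "s3" type to "aws"
    # so the "Test" button in the web UI works again.
    session = settings.Session()
    conn = session.query(Connection).filter(Connection.conn_id == "minio_conn").one()
    conn.conn_type = "aws"
    session.commit()
    session.close()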
