
I have been trying to run a simple Airflow DAG that lists the contents of an S3 bucket, but I keep getting this error: ModuleNotFoundError: No module named 'airflow.providers.amazon'

I’ve tried several pip installs recommended in similar questions, but still no luck. Here’s the Python script; my Airflow webserver shows the error message quoted above. Note that I’m using Airflow version 2.5.0.

import datetime
import logging

from airflow import DAG
from airflow.models import Variable
from airflow.operators.python_operator import PythonOperator
from airflow.hooks.S3_hook import S3Hook

def list_keys():
    hook = S3Hook(aws_conn_id='aws_credentials_old')
    bucket = Variable.get('s3_bucket')
    prefix = Variable.get('s3_prefix')
    logging.info(f"Listing Keys from {bucket}/{prefix}")
    keys = hook.list_keys(bucket, prefix=prefix)
    for key in keys:
        logging.info(f"- s3://{bucket}/{key}")


dag = DAG(
        'lesson1.exercise4',
        start_date=datetime.datetime.now())

list_task = PythonOperator(
    task_id="list_keys",
    python_callable=list_keys,
    dag=dag
)


2 Answers


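  1. The airflow.providers.amazon module is not part of core Airflow; it ships in a separate provider package. On Airflow 2.x (including 2.5.0), install that package into the same environment that runs Airflow:

    pip install apache-airflow-providers-amazon

    More info: https://pypi.org/project/apache-airflow-providers-amazon/ . The backport-providers-amazon package (https://pypi.org/project/apache-airflow-backport-providers-amazon/) is intended only for Airflow 1.10.x, so it is not the right fix here.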

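  2. You are importing from a deprecated location. In Airflow 2.x, airflow.hooks.S3_hook only forwards to the Amazon provider package, which still has to be installed. The import should be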

    from airflow.providers.amazon.aws.hooks.s3 import S3Hook
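
Both answers depend on the Amazon provider being importable from the same Python environment that the Airflow scheduler and webserver use. A minimal sketch for checking that, using only the standard library (the helper name is_importable is made up for illustration and is not part of Airflow):

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # find_spec imports parent packages first; if a parent is missing
        # (for example, airflow itself is not installed) it raises
        # ModuleNotFoundError, a subclass of ImportError, instead of
        # returning None.
        return False


if __name__ == "__main__":
    # Module names taken from the error message and the answers above.
    for name in (
        "airflow",
        "airflow.providers.amazon",
        "airflow.providers.amazon.aws.hooks.s3",
    ):
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

Run it with the same interpreter (virtualenv, container, or image) that starts Airflow. If airflow is found but airflow.providers.amazon is missing, the pip install probably went into a different environment.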
    