I have been trying to run a simple Airflow DAG that lists the contents of an S3 bucket, but I keep getting this error: ModuleNotFoundError: No module named 'airflow.providers.amazon'
I’ve tried several pip installs recommended in similar questions, but still no luck. Here’s the Python script, and below it is a screenshot of my Airflow webserver showing the error message. Note that I’m using Airflow version 2.5.0.
import datetime
import logging

from airflow import DAG
from airflow.models import Variable
from airflow.operators.python_operator import PythonOperator
from airflow.hooks.S3_hook import S3Hook


def list_keys():
    hook = S3Hook(aws_conn_id='aws_credentials_old')
    bucket = Variable.get('s3_bucket')
    prefix = Variable.get('s3_prefix')
    logging.info(f"Listing Keys from {bucket}/{prefix}")
    keys = hook.list_keys(bucket, prefix=prefix)
    for key in keys:
        logging.info(f"- s3://{bucket}/{key}")


dag = DAG(
    'lesson1.exercise4',
    start_date=datetime.datetime.now())

list_task = PythonOperator(
    task_id="list_keys",
    python_callable=list_keys,
    dag=dag
)
2 Answers
You can try installing the apache-airflow-backport-providers-amazon package, because it’s only available in the Airflow main branch.
You can find more info here: https://pypi.org/project/apache-airflow-backport-providers-amazon/
You are importing from the wrong place. It should be
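The corrected import was cut off in this answer, so what follows is only a sketch of a likely candidate, not the answerer's exact line. In Airflow 2.x the S3 hook ships in the separate apache-airflow-providers-amazon package under airflow.providers.amazon.aws.hooks.s3, and the old airflow.hooks.S3_hook module is a deprecated shim that forwards to it, which would explain why the error names airflow.providers.amazon. The helper name provider_installed is made up for illustration.

```python
import importlib.util


def provider_installed(module_name: str = "airflow.providers.amazon") -> bool:
    """Return True if the module can be located in the current environment.

    Parent packages are imported during the lookup, so a missing parent
    (for example, airflow itself) is treated as "not installed".
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of module_name is missing.
        return False


if provider_installed():
    # Provider-based import path used by Airflow 2.x.
    from airflow.providers.amazon.aws.hooks.s3 import S3Hook  # noqa: F401
else:
    print("Amazon provider not found; try: pip install apache-airflow-providers-amazon")
```

It may help to run this with the same Python interpreter that runs the Airflow scheduler and webserver, since installing the package into a different environment would leave the error in place.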