
I have been struggling with an issue in my Apache Airflow DAG (running on Amazon MWAA) for some time now. I am trying to use the snowflake-connector-python module in my DAG, and it is included in my requirements.txt file. However, when I try to run the DAG, I keep getting the following error message:

```
Broken DAG: [/usr/local/airflow/dags/my_dag.py] Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/airflow/dags/my_dag.py", line 12, in <module>
    from snowflake.connector.pandas_tools import pd_writer
ModuleNotFoundError: No module named 'snowflake'
```

I have tried everything that I could think of, including:

• Checking that the module is included in the requirements.txt file and that the file is correctly formatted (see the example entry just below this list)
• Restarting the Airflow scheduler and webserver
• Clearing the Airflow DAG cache
• Checking that the correct Python environment is being used
• Installing the module with pip in the correct environment
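
For reference, the relevant entry in my requirements.txt is a single pinned line like the following (the exact version shown is just an example):

```
snowflake-connector-python==2.9.0
```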
Despite all of these efforts, the error message persists. Does anyone have any other suggestions for how to resolve this issue? I would greatly appreciate any help or advice. Thank you.


2 Answers


  1. MWAA is a tricky beast; I hope this helps:

    • First, check the WebServer log group in CloudWatch; there is a separate requirements_install log stream that will tell you exactly what was installed during service startup. You will find the errors there (see the CLI sketch after this list).

    • This is not well documented, but if you set up a private MWAA WebServer, it will not have internet access, even for downloading packages. The solution is to package the dependencies from requirements.txt into plugins.zip and upload it to the S3 bucket that MWAA uses (a sketch of this also follows the list).

    • Also, make sure to follow the best practices for MWAA:

    https://docs.aws.amazon.com/mwaa/latest/userguide/working-dags-dependencies.html

    https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-dependencies.html#best-practices-dependencies-python-wheels-s3
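
    To illustrate the first point, here is a rough sketch of pulling that log stream with the AWS CLI. The log group name follows the airflow-<environment-name>-WebServer pattern; MyAirflowEnvironment and the stream name are placeholders you would substitute from your own account:

    ```
    # List the requirements_install log streams in the WebServer log group
    aws logs describe-log-streams \
      --log-group-name airflow-MyAirflowEnvironment-WebServer \
      --log-stream-name-prefix requirements_install

    # Fetch the install output for a stream name returned above
    aws logs get-log-events \
      --log-group-name airflow-MyAirflowEnvironment-WebServer \
      --log-stream-name <stream-name-from-previous-command>
    ```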
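
    For the second point, the wheels approach from the best-practices link above boils down to something like this sketch (package names and paths are examples):

    ```
    # On a machine with internet access, download wheels for everything
    # listed in requirements.txt into a plugins/ directory
    pip3 download -r requirements.txt -d plugins
    cd plugins
    zip ../plugins.zip *
    ```

    Upload plugins.zip to the S3 bucket MWAA uses, then point requirements.txt at the bundled wheels instead of PyPI (MWAA extracts plugins.zip to /usr/local/airflow/plugins):

    ```
    --find-links /usr/local/airflow/plugins
    --no-index
    snowflake-connector-python==2.9.0
    ```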

  2. Making changes to requirements.txt can be tricky, which is why I always recommend that developers first test changes using mwaa-local-runner. With this project, you can run ./mwaa-local-env test-requirements to validate any changes locally before you submit them to your MWAA environment.
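
    As a sketch, the local flow looks something like this (the directory layout is whatever the project's README specifies, so treat the paths as examples):

    ```
    git clone https://github.com/aws/mwaa-local-runner.git
    cd mwaa-local-runner
    # Drop your requirements.txt where the local runner expects it
    cp /path/to/your/requirements.txt requirements/requirements.txt
    ./mwaa-local-env build-image        # build the local MWAA image first
    ./mwaa-local-env test-requirements  # dry-run the pip install locally
    ```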

    If you run into issues, troubleshooting them locally is a lot easier and will usually (in my experience) lead you to what you need to change.

    Typically when I need to make changes to the requirements.txt, this is what I do:

    • I take a copy of the constraints file for the version of MWAA that I am using (you will see this referenced in the requirements.txt for mwaa-local-runner) and call it something like updated-constraints.txt
    • I edit this file to update any pinned versions that I want to change (for example, if I want to use a later version of the Amazon provider package)
    • I deploy this to my S3 Dags folder
    • In the requirements.txt file that I configure within the MWAA environment, I then point to this constraints file like so:
     --constraint "/usr/local/airflow/dags/updated-constraints.txt"

    (In the above example, the "modified" constraints file that I copied to the S3 Dags folder is called updated-constraints.txt.)

    • I then add any additional libraries to the requirements.txt
    • Upload the requirements.txt to S3 and point to the new version within the MWAA environment configuration screen (which will then require a restart); the snippets after this list sketch the final file and the upload commands
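
    Putting those steps together, the requirements.txt I upload ends up looking roughly like this (the pinned versions are only examples):

    ```
    --constraint "/usr/local/airflow/dags/updated-constraints.txt"
    apache-airflow-providers-amazon==7.1.0
    snowflake-connector-python==2.9.0
    ```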
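
    The upload and environment update can also be scripted; this is a sketch with placeholder bucket, environment, and version values:

    ```
    # Copy the modified constraints file into the S3 Dags folder
    aws s3 cp updated-constraints.txt s3://my-mwaa-bucket/dags/updated-constraints.txt

    # Upload the new requirements.txt (the bucket must have versioning enabled)
    aws s3 cp requirements.txt s3://my-mwaa-bucket/requirements.txt

    # Point the environment at the new object version, which triggers the restart
    aws mwaa update-environment \
      --name MyAirflowEnvironment \
      --requirements-s3-object-version <version-id-returned-by-s3>
    ```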

    You can see how I have done this with some code: https://github.com/094459/cdk-mwaa-redshift
