
I am trying to SSH into my own AWS MWAA instance in order to install some system dependencies. I’m coming from GCP, so this is a bit different for me.

I can’t find the exact EC2 instance it is hosted on, or derive its IP, for some reason. And I don’t think SSH’ing into the VPC is going to help.

I haven’t been able to find much about this in the documentation.

Could anyone provide guidance?

4 Answers


  1. Chosen as BEST ANSWER

    I ended up calling a Lambda function with an ECR image. It seemed like the path of least resistance.
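
A minimal sketch of what that call might look like from a task, assuming the Lambda function is invoked synchronously through boto3 and returns JSON. The helper name `invoke_container_lambda`, the function name, and the payload shape are hypothetical and do not come from the answer; the client is passed in so the helper can be tried without AWS access.

```python
import json


def invoke_container_lambda(lambda_client, function_name, payload):
    """Invoke a Lambda function synchronously and return its decoded JSON result.

    `lambda_client` is expected to be a boto3 Lambda client, for example
    boto3.client("lambda"). Assumes the function returns a JSON document.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # Lambda sets "FunctionError" when the function itself raised an error.
    if "FunctionError" in response:
        details = response["Payload"].read()
        raise RuntimeError(f"{function_name} failed: {details!r}")
    return json.loads(response["Payload"].read())
```

Inside a DAG this could be called from a Python task with `boto3.client("lambda")`. The environment's execution role would presumably need permission to invoke the function.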


  2. The MWAA FAQ mentions SSH access to the environment:

    https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-faqs.html#ssh-dag

    To install dependencies in MWAA, this is the link you need: https://docs.aws.amazon.com/mwaa/latest/userguide/connections-packages.html

  3. If you are trying to install dependencies to connect to a downstream service, and you can’t use one of the two options in the first answer, you can look at using MWAA with a container orchestrator and packaging everything you need into a container image. This is what I do when I want to install large binaries or non-Python code.

    Update, 4th April:

    You can now use a startup script, which might help you. See the documentation: https://docs.aws.amazon.com/mwaa/latest/userguide/using-startup-script.html

  4. SSHing into the instance is not supported by MWAA. Part of the whole "managed" thing is that the instances are hidden from you by AWS.

    You can try hacks like running commands in a BashOperator, but they will not work well. A BashOperator task runs on whichever worker it gets assigned to, which is unpredictable. You can use one BashOperator task to touch foo and another to ls foo, and you may well see nothing, because the first ran on a different worker than the second. There is no practical way to use these hacks to actually install dependencies on Airflow.
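
The touch/ls problem can be illustrated with a toy simulation. This is not Airflow code: it assumes only that each worker has its own private filesystem, and models two workers as two temporary directories. The helper `run_on_worker` is hypothetical.

```python
import tempfile
from pathlib import Path


def run_on_worker(worker_dir, action, name="foo"):
    """Mimic a shell command running inside one worker's private filesystem."""
    if action == "touch":
        (worker_dir / name).touch()
        return []
    if action == "ls":
        return sorted(p.name for p in worker_dir.iterdir())
    raise ValueError(f"unsupported action: {action}")


# Two temporary directories stand in for two workers that share no disk.
with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    worker_a, worker_b = Path(a), Path(b)
    run_on_worker(worker_a, "touch")            # first task lands on worker A
    seen_on_b = run_on_worker(worker_b, "ls")   # second task lands on worker B
    seen_on_a = run_on_worker(worker_a, "ls")   # only worker A has the file

print(seen_on_b)  # []
print(seen_on_a)  # ['foo']
```

Anything installed by one task is in the same position as `foo` here: it exists only on the worker that happened to run that task.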

    MWAA allows you to provide a requirements.txt file containing additional Python packages you want it to install. That’s all you get.
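
As a rough sketch of how a requirements.txt file might be supplied programmatically, assuming boto3 and an environment whose source bucket has versioning enabled: the helper name `publish_requirements` and all argument values are hypothetical, and the clients are passed in so the function can be tried with stand-ins.

```python
def publish_requirements(s3_client, mwaa_client, bucket, environment_name,
                         requirements_text, key="requirements.txt"):
    """Upload a requirements file and point the MWAA environment at it.

    `s3_client` and `mwaa_client` are expected to be boto3 clients, for
    example boto3.client("s3") and boto3.client("mwaa").
    """
    put = s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=requirements_text.encode("utf-8"),
    )
    # "VersionId" is only returned when the bucket has versioning enabled.
    version_id = put["VersionId"]
    mwaa_client.update_environment(
        Name=environment_name,
        RequirementsS3Path=key,  # path relative to the source bucket
        RequirementsS3ObjectVersion=version_id,
    )
    return version_id
```

The same change can be made by hand in the console; the sketch only shows which two pieces of information the environment needs, namely the file's path in the bucket and its object version.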

    If you need more advanced dependencies, I recommend you containerize your job, and then run it with something like DockerOperator.
