I am trying to SSH into my own AWS MWAA instance in order to install some system dependencies. I’m coming from GCP, so this is a bit different for me.
I can’t find the exact EC2 instance it is hosted on, or derive the IP, for some reason. And I don’t think SSH’ing into the VPC is going to help.
I haven’t been able to find much about this in the documentation.
Could anyone provide guidance?
4 Answers
I ended up calling a Lambda function with an ECR image. It seemed like the path of least resistance.
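For anyone taking the same route, here is a minimal sketch of what the calling side might look like from a task. It assumes the Lambda function already exists and is built from your ECR image; the helper name `invoke_job`, the function name, and the payload shape are all hypothetical. The client is passed in so the helper can be exercised without AWS credentials.

```python
import json


def invoke_job(lambda_client, function_name, payload):
    """Synchronously invoke a Lambda function and return its decoded JSON result.

    `lambda_client` is expected to behave like boto3.client("lambda").
    `function_name` and `payload` are placeholders for your own values.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps(payload).encode("utf-8"),
    )
    body = response["Payload"].read()
    # Lambda sets "FunctionError" when the function itself raised an error.
    if response.get("FunctionError"):
        raise RuntimeError(
            f"{function_name} failed: {body.decode('utf-8', 'replace')}"
        )
    return json.loads(body) if body else None
```

Inside a DAG you would probably create the client with `boto3.client("lambda")` and call `invoke_job` from a `PythonOperator`. The environment's execution role would also need permission to invoke the function.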
There is a mention in the MWAA FAQ about SSH’ing into the environment:
https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-faqs.html#ssh-dag
To install dependencies in MWAA, here is the link you need: https://docs.aws.amazon.com/mwaa/latest/userguide/connections-packages.html
If you are trying to install dependencies to connect to a downstream service, and you can’t use one of the two options in the first answer, you can look at using container orchestrators from MWAA and packaging everything you need in a container image. This is what I do when I want to install large binaries or non-Python code.
Update, 4th April:
You can now use a startup script, which might help you. Check out the documentation: https://docs.aws.amazon.com/mwaa/latest/userguide/using-startup-script.html
SSH into the instance is not supported by MWAA. Part of the whole "managed" thing is that the instances are hidden from you by AWS.
You can do hacks like running stuff in `BashOperator`, but they will not work well. `BashOperator` runs on whatever worker gets assigned that task, which is unpredictable. You can use `BashOperator` to `touch foo`, and then use it again to `ls foo`, and you will see nothing because the first run was on a different worker than the second. There is not a practical way to use these hacks to actually install dependencies on Airflow.
MWAA allows you to provide a `requirements.txt` file containing additional Python packages you want it to install. That’s all you get.
If you need more advanced dependencies, I recommend you containerize your job, and then run it with something like `DockerOperator`.
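If you want to see the worker behaviour described above for yourself, a rough sketch of a probe DAG follows. It assumes Airflow 2.x; the DAG id, task ids, and marker path are made up for illustration. It uses an absolute path because `BashOperator` runs each command in a temporary working directory by default, so a relative `foo` would be gone after the task even on the same worker.

```python
from datetime import datetime

# Hypothetical marker path; absolute so it survives BashOperator's temp cwd.
MARKER = "/tmp/mwaa_worker_probe"

# Each command prints the worker's hostname so the task logs show where it ran.
WRITE_CMD = f"touch {MARKER} && echo wrote on $(hostname)"
READ_CMD = (
    f"echo reading on $(hostname); "
    f"ls -l {MARKER} || echo 'marker not found on this worker'"
)


def build_dag():
    """Build a two-task DAG that writes a file, then looks for it.

    Airflow is imported here only so this sketch can be imported without
    Airflow installed; a normal DAG file would import it at the top.
    """
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="worker_probe",            # hypothetical name
        start_date=datetime(2023, 1, 1),
        schedule_interval=None,           # trigger manually (Airflow 2.x argument)
        catchup=False,
    ) as dag:
        write = BashOperator(task_id="write_marker", bash_command=WRITE_CMD)
        read = BashOperator(task_id="read_marker", bash_command=READ_CMD)
        write >> read                     # read runs after write, possibly elsewhere
    return dag
```

In a real DAG file you would add `dag = build_dag()` at module level so the scheduler picks it up. Comparing the hostnames in the two task logs across a few runs should show whether the tasks landed on the same worker.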