I am using Airflow and PostgreSQL in Docker.
So I set up a PostgreSQL database on port 5433. Container (384eaa7b6efb). This is where I have my data which I want to fetch with my dag in Airflow.
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS
PORTS NAMES
384eaa7b6efb postgres "docker-entrypoint.s…" 4 hours ago Up 4 hours
0.0.0.0:5433->5432/tcp test-instance
421d443540fb apache/airflow:2.0.1 "/usr/bin/dumb-init …" 18 hours ago Up 4 hours (healthy)
0.0.0.0:8080->8080/tcp airflow_docker-airflow-webserver-1
ff4bea4f16dd apache/airflow:2.0.1 "/usr/bin/dumb-init …" 18 hours ago Up 4 hours
8080/tcp airflow_docker-airflow-scheduler-1
4cead3ee6667 postgres:13 "docker-entrypoint.s…" 18 hours ago Up 4 hours (healthy)
5432/tcp airflow_docker-postgres-1
8bb2cefd456e apache/airflow:2.2.0 "/usr/bin/dumb-init …" 4 days ago Up 5 hours (healthy)
0.0.0.0:5555->5555/tcp, 8080/tcp airflow_docker-flower-1
5c4b96d9c5a0 apache/airflow:2.2.0 "/usr/bin/dumb-init …" 4 days ago Up 5 hours (healthy)
8080/tcp airflow_docker-airflow-worker-1
3442eae78844 apache/airflow:2.2.0 "/usr/bin/dumb-init …" 4 days ago Restarting (1) 11 seconds ago airflow_docker-airflow-triggerer-1
7e945051435f redis:latest "docker-entrypoint.s…" 4 days ago Up 5 hours (healthy)
6379/tcp airflow_docker-redis-1
I created a new server connection on PgAdmin to access my database:
I can see my data and everything is fine.
So now I created this dag to query the data from that database:
def queryPostgresql():
conn_string="dbname='postgres' host='localhost' port='5433' user='postgres' password='admin'"
conn=db.connect(conn_string)
df=pd.read_sql("select name,city from public.users",conn)
df.to_csv('postgresqldata.csv')
print("-------Data Saved------")
when i run the dag in airflow,
[2021-10-17 14:37:16,485] {taskinstance.py:1455} ERROR - could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5433?
What I am doing wrong here? Is it the way i should add a connection?
2
Answers
Short Answer
Change the host to;
host.docker.internal
.Long Answer
This depends on the Os you are using. In order to access the host’s network from within a container you will need to use the host’s IP address in the docker. Conveniently, on Windows and Max this is resolved using the domain
host.docker.internal
from within the container. As specified in docker’s documentation:There is also a workaround for this in linux which has been answered in
What is linux equivalent of "host.docker.internal"
I have almost the same configurations. If all your containers are within the same network, you can access your database by
<postgres-container-name>:<port>
(in your casetest-instance:5433
)