I’ve set up a Docker Swarm with Traefik v2 as the reverse proxy, and can access the dashboard with no issues.
I cannot get a response from any service that runs on a different node from the one Traefik is running on. I’ve been testing and researching, and I presume it’s a network issue of some type.
I did some quick testing with an empty Nginx image: I deployed it as another stack and got a response when the container was on the same node as Traefik. Other stacks on the swarm that deploy across multiple nodes (not including the Traefik node) are able to communicate with each other without issues.
Here is the test stack to provide some context of what I was using.
version: '3.8'
services:
  test:
    image: nginx:latest
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role==worker
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=uccser-dev-public"
        - "traefik.http.services.test.loadbalancer.server.port=80"
        - "traefik.http.routers.test.service=test"
        - "traefik.http.routers.test.rule=Host(`TEST DOMAIN`) && PathPrefix(`/test`)"
        - "traefik.http.routers.test.entryPoints=web"
    networks:
      - uccser-dev-public
networks:
  uccser-dev-public:
    external: true
The uccser-dev-public network is an overlay network across all nodes, with no encryption.
If I add a constraint that places the service on the Traefik node, requests work with no issues. However, if I switch it to a different node, I get the Traefik 404 page.
The Traefik dashboard is showing it sees the service.
However the access logs show the following:
proxy_traefik.1.6fbx58k4n3fj@SWARM_NODE | IP_ADDRESS - - [21/Jul/2021:09:03:02 +0000] "GET / HTTP/2.0" - - "-" "-" 1430 "-" "-" 0ms
It’s just blank, and I don’t know where to proceed from here. The normal log shows no errors that I can see.
Traefik stack file:
version: '3.8'
x-default-opts:
  &default-opts
  logging:
    options:
      max-size: '1m'
      max-file: '3'
services:
  # Custom proxy to secure docker socket for Traefik
  docker-socket:
    <<: *default-opts
    image: tecnativa/docker-socket-proxy
    networks:
      - traefik-docker
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      NETWORKS: 1
      SERVICES: 1
      SWARM: 1
      TASKS: 1
    deploy:
      placement:
        constraints:
          - node.role == manager
  # Reverse proxy for handling requests
  traefik:
    <<: *default-opts
    image: traefik:2.4.11
    networks:
      - uccser-dev-public
      - traefik-docker
    volumes:
      - traefik-public-certificates:/etc/traefik/acme/
    ports:
      - target: 80 # HTTP
        published: 80
        protocol: tcp
        mode: host
      - target: 443 # HTTPS
        published: 443
        protocol: tcp
        mode: host
    command:
      # Docker
      - --providers.docker
      - --providers.docker.swarmmode
      - --providers.docker.endpoint=tcp://docker-socket:2375
      - --providers.docker.exposedByDefault=false
      - --providers.docker.network=uccser-dev-public
      - --providers.docker.watch
      - --api
      - --api.dashboard
      - --entryPoints.web.address=:80
      - --entryPoints.websecure.address=:443
      - --log.level=DEBUG
      - --global.sendAnonymousUsage=false
    deploy:
      placement:
        constraints:
          - node.role==worker
      # Dynamic Configuration
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.dashboard.rule=Host(`SWARM_NODE`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`))"
        - "traefik.http.routers.dashboard.service=api@internal"
        - "traefik.http.services.dummy-svc.loadbalancer.server.port=9999" # Dummy service for Swarm port detection. The port can be any valid integer value.
volumes:
  traefik-public-certificates: {}
networks:
  # This network is used by other services
  # to connect to the proxy.
  uccser-dev-public:
    external: true
  # This network is used for Traefik to talk to
  # the Docker socket.
  traefik-docker:
    driver: overlay
    driver_opts:
      encrypted: 'true'
Any ideas?
2 Answers
Further testing showed that other services were working on different nodes, so I figured it must be an issue with my application. It turned out my Django application still had a bunch of HTTPS-related settings configured for its previous hosting location. Because the incoming requests didn't satisfy those settings, Django denied them before they were processed. I also needed to lower the gunicorn (WSGI) logging level to see more information.
In summary, Traefik and Swarm were fine.
Another reason for this can be that Docker Swarm ports haven’t been opened on all of the nodes. If you’re using UFW that means running the following on every machine participating in the swarm:
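The commands from this answer are not reproduced here. As a rough sketch, assuming the ports Docker documents for swarm mode (2377/tcp for cluster management, 7946/tcp and 7946/udp for node-to-node communication, 4789/udp for overlay network traffic), the UFW rules could look like the output of the script below. The function name `print_swarm_ufw_rules` is only illustrative, and the script prints the commands rather than running them so they can be reviewed first. Docker's documentation also notes that encrypted overlay networks (such as `traefik-docker` in the question) need IP protocol 50 (ESP) allowed between nodes, which this sketch does not cover.

```shell
#!/bin/sh
# Assumed swarm ports, taken from Docker's swarm documentation:
#   2377/tcp            cluster management (manager nodes)
#   7946/tcp, 7946/udp  communication among nodes
#   4789/udp            overlay network (VXLAN) traffic
SWARM_PORTS="2377/tcp 7946/tcp 7946/udp 4789/udp"

# Print one "ufw allow" command per port instead of executing it.
print_swarm_ufw_rules() {
    for port in $SWARM_PORTS; do
        echo "ufw allow $port"
    done
}

print_swarm_ufw_rules
```

After checking the printed commands, they would be run with root privileges on each node, followed by `ufw reload`. Restricting the rules to the other swarm nodes' addresses instead of allowing any source is likely the safer choice on public hosts.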