My project uses a NATS Streaming cluster. I want to bring it up on three servers using Docker Swarm, but only two of the three nodes start successfully.
(docker service ls output)
A non-working node outputs the following log:
here
docker-stack.yml:
version: "3.8"
networks:
  network:
    driver: overlay
    attachable: true
services:
  nats-streaming-1:
    logging:
      options:
        max-size: "100m"
    command:
      - "-sc"
      - "/etc/stan.conf"
      - "--cluster"
      - "nats://0.0.0.0:6222"
      - "--cluster_id"
      - $NATS_CLUSTER_NAME
      - "--clustered"
      - "--cluster_bootstrap"
      - "--cluster_log_path"
      - /data/log
      - "--cluster_node_id"
      - nats-streaming-1
      - "--cluster_raft_logging"
      - "--debug"
      - "--dir"
      - /data/msg
      - "--http_port"
      - "8222"
      - "--port"
      - "4222"
      - "--store"
      - file
      - "--stan_debug"
      - "--hb_interval"
      - $NATS_HB_INTERVAL
      - "--hb_fail_count"
      - "$NATS_HB_FAIL_COUNT"
      - "--hb_timeout"
      - "$NATS_HB_TIMEOUT"
      - "-mc"
      - "$NATS_STAN_MAX_CHNS"
      - "-mm"
      - "$NATS_STAN_MAX_MSGS"
      - "-mb"
      - "$NATS_MAX_BYTES"
    image: "nats-streaming:0.25.5"
    networks:
      network:
    ports:
      - "$NATS_PORT1:4222"
      - "$NATS_HTTP_PORT1:8222"
    volumes:
      - "nats-streaming-1:/data"
      - "./stan.conf:/etc/stan.conf"
    deploy:
      placement:
        max_replicas_per_node: 1
        constraints: [node.hostname == alex-swarm-3a]
  nats-streaming-2:
    logging:
      options:
        max-size: "100m"
    command:
      - "-sc"
      - "/etc/stan.conf"
      - "--cluster"
      - "nats://0.0.0.0:6222"
      - "--cluster_id"
      - $NATS_CLUSTER_NAME
      - "--clustered"
      - "--cluster_log_path"
      - /data/log
      - "--cluster_node_id"
      - nats-streaming-2
      - "--cluster_raft_logging"
      - "--debug"
      - "--dir"
      - /data/msg
      - "--http_port"
      - "8222"
      - "--port"
      - "4222"
      - "--store"
      - file
      - "--stan_debug"
      - "--routes"
      - "nats://nats-streaming-1:6222"
      - "--hb_interval"
      - $NATS_HB_INTERVAL
      - "--hb_fail_count"
      - "$NATS_HB_FAIL_COUNT"
      - "--hb_timeout"
      - "$NATS_HB_TIMEOUT"
      - "-mc"
      - "$NATS_STAN_MAX_CHNS"
      - "-mm"
      - "$NATS_STAN_MAX_MSGS"
      - "-mb"
      - "$NATS_MAX_BYTES"
    image: "nats-streaming:0.25.5"
    ports:
      - "$NATS_PORT2:4222"
      - "$NATS_HTTP_PORT2:8222"
    volumes:
      - "nats-streaming-2:/data"
      - "/home/master/stan.conf:/etc/stan.conf"
    networks:
      network:
    deploy:
      placement:
        max_replicas_per_node: 1
        constraints: [node.hostname == alex-swarm-3b]
  nats-streaming-3:
    logging:
      options:
        max-size: "100m"
    command:
      - "-sc"
      - "/etc/stan.conf"
      - "--cluster"
      - "nats://0.0.0.0:6222"
      - "--cluster_id"
      - $NATS_CLUSTER_NAME
      - "--clustered"
      - "--cluster_log_path"
      - /data/log
      - "--cluster_node_id"
      - nats-streaming-3
      - "--cluster_raft_logging"
      - "--debug"
      - "--dir"
      - /data/msg
      - "--http_port"
      - "8222"
      - "--port"
      - "4222"
      - "--store"
      - file
      - "--stan_debug"
      - "--routes"
      - "nats://nats-streaming-1:6222"
      - "--hb_interval"
      - $NATS_HB_INTERVAL
      - "--hb_fail_count"
      - "$NATS_HB_FAIL_COUNT"
      - "--hb_timeout"
      - "$NATS_HB_TIMEOUT"
      - "-mc"
      - "$NATS_STAN_MAX_CHNS"
      - "-mm"
      - "$NATS_STAN_MAX_MSGS"
      - "-mb"
      - "$NATS_MAX_BYTES"
    image: "nats-streaming:0.25.5"
    networks:
      - network
    ports:
      - "$NATS_PORT3:4222"
      - "$NATS_HTTP_PORT3:8222"
    volumes:
      - "nats-streaming-3:/data"
      - "/home/master/stan.conf:/etc/stan.conf"
    deploy:
      placement:
        max_replicas_per_node: 1
        constraints: [node.hostname == alex-swarm-3c]
volumes:
  nats-streaming-1:
  nats-streaming-2:
  nats-streaming-3:
The documentation says the following: "This is because the server recovered the streaming state (as pointed by -dir and located in the mounted volume), but did not recover the RAFT specific state that is by default stored in a directory named after your cluster id, relative to the current directory starting the executable. In the context of a container, this data will be lost after the container is stopped."
I mounted a volume at /data and set --cluster_log_path to /data/log, but it didn't help.
2 Answers
The problem was that many other services started alongside NATS and began writing data to it before all of the cluster nodes were up.
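One way to avoid that race is to have the writing services wait until every node accepts connections before they start publishing. Below is a minimal sketch of such a gate, not the setup I actually used: the helper names (port_is_open, wait_for_all) and the NODES list are made up for illustration, and the hosts and port are assumed from the service names and the --port value in the stack file above. Note that an open client port only shows the process is listening; it does not prove the cluster has elected a leader, so treat this as a first-line check.

```python
import socket
import time

# Assumed from the stack file: service names on the overlay network, client port 4222.
NODES = [
    ("nats-streaming-1", 4222),
    ("nats-streaming-2", 4222),
    ("nats-streaming-3", 4222),
]


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def wait_for_all(addresses, deadline_s=60.0, interval_s=1.0):
    """Poll until every (host, port) accepts connections or the deadline passes.

    Returns the addresses that are still unreachable (empty list on success).
    """
    end = time.monotonic() + deadline_s
    pending = list(addresses)
    while pending:
        pending = [addr for addr in pending if not port_is_open(*addr)]
        if not pending or time.monotonic() >= end:
            break
        time.sleep(interval_s)
    return pending
```

A writer's entrypoint could call wait_for_all(NODES) and refuse to start publishing while the returned list is non-empty.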
Note that NATS Streaming is end-of-life. NATS now has its own persistence layer, called JetStream: https://docs.nats.io/nats-concepts/jetstream. Here is a webinar that discusses the transition: https://youtu.be/yKI9YmLx_8A and here is a migration example: https://natsbyexample.com/examples/operations/stan2js/cli