I have set up a Laravel application in a Docker container, hosted on an AWS ECS cluster running behind an ALB.

So far the application is up and running as expected (e.g. sessions are stored in memcached and working, static assets are in an S3 bucket, etc.).

Right now I have just one problem, with stability, and I am not quite sure where exactly it lies. When I hit my URL / website, it sometimes (randomly) returns a 502/503 HTTP error. When this happens I have to wait about a minute or two before the app returns a 200 HTTP code again.

Here’s the result of tailing the log in my Docker container (i.e. the nginx log):

(screenshot of the nginx log; image not available)

At this point I am totally lost and not sure where else I should check. I’ve tried the following:

  1. Run it locally, with the same Docker / nginx setup >> works just fine.
  2. Run it without the ALB (i.e. using just 1 EC2 instance) >> similar problem.
  3. Run it using the ALB on 2 different EC2 instance types (i.e. t2.small and micro) >> both have a similar problem.
  4. Run it using the ALB on just 1 EC2 instance >> similar problem.

2 Answers


  1. I have had a similar issue in the past, for one of a couple of possible reasons:

    • Health checks configured for the ALB: the ALB waits for the configured number of checks to go green (e.g. every 30 seconds hit an endpoint and wait for a 200 for 4/5 times). During this “unhealthy phase” the instance may be designated offline. This happens most often immediately after a restart or deployment, or when an instance goes unhealthy.
    • DNS within NGINX: if the DNS records of the downstream service that NGINX proxies to have changed, NGINX may have cached the old record (either according to the TTL or for much longer, depending on your configuration) and therefore be unable to connect to the downstream.
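
As a rough illustration of the health-check point, the arithmetic can be sketched as follows. This assumes the "4/5 times" in the example means a healthy threshold of 4 or 5 consecutive checks, and it ignores check timeouts and deregistration delays; `seconds_until_healthy` is a hypothetical helper, not an AWS API.

```python
def seconds_until_healthy(interval_s: int, healthy_threshold: int) -> int:
    """Rough time a target stays out of rotation while it re-earns 'healthy'.

    Assumption: the target must pass `healthy_threshold` consecutive checks
    that are spaced `interval_s` seconds apart.
    """
    return interval_s * healthy_threshold


# The 30-second interval comes from the example above; the thresholds are
# the two readings of "4/5 times".
for threshold in (4, 5):
    print(threshold, "checks:", seconds_until_healthy(30, threshold), "seconds")
```

With numbers in that range, an outage of a couple of minutes after each failed check would be consistent with what the question describes.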

    To debug this fully, it is worth determining whether the 502/503 is coming from the ALB or from NGINX. You may be able to determine this from the access log of the ALB, or from /var/log/nginx/access.log and /var/log/nginx/error.log in the container.

    It may also help to check whether the error response had a body.
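
One quick way to tell the two sources apart from the client side is to look at the `Server` response header. The sketch below assumes that error pages generated by the ALB itself carry a header such as `Server: awselb/2.0`, while responses passed through from the container keep NGINX's own header; that may not hold for every configuration, so treat the result as a hint. `classify_error_source` and `probe` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from typing import Optional


def classify_error_source(status: int, server_header: Optional[str]) -> str:
    """Guess which hop produced a gateway error, based on the Server header."""
    if status not in (502, 503, 504):
        return "not a gateway error"
    server = (server_header or "").lower()
    if server.startswith("awselb"):  # assumption: ALB-generated response
        return "ALB"
    if "nginx" in server:            # assumption: passed through from the container
        return "nginx"
    return "unknown"


def probe(url: str, timeout: float = 5.0) -> str:
    """Request the URL once and report the status and the likely error source."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, server = resp.status, resp.headers.get("Server")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status and headers are on the exception.
        status, server = err.code, err.headers.get("Server")
    return f"{status}: {classify_error_source(status, server)}"
```

Calling `probe("https://your-site.example/")` (a placeholder URL) in a loop while the errors occur should show whether they cluster on one side.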

  2. According to your logs, nginx is answering 401 Unauthorized to the ALB health check request. You have to answer 200 OK on the / endpoint, or configure a different one, such as /ping, in your ALB target group.

    To check the health of your targets using the console

    1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

    2. On the navigation pane, under LOAD BALANCING, choose Target Groups.

    3. Select the target group.

    4. On the Targets tab, the Status column indicates the status of each target.

    5. If the status is any value other than Healthy, view the tooltip for more information.

    More info: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-health-checks.html
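
The same check can be scripted instead of done in the console. This is a minimal sketch, assuming the third-party `boto3` package is installed and AWS credentials are configured; `unhealthy_targets` and `fetch_target_health` are hypothetical helper names, and the target group ARN is something you supply yourself.

```python
from typing import Any, Dict, List


def unhealthy_targets(response: Dict[str, Any]) -> List[str]:
    """Summarise every target whose state is not 'healthy'.

    `response` is expected to have the shape returned by the ELBv2
    DescribeTargetHealth call.
    """
    lines = []
    for desc in response.get("TargetHealthDescriptions", []):
        health = desc["TargetHealth"]
        if health["State"] == "healthy":
            continue
        target = desc["Target"]
        parts = [
            f'{target["Id"]}:{target.get("Port", "?")}',
            health["State"],
            health.get("Reason", ""),
            health.get("Description", ""),
        ]
        lines.append(" ".join(p for p in parts if p))
    return lines


def fetch_target_health(target_group_arn: str) -> Dict[str, Any]:
    """Fetch target health via boto3 (imported lazily; needs credentials)."""
    import boto3  # third-party dependency, assumed to be installed

    client = boto3.client("elbv2")
    return client.describe_target_health(TargetGroupArn=target_group_arn)
```

Running `unhealthy_targets(fetch_target_health(arn))` with your own ARN lists each failing target together with the reason the ALB reports, which should make a status-code mismatch like the 401 above easy to spot.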
