
I have a PHP project, and the Nginx configuration before releasing a new version is:

        location ~ \.php$ {
            root   /data/web/php-project-v1.0.0;
            fastcgi_pass   127.0.0.1:9000;
            fastcgi_index  index.php;
            include        fastcgi_params;
            fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
            fastcgi_param  SCRIPT_NAME      $fastcgi_script_name;
        }

While the server was still handling a large volume of requests, I modified the Nginx configuration to:

        location ~ \.php$ {
            root   /data/web/php-project-v1.1.0;
            fastcgi_pass   127.0.0.1:9000;
            fastcgi_index  index.php;
            include        fastcgi_params;
            fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
            fastcgi_param  SCRIPT_NAME      $fastcgi_script_name;
        }

Note that I pointed the root directive to a new directory, and then executed nginx -s reload.

Can this achieve zero-downtime deployment? What are the advantages, disadvantages, and points to pay attention to, especially during high-traffic periods?

I have tested this approach, and it at least doesn’t cause server-side 500 errors. After modifying the configuration, requests that were not yet completed still return results based on the old project’s logic once their processing finishes. However, from my observation, this change doesn’t take effect immediately after executing nginx -s reload. It seems the logic of the old project persists for a while. I hope developers familiar with Nginx can explain this phenomenon and answer my question from a deeper, underlying perspective. Additionally, while searching for this issue on Google, I noticed that very few people use this method. Why don’t more people adopt this technique for automated deployment, especially in scenarios where there aren’t many redundant hosts? Is it because they haven’t thought of it, or are there potential risks involved?

2 Answers


  1. "Can this achieve zero-downtime deployment?"

    Yep, it can! I’ve done it before. My setup was similar to yours.

    At a company I used to work at, we did zero-downtime deployments using Deployer (PHP), which would create a new directory (usually numbered 1, 2, 3, 4, and so on), copy the code in, then update the live/ symlink to point to the new version. We would then run systemctl reload nginx, followed by a custom script that cleared the opcache.
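
    The flow looked roughly like this (release numbers, paths, and the opcache-reset endpoint are illustrative, not our exact scripts):

        # copy the new release into its own numbered directory
        mkdir -p /data/web/releases/5
        cp -R build/. /data/web/releases/5/

        # atomically repoint the live/ symlink at the new release
        ln -sfn /data/web/releases/5 /data/web/live

        # validate the config, then reload gracefully: old workers finish
        # their in-flight requests while new workers pick up the new config
        nginx -t && systemctl reload nginx

        # clear the opcache (ours was a small internal endpoint that called opcache_reset())
        curl -s http://127.0.0.1/internal/opcache-reset.php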

    "After modifying the configuration, requests that were not yet completed still return results based on the old project’s logic once their processing finishes. However, from my observation, this change doesn’t take effect immediately after executing nginx -s reload. It seems the logic of the old project persists for a while."

    The opcache still has the old code loaded, and it will take some time to expire depending on your settings (opcache.validate_timestamps and opcache.revalidate_freq). Even after tuning those, it may still not update as expected.
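
    For reference, these are the php.ini directives in play (values shown are illustrative, not your actual settings):

        ; with timestamp validation on, the opcache re-checks each file on disk
        ; at most once every revalidate_freq seconds; 0 means check on every request
        opcache.validate_timestamps=1
        opcache.revalidate_freq=0

        ; production setups often disable the checks entirely, in which case new code
        ; is only picked up after an explicit opcache_reset() or a PHP-FPM reload
        ; opcache.validate_timestamps=0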

    This method is common, but I have two concerns. First, on macOS Sequoia 15.2, symlinks don’t seem to work with Nginx: I set root in the Nginx config to a symlink path, updated the symlink with ln -sfT to point to a different project, and reloaded Nginx (nginx -s reload), but it still serves the old PHP code. Second, this works on Linux (used by my DevOps colleague), but could it cause issues under high traffic or in edge cases?

    There is an "issue" where the opcache doesn’t pick up the updated files even after clearing the cache, but there is a workaround: use $realpath_root instead of $document_root (https://stackoverflow.com/a/23904770/6055465; that question also shows an example script for clearing the opcache). I don’t fully understand how it all worked and no longer have access to the code, since it has been a few years and I no longer work for that company.
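
    Applied to your config, the workaround would look something like this (the /data/web/live symlink path is an example, not something from your setup):

        location ~ \.php$ {
            # root points at a symlink that is swapped on each deploy
            root   /data/web/live;
            fastcgi_pass   127.0.0.1:9000;
            fastcgi_index  index.php;
            include        fastcgi_params;
            # $realpath_root resolves the symlink, so PHP-FPM (and the opcache)
            # see the real release directory instead of the symlink path
            fastcgi_param  SCRIPT_FILENAME  $realpath_root$fastcgi_script_name;
            fastcgi_param  DOCUMENT_ROOT    $realpath_root;
        }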

    "Why don’t more people adopt this technique for automated deployment, especially in scenarios where there aren’t many redundant hosts? Is it because they haven’t thought of it, or are there potential risks involved?"

    I think the main reason it is not common is that companies which require zero-downtime deployments have only one or two people figure it out, implement it once, and then it is never touched or taught again.

    Or they use a different method of zero-downtime deployment, such as containers plus Kubernetes, or load balancers where each host is taken out of rotation, connections are allowed to finish, the host is updated, and it is added back into the rotation. These two methods are arguably superior because you can update the operating system and other things without downtime, not just the application.

    "What are the advantages, disadvantages, and points to pay attention to, especially during high-traffic periods?"

    • It is complicated, not well documented, and requires trial and error (I was planning to write a blog post explaining it at one point, but never got around to it).
    • When you reset the opcache under heavy traffic, CPU usage will spike for a couple of minutes; my company saw 20-30% CPU load spikes. You need to be able to absorb that, or figure out how to preload the cache (see the sketch after this list).
    • You do get zero-downtime deployments, which is wonderful!
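
    If you are on PHP 7.4 or later, one way to pre-warm is opcache preloading; a minimal sketch, assuming a hypothetical preload script path:

        ; PHP-FPM compiles the listed files into the opcache at startup,
        ; so the first requests after a reload do not pay the compile cost
        opcache.preload=/data/web/live/config/preload.php
        opcache.preload_user=www-data

    Preloading happens only when PHP-FPM starts, so it pairs with a php-fpm reload after each deploy rather than with opcache_reset() alone.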

    A little long, but hopefully that helps you achieve your goal and also gives some explanations to your "why" questions.

  2. Use two nodes and balance requests between them, so that old requests keep executing against the old root while new ones go to the new one, without breaking existing sessions.
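
    A minimal sketch of that idea with Nginx itself as the balancer (addresses, ports, and the upstream name are made up):

        upstream php_app {
            ip_hash;                 # keep each client on the same node so its session survives
            server 10.0.0.11:8080;
            server 10.0.0.12:8080;   # mark this "down", deploy to it, then swap roles
        }

        server {
            listen 80;
            location / {
                proxy_pass http://php_app;
            }
        }

    Take one node out of rotation (mark it down and reload), let its in-flight requests drain, deploy the new root there, bring it back, and repeat on the other node.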
