I have a PHP project, and the Nginx configuration before releasing a new version is:
location ~ \.php$ {
    root /data/web/php-project-v1.0.0;
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
}
The server was still handling a large volume of requests. At this point, I modified the Nginx configuration:
location ~ \.php$ {
    root /data/web/php-project-v1.1.0;
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
}
Note that I have pointed the root directive to a new directory. Then I executed nginx -s reload.

Can this achieve zero-downtime deployment? What are the advantages, disadvantages, and points to pay attention to, especially during high-traffic periods?
I have tested this approach, and it at least doesn't cause server-side 500 errors. After the configuration change, requests that were already in flight still return results based on the old project's logic once they finish processing. However, from my observation, the change does not take effect immediately after executing nginx -s reload; the old project's logic seems to persist for a while. I hope developers familiar with Nginx can explain this phenomenon and answer my question from a deeper, underlying perspective.

Additionally, while searching Google for this issue, I noticed that very few people use this method. Why don't more people adopt this technique for automated deployment, especially in scenarios without many redundant hosts? Is it because they haven't thought of it, or are there potential risks involved?
2 Answers
Yep, it can! I’ve done it before. My setup was similar to yours.
At a company I used to work at, we did zero-downtime deployments using Deployer (PHP), which would create a new numbered release directory (1, 2, 3, 4, etc.), copy the code in, then update the live/ symlink to point to the new version. Then we would run systemctl reload nginx, followed by a custom script that cleared the opcache.

The opcache still has the old code loaded, and it will take some time for it to expire depending on the settings (opcache.revalidate_freq and opcache.validate_timestamps). Even after tuning those, it may still not update as expected.
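Here is a minimal sketch of that release flow, assuming a hypothetical layout of /data/web/releases/<n> with a live symlink at /data/web/live and a small opcache-reset endpoint served by the same PHP-FPM pool (the paths and the endpoint are illustrative, not taken from the original setup):

    # paths and the reset endpoint below are assumptions for illustration
    NEW=/data/web/releases/2

    # atomic symlink swap: create the new link beside the old one, then rename over it
    ln -sfn "$NEW" /data/web/live.tmp
    mv -Tf /data/web/live.tmp /data/web/live

    # gracefully reload Nginx workers (equivalent to nginx -s reload)
    sudo systemctl reload nginx

    # reset the opcache through PHP-FPM itself; calling opcache_reset() from the CLI
    # would only clear the separate CLI opcache, not the FPM one
    curl -fsS http://127.0.0.1/opcache-reset.php

The opcache-reset.php endpoint would simply call opcache_reset() and should be restricted to localhost.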
There is an "issue" with the opcache where it doesn't pick up the updated files despite clearing the cache, but there is a workaround: use $realpath_root instead of $document_root (https://stackoverflow.com/a/23904770/6055465; the question there also has an example script for clearing the opcache). I don't fully understand how it all worked and no longer have access to the code, as it has been a few years and I no longer work for that company.

I think the main reason this approach isn't common is that companies which require zero-downtime deployments have only one or two people figure it out, implement it once, and then it is never touched or taught again.
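To make the $realpath_root workaround concrete, here is a hedged sketch of how the PHP location block might look when root points at a symlink such as /data/web/live (the path is an assumption). Because $realpath_root resolves the symlink on each request, PHP-FPM sees the real release directory, so the opcache caches each release under its own paths instead of reusing entries from the previous one:

    location ~ \.php$ {
        # root points at the live symlink; $realpath_root resolves it to the current release
        root /data/web/live;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
        fastcgi_param DOCUMENT_ROOT $realpath_root;
    }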
Or they use a different method of zero-downtime deployment, such as containers + Kubernetes, or load balancers: take each host out of the rotation, let its connections finish, update it, and add it back. These two methods are arguably superior because you can also update the operating system and other things without downtime, not just the application.
A little long, but hopefully that helps you achieve your goal and also answers some of your "why" questions.
Use two nodes and balance requests between them, so old requests continue executing against the old root while new ones go to the new one, without breaking existing sessions.
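A minimal sketch of that idea, assuming two hypothetical backend nodes (the addresses are illustrative, not from the answer): drain one node by marking it down, deploy the new version to it, then repeat for the other.

    upstream php_backend {
        # ip_hash;               # uncomment for sticky sessions if session state lives on each node
        server 10.0.0.11:80;     # node A
        server 10.0.0.12:80;     # node B; add `down` here to drain it while deploying to it
    }

    server {
        listen 80;
        location / {
            proxy_pass http://php_backend;
        }
    }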