We are doing some tests with NGINX as a reverse proxy in front of two NGINX sample web servers. The tool used in our tests is wrk. The web servers' configurations are very simple: each serves a single static page (similar to the default welcome page), and the NGINX proxy distributes traffic to them in round-robin fashion. The aim of the test is to measure the impact of different operating systems running the NGINX reverse proxy on the results (we are doing this with CentOS 7, Debian 10 and FreeBSD 12).
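For reference, an invocation along these lines matches the thread count, connection count and duration in the output below (the proxy URL is a placeholder for our test setup):

    wrk -t10 -c400 -d1m --latency http://proxy.example.test/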
In our results (except on FreeBSD), we see a lot of non-2xx or 3xx responses:
  10 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     74.50ms  221.36ms    1.90s   91.31%
    Req/Sec       5.88k     4.56k   16.01k   43.96%
  Latency Distribution
     50%    4.68ms
     75%    7.71ms
     90%  196.01ms
     99%     1.03s
  3509526 requests in 1.00m, 1.11GB read
  Socket errors: connect 0, read 0, write 0, timeout 875
  Non-2xx or 3xx responses: 3285230
Requests/sec:  58431.20
Transfer/sec:     18.96MB
As you can see, about 90 percent of the responses fall into this category.
I have tried several different NGINX logging configurations to "catch" some of these errors (for example, something along the lines of the sketch below), but all I get is 200 OK in the log.
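This is a hypothetical variant rather than the exact configuration from the test; $status and $upstream_status are standard nginx variables, the latter recording what the backend actually returned:

    log_format upstream_debug '$remote_addr [$time_local] "$request" '
                              'status=$status upstream_status=$upstream_status '
                              'upstream_addr=$upstream_addr request_time=$request_time';
    access_log /var/log/nginx/access.log upstream_debug;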
How can I get more information about these responses?
Answers
After some research, I was able to track this down with tcpdump on the proxy node. While wrk was running against the proxy, I ran tcpdump like below:
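For illustration, a capture along these lines shows which status lines the proxy actually sends back to wrk (the interface and port are assumptions; adjust to your setup):

    # capture full packets on all interfaces, print payloads as ASCII,
    # and count the HTTP status lines seen on the client-facing port
    tcpdump -i any -nn -s 0 -A 'tcp port 80' | grep -o 'HTTP/1.1 [0-9]\{3\}' | sort | uniq -c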
The result, though quite big, had some interesting insights:
The reason I could not see the errors in the nginx logs is that, when acting as a reverse proxy, nginx only logs these results in debug mode, and debug logging itself makes processing so slow that the error above no longer surfaces. Using tcpdump, I could find out from the packets themselves what the issue was.
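For completeness, debug logging is enabled via the error_log level (this requires an nginx binary built with --with-debug; the log path is a placeholder):

    error_log /var/log/nginx/error.log debug;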
A 502 means the proxy was not able to connect to the backend. This could be due to resource exhaustion on either the proxy or the backend server. If your CPU is not saturated, you are most likely hitting some artificial kernel limit. I have seen file descriptor limits, TCP connection limits, accept queue overflows and firewall connection-tracking tables cause this. dmesg sometimes has useful messages.
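A few quick checks along those lines on the Linux hosts (CentOS/Debian; FreeBSD exposes different counters, and the nginx process name is an assumption):

    dmesg | tail -n 50                                 # recent kernel messages (conntrack drops, OOM, etc.)
    grep 'open files' /proc/$(pgrep -o nginx)/limits   # file descriptor limit of the nginx master process
    ss -s                                              # overall socket and TCP usage summary
    netstat -s | grep -i 'listen'                      # listen (accept) queue overflows and SYN drops
    sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max   # firewall connection tracking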
Usually adding keepalive connections to the backend helps: https://nginx.org/en/docs/http/ngx_http_upstream_module.html#keepalive
Try something like this…
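A minimal sketch of that setup, following the keepalive documentation linked above (the backend addresses and ports are placeholders for your two web servers):

    upstream backend {
        server 192.0.2.11:80;
        server 192.0.2.12:80;
        keepalive 32;                        # idle keepalive connections kept open per worker
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;          # keepalive to the upstream requires HTTP/1.1
            proxy_set_header Connection "";  # clear the Connection header so upstream connections stay open
        }
    }

Round robin is the default balancing method, so no extra directive is needed for that part.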