We’re running a Flask app exposing data stored in a database. It returns a lot of 503 errors. My understanding is that those are generated by Apache when the maximum number of concurrent threads is reached.
The root cause is most probably the app performing poorly, but at this stage we can’t afford much more development time, so I’m looking for a cheap deployment config hack to mitigate the issue.
- Data providers send data at a high rate. I believe their programs get a lot of 503 responses and simply catch them and retry until success.
- Data consumers use the app at a much lower rate, and I’d like them not to be bothered so much by those issues.
I’m thinking of limiting the number of concurrent accesses from the IP of each provider. They may get a lower throughput but they’d live with it as they already do, and it would make life easier for casual consumers.
I identified mod_limitipconn, which seems tailored for this:
mod_limitipconn […] allows administrators to limit the number of simultaneous requests permitted from a single IP address.
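As a starting point, the module’s configuration might look like the sketch below (untested; the directive names follow the mod_limitipconn docs, but the location path and the limit of 3 are made up for illustration — note the module requires ExtendedStatus On to track connections):

```apache
# Needed by mod_limitipconn to see per-connection state
ExtendedStatus On

<IfModule mod_limitipconn.c>
    # Hypothetical app location; adjust to where the WSGI app is mounted
    <Location /app>
        # At most 3 simultaneous requests from any single client IP;
        # requests over the limit are rejected with a 503
        MaxConnPerIP 3
    </Location>
</IfModule>
```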
I’d like to be sure I understand how it works and how the limits are set.
I always figured there was a maximum of 5 simultaneous connections due to the WSGI setting threads=5. But I read Processes and Threading in the mod_wsgi docs and I’m confused.
Considering the configuration below, are these assumptions correct?
- Only one instance of the application is running at a time.
- A maximum of 5 concurrent threads can be spawned.
- When 5 requests are being handled and a sixth arrives, that client gets a 503.
- Limiting the number of simultaneous requests from IP x.x.x.x at the Apache level to 3 would ensure that only 3 of those 5 threads can be used by that IP, leaving 2 for other IPs.
- Raising the number of threads in the WSGI config could help share the connection pool amongst clients by allowing more granular limits (e.g. limit each of 4 providers to 3 and keep 5 more, for a total of 17), but it would not improve overall performance, even if the server has idle cores, because the Python GIL prevents several threads from running at the same time.
- Raising the number of threads to a high number like 100 may make requests slower but would reduce 503 responses. It might even be enough if the clients keep their own concurrency reasonably low; if they don’t, I can enforce that with something like mod_limitipconn.
- Raising the number of threads too much would make requests so slow that clients would get timeouts instead of 503s, which is not really better.
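For concreteness, the kind of change discussed above would be a one-line edit to the daemon definition (the numbers here are illustrative, not a recommendation):

```apache
# Illustrative values only: one daemon process, more threads.
# More threads mean fewer 503s under load, but longer per-request
# latency once the app saturates a core (the GIL serializes
# CPU-bound Python work within a process).
WSGIDaemonProcess my_app processes=1 threads=15
```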
Current config below. Not sure what matters.
apachectl -V:
Server version: Apache/2.4.25 (Debian)
Server built: 2018-06-02T08:01:13
Server's Module Magic Number: 20120211:68
Server loaded: APR 1.5.2, APR-UTIL 1.5.4
Compiled using: APR 1.5.2, APR-UTIL 1.5.4
Architecture: 64-bit
Server MPM: event
threaded: yes (fixed thread count)
forked: yes (variable process count)
/etc/apache2/apache2.conf:
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On
#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100
/etc/apache2/mods-available/mpm_worker.conf (but that shouldn’t matter with the event MPM, right?):
<IfModule mpm_worker_module>
StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxRequestWorkers 150
MaxConnectionsPerChild 0
</IfModule>
/etc/apache2/sites-available/my_app.conf:
WSGIDaemonProcess my_app threads=5
2 Answers
I ended up following a different approach: I added a rate limiter in the application code. Then all I need to do is apply its decorator to my views. In practice, I only protected the ones that generate high traffic.
Doing this in application code is nice because I can set a limit per authenticated user and not per IP.
To do so, all I need to do is replace the default get_remote_address key_func with a function that returns the user’s unique identifier.
Note that this sets a separate limit for each view function. If the limit needs to be global, it can be implemented differently; in fact, it would be even simpler.
I’d like them not to be bothered so much by those issues.

One option is to separate data providers’ requests from data consumers’ (I’m not familiar with Apache, so I’m not showing a production-ready config, but an overall approach).
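One way to sketch that separation with mod_wsgi (hypothetical: the /ingest path, group names, and thread counts are made up) is to route providers and consumers to separate daemon process groups, so providers exhausting their pool cannot starve consumers:

```apache
# Two daemon groups (names and thread counts are made up)
WSGIDaemonProcess my_app_providers threads=5
WSGIDaemonProcess my_app_consumers threads=5

# Hypothetical split: providers push to /ingest, consumers use the rest
WSGIScriptAlias /ingest /var/www/my_app/app.wsgi process-group=my_app_providers
WSGIScriptAlias /       /var/www/my_app/app.wsgi process-group=my_app_consumers
```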
Limiting the number of requests per IP can harm the user experience and still not solve your problem: many independent users may share the same IP because of how NAT and ISPs work.
P.S. It’s quite strange that ThreadsPerChild=25 but WSGIDaemonProcess my_app threads=5. Are you sure that with this config all the threads Apache creates would be utilized by the WSGI server?