We’re running a Flask app exposing data stored in a database. It returns a lot of 503 errors. My understanding is that those are generated by Apache when the maximum number of concurrent threads is reached.
The root cause is most probably the app performing poorly, but at this stage we can’t afford much more development time, so I’m looking for a cheap deployment config hack to mitigate the issue.
- Data providers send data at a high rate. I believe their programs get a lot of 503 responses and simply catch them and retry until success.
- Data consumers use the app at a much lower rate, and I’d like them not to be bothered so much by those issues.
I’m thinking of limiting the number of concurrent accesses from the IP of each provider. They may get a lower throughput but they’d live with it as they already do, and it would make life easier for casual consumers.
I identified mod_limitipconn, which seems tailored for this:
mod_limitipconn […] allows administrators to limit the number of simultaneous requests permitted from a single IP address.
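As a starting point, the module’s configuration might look like the sketch below (untested; the directive names follow the mod_limitipconn docs, but the location path and the limit of 3 are made up for illustration — note the module requires ExtendedStatus On to track connections):

```apache
# Needed by mod_limitipconn to see per-connection state
ExtendedStatus On

<IfModule mod_limitipconn.c>
    # Hypothetical app location; adjust to where the WSGI app is mounted
    <Location /app>
        # At most 3 simultaneous requests from any single client IP;
        # requests over the limit are rejected with a 503
        MaxConnPerIP 3
    </Location>
</IfModule>
```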
I’d like to be sure I understand how it works and how the limits are set.
I always figured there was a maximum of 5 simultaneous connections due to the WSGI setting threads=5. But I read Processes and Threading in the mod_wsgi docs and I’m confused.
Considering the configuration below, are these assumptions correct?
- Only one instance of the application is running at a time.
- A maximum of 5 concurrent threads can be spawned.
- When 5 requests are being handled and a sixth arrives, that client gets a 503.
- Limiting the number of simultaneous requests from IP x.x.x.x at the Apache level to 3 would ensure that only 3 of those 5 threads can be used by that IP, leaving 2 for other IPs.
- Raising the number of threads in the WSGI config could help share the connection pool amongst clients by allowing more granular limits (e.g. limit each of 4 providers to 3 and keep 5 more, for a total of 17), but it would not improve overall performance, even if the server has idle cores, because the Python GIL prevents several threads from running at the same time.
- Raising the number of threads to a high number like 100 may make requests slower but would reduce 503 responses. It might even be enough if the clients keep their own concurrency reasonably low; if they don’t, I can enforce that with something like mod_limitipconn.
- Raising the number of threads too much would make requests so slow that clients would get timeouts instead of 503s, which is not really better.
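For concreteness, the kind of change discussed above would be a one-line edit to the daemon definition (the numbers here are illustrative, not a recommendation):

```apache
# Illustrative values only: one daemon process, more threads.
# More threads mean fewer 503s under load, but longer per-request
# latency once the app saturates a core (the GIL serializes
# CPU-bound Python work within a process).
WSGIDaemonProcess my_app processes=1 threads=15
```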
Current config below. Not sure what matters.
apachectl -V:
Server version: Apache/2.4.25 (Debian)
Server built: 2018-06-02T08:01:13
Server's Module Magic Number: 20120211:68
Server loaded: APR 1.5.2, APR-UTIL 1.5.4
Compiled using: APR 1.5.2, APR-UTIL 1.5.4
Architecture: 64-bit
Server MPM: event
threaded: yes (fixed thread count)
forked: yes (variable process count)
/etc/apache2/apache2.conf:
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On
#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100
/etc/apache2/mods-available/mpm_worker.conf (but that shouldn’t matter with the event MPM, right?):
<IfModule mpm_worker_module>
StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxRequestWorkers 150
MaxConnectionsPerChild 0
</IfModule>
/etc/apache2/sites-available/my_app.conf:
WSGIDaemonProcess my_app threads=5
2 Answers
I ended up following a different approach: I added a rate limiter in the application code. Then all I need to do is apply its decorator to my views. In practice, I only protected the ones that generate high traffic.
Doing this in application code is nice because I can set a limit per authenticated user and not per IP.
To do so, all I need to do is replace the default get_remote_address key_func with a function that returns the user’s unique identifier.
Note that this sets a separate limit for each view function. If the limit needs to be global, it can be implemented differently; in fact, it would be even simpler.
I’d like them not to be bothered so much by those issues.

One option is to separate data providers’ requests from data consumers’ (I’m not familiar with Apache, so I’m not showing a production-ready config, but an overall approach).
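One way to sketch that separation with mod_wsgi (hypothetical: the /ingest path, group names, and thread counts are made up) is to route providers and consumers to separate daemon process groups, so providers exhausting their pool cannot starve consumers:

```apache
# Two daemon groups (names and thread counts are made up)
WSGIDaemonProcess my_app_providers threads=5
WSGIDaemonProcess my_app_consumers threads=5

# Hypothetical split: providers push to /ingest, consumers use the rest
WSGIScriptAlias /ingest /var/www/my_app/app.wsgi process-group=my_app_providers
WSGIScriptAlias /       /var/www/my_app/app.wsgi process-group=my_app_consumers
```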
Limiting the number of requests per IP can harm the user experience and still not solve your problem: many independent users may share the same IP because of how NAT and ISPs work.
P.S. It’s quite strange that ThreadsPerChild=25 but WSGIDaemonProcess my_app threads=5. Are you sure that with this config all the threads Apache creates would be utilized by the WSGI server?