
I am currently developing a web app using the aiohttp module. I’m using:

aiohttp.web, asyncio, uvloop, aiohttp_session, aiohttp_security, aiomysql, and aioredis

I have run some benchmarks against it, and while the results are pretty good, I can’t help but want more. I know that Python is, by nature, single-threaded. aiohttp uses async I/O so as not to block, but am I correct in assuming that it does not utilize all CPU cores?

My idea: Run multiple instances of my aiohttp.web code via concurrent.futures in multiprocessing mode. Each process would serve the site on a different port. I would then put a load balancer in front of them. MySQL and Redis can be used to share state where necessary such as for sessions.

Question: Given a server with several CPU cores, will this result in the desired performance increase? If so, is there any specific pattern to pursue in order to avert problems? I can’t think of anything that these aio modules are doing that would require that there only be a single thread though I could be wrong.

Note: This is not a subjective question as I’ve posed it. Either the module is currently bound to one thread/process or it isn’t; either it can benefit from multiprocessing plus a load balancer or it can’t.

2 Answers


  1. You’re right: a single asyncio event loop runs in one thread and therefore uses only one CPU core.
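
    A minimal sketch of what that means in practice, using only the standard library. The handler names and the 0.1 s delay are made up for illustration; `time.sleep` stands in for CPU-bound work that never yields to the loop:

```python
import asyncio
import time


async def blocking_handler(delay):
    # time.sleep() never yields to the event loop, much like CPU-bound work
    time.sleep(delay)


async def cooperative_handler(delay):
    # asyncio.sleep() yields, so other handlers can run in the meantime
    await asyncio.sleep(delay)


async def timed(handler, n=4, delay=0.1):
    # Run n handlers "concurrently" on one loop and measure the wall time
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(n)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_handler))
cooperative_elapsed = asyncio.run(timed(cooperative_handler))
print("blocking:    {:.2f}s".format(blocking_elapsed))
print("cooperative: {:.2f}s".format(cooperative_elapsed))
```

    The blocking handlers run one after another on the single loop, so their delays add up, while the cooperative ones overlap. Extra processes help with the first kind of load.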

    Whether your whole project is network-bound or CPU-bound is something I can’t say. You have to try.

    You could use nginx or haproxy as load balancer.

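    You might even try using no load balancer at all. I have never used this feature for load balancing, only as a proof of concept for a fail-over system.
    With newer kernels (Linux 3.9 and later), multiple processes can listen on the same port when each of them sets the SO_REUSEPORT option, and the kernel distributes incoming connections among them (I would have guessed round robin, but on Linux the choice is reportedly based on a hash of the connection’s addresses and ports).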
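
    A small standard-library sketch of that behaviour, assuming a platform that defines `socket.SO_REUSEPORT` (Linux or a BSD/macOS system, not Windows). `bind_listener` is a hypothetical helper, and the port is whatever free port the OS hands out:

```python
import socket


def bind_listener(port, reuseport, host="127.0.0.1"):
    """Return a listening TCP socket, optionally with SO_REUSEPORT set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuseport:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


# Port 0 lets the OS pick a free port for the first listener
first = bind_listener(0, reuseport=True)
port = first.getsockname()[1]

# A second listener on the same port works because both set SO_REUSEPORT
second = bind_listener(port, reuseport=True)

# A listener that does not set the option is refused
try:
    third = bind_listener(port, reuseport=False)
    third.close()
    third_ok = True
except OSError:
    third_ok = False

print("shared port:", port, "- bind without SO_REUSEPORT allowed:", third_ok)
first.close()
second.close()
```

    In a real deployment the two listeners would live in separate processes; they share one process here only to keep the sketch short.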

    Here is a link to an article comparing the performance of a typical nginx configuration with an nginx setup that uses the SO_REUSEPORT feature:

    https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/

    It seems SO_REUSEPORT might distribute the CPU load rather evenly, but might increase the variation in response times. I’m not sure this is relevant to your setup, but I thought I’d let you know.

    Added 2020-02-04:

    My solution from 2019-12-09 (below) works, but triggers a deprecation warning.

    When I have more time to test it myself, I will post the improved solution here. For the time being, you can find it at AIOHTTP – Application.make_handler(…) is deprecated – Adding Multiprocessing

    Added 2019-12-09:

    Here is a small example of an HTTP server that can be started multiple times, with every instance listening on the same port.
    The kernel distributes the connections among the instances. I never checked whether this is efficient, though.

    reuseport.py:

    import asyncio
    import os
    import socket
    import time
    from aiohttp import web
    
    
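    def mk_socket(host="127.0.0.1", port=8000, reuseport=False):
        """Create a TCP socket bound to host:port, optionally with SO_REUSEPORT."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        if reuseport:
            # Use the platform's constant instead of hard-coding 15 (the Linux value)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        return sock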
    
    async def handle(request):
        name = request.match_info.get('name', "Anonymous")
        pid = os.getpid()
        text = "{:.2f}: Hello {}! Process {} is treating you\n".format(
            time.time(), name, pid)
        time.sleep(0.5)  # intentionally blocking sleep to simulate CPU load
        return web.Response(text=text)
    
    if __name__ == '__main__':
        host = "127.0.0.1"
        port = 8000
        reuseport = True
        app = web.Application()
        sock = mk_socket(host, port, reuseport=reuseport)
        app.add_routes([web.get('/', handle),
                        web.get('/{name}', handle)])
        # Create the loop explicitly: asyncio.get_event_loop() without a
        # running loop is deprecated in recent Python versions
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        coro = loop.create_server(
            protocol_factory=app.make_handler(),  # deprecated, see the note above
            sock=sock,
            )
        srv = loop.run_until_complete(coro)
        loop.run_forever()
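
    The deprecation warning mentioned above comes from app.make_handler(). Below is a minimal sketch of the same idea without it, assuming aiohttp 3.x, where `web.AppRunner` and `web.TCPSite` (with its `reuse_port` flag) are available. `make_app`, `start_site` and `main` are illustrative names that are not part of reuseport.py, and the sketch has not been benchmarked:

```python
import asyncio
import os

from aiohttp import web


async def handle(request):
    name = request.match_info.get("name", "Anonymous")
    text = "Hello {}! Process {} is treating you\n".format(name, os.getpid())
    return web.Response(text=text)


def make_app():
    app = web.Application()
    app.add_routes([web.get("/", handle), web.get("/{name}", handle)])
    return app


async def start_site(app, host="127.0.0.1", port=8000):
    # AppRunner takes the place of the deprecated app.make_handler()
    runner = web.AppRunner(app)
    await runner.setup()
    # reuse_port=True sets SO_REUSEPORT, so several processes can share the port
    site = web.TCPSite(runner, host, port, reuse_port=True)
    await site.start()
    return runner


async def main():
    runner = await start_site(make_app())
    try:
        await asyncio.Event().wait()  # serve until the process is interrupted
    finally:
        await runner.cleanup()
```

    Adding `asyncio.run(main())` at the bottom of the script would start one instance. Starting the script several times should then behave like reuseport.py, with the kernel spreading connections over the processes.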
    
    

    And one way to test it:

    python3 reuseport.py & python3 reuseport.py &
    sleep 2 # sleep a little so servers are up
    for n in 1 2 3 4 5 6 7 8 ; do wget -q http://localhost:8000/$n -O - & done
    

    The output might look like:

    1575887410.91: Hello 1! Process 12635 is treating you
    1575887410.91: Hello 2! Process 12633 is treating you
    1575887411.42: Hello 5! Process 12633 is treating you
    1575887410.92: Hello 7! Process 12634 is treating you
    1575887411.42: Hello 6! Process 12634 is treating you
    1575887411.92: Hello 4! Process 12634 is treating you
    1575887412.42: Hello 3! Process 12634 is treating you
    1575887412.92: Hello 8! Process 12634 is treating you
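
    To see how the requests were spread over the processes, the output can be tallied per PID. This is a small sketch using the sample output above; `requests_per_process` is a hypothetical helper, and it assumes every line has the format produced by reuseport.py:

```python
from collections import Counter

sample_output = """\
1575887410.91: Hello 1! Process 12635 is treating you
1575887410.91: Hello 2! Process 12633 is treating you
1575887411.42: Hello 5! Process 12633 is treating you
1575887410.92: Hello 7! Process 12634 is treating you
1575887411.42: Hello 6! Process 12634 is treating you
1575887411.92: Hello 4! Process 12634 is treating you
1575887412.42: Hello 3! Process 12634 is treating you
1575887412.92: Hello 8! Process 12634 is treating you
"""


def requests_per_process(output):
    """Count how many responses each server process produced."""
    counts = Counter()
    for line in output.splitlines():
        words = line.split()
        if "Process" in words:
            # The PID is the word right after "Process"
            counts[int(words[words.index("Process") + 1])] += 1
    return counts


counts = requests_per_process(sample_output)
for pid, n in counts.most_common():
    print("process {} handled {} request(s)".format(pid, n))
```

    For this particular run the split is uneven, which fits the remark above about response-time variation, though eight requests are far too few to draw conclusions.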
    
    
  2. I think it is better not to reinvent the wheel and to use one of the solutions proposed in the documentation:
    https://docs.aiohttp.org/en/stable/deployment.html#nginx-supervisord
