
I have an async FastAPI service running on Kubernetes (via Kustomize) behind nginx and an NLB. Everything has been working fine, but while profiling locally I noticed that a DB query using SQLAlchemy's AsyncSession and AsyncEngine seems to be a bit of a bottleneck. I can't find much online about this, and I think the way I have it set up is okay.

The query is as below:

    async with AsyncSession(master_db) as session:
        result = (
            await session.execute(
                select(User, UserMapping, UserOrders, UserInvoices)
                .join(UserMapping, User.id == UserMapping.user_id)
                .outerjoin(
                    UserOrders,
                    User.email == UserOrders.email,
                )
                .outerjoin(
                    UserInvoices,
                    and_(
                        UserInvoices.email == User.email,
                        UserInvoices.state == "paid",
                    ),
                )
                .where(UserMapping.uuid == uuid)
                .order_by(UserOrders.created_at.desc())
                .limit(3)
            )
        ).all()
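One thing worth noting about the shape of this query: UserOrders and UserInvoices are each outer-joined on User.email independently, so for any user the result set is the cartesian product of their orders and paid invoices, and every one of those rows has to be fetched and turned into Python objects by SQLAlchemy even when the raw SQL itself is fast. A stdlib-only sketch of that fan-out (the row counts here are made up for illustration):

```python
# Hypothetical counts: one user with 3 matching orders and 4 paid invoices.
orders = ["o1", "o2", "o3"]
invoices = ["i1", "i2", "i3", "i4"]

# Each order row pairs with each invoice row, exactly as two independent
# outer joins on the same key do in SQL.
joined_rows = [(o, i) for o in orders for i in invoices]
print(len(joined_rows))  # 12 rows to fetch and map, not 3 + 4
```

If the per-user row counts are large, the Python-side row processing grows multiplicatively even though the database executes the join quickly.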

and as for the engine, I create it like so:

@functools.cache
def get_master_db(settings: Settings = Depends(get_settings)):
    master_db_engine: AsyncEngine = create_async_engine(
        settings.master_db_url,
        pool_recycle=3600,
        pool_pre_ping=True
    )
    return master_db_engine

which is used as a dependency in my FastAPI route like so:

async def get_user_orders_and_invoices(
    user_uuid: int,
    master_db_engine: AsyncEngine = Depends(get_master_db),
):
    # make the call to the method which contains the above SQLAlchemy query
    return await get_user_order_and_invoices(master_db_engine, user_uuid)
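A side note on this setup: functools.cache keys on its arguments, so the engine is only reused while FastAPI passes the same Settings instance every time (and the instance must be hashable; pydantic models often are not). A stdlib-only sketch of that behaviour, with a plain class standing in for the real Settings and an object() standing in for the engine (both hypothetical):

```python
import functools

class Settings:
    """Stand-in for the real settings object (hypothetical)."""
    def __init__(self, master_db_url):
        self.master_db_url = master_db_url

@functools.cache
def get_master_db(settings):
    # Stand-in for create_async_engine(settings.master_db_url, ...)
    return object()

s1 = Settings("mysql+aiomysql://...")
s2 = Settings("mysql+aiomysql://...")

assert get_master_db(s1) is get_master_db(s1)      # same instance: engine reused
assert get_master_db(s1) is not get_master_db(s2)  # distinct instance: new engine created
```

If get_settings itself isn't cached to return one instance, each distinct Settings object silently creates a fresh engine with a fresh connection pool.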

From profiling with pyinstrument, the .execute() call sometimes seems to take a while.

To tick off some things I know people will mention:

  • The query is exactly what I need
  • Yes there are indexes on the right fields
  • The query in raw SQL, run directly in the MySQL CLI, takes a minimal amount of time
  • The schema of the tables cannot change; they are legacy and changing them is not possible at this time

The bottleneck is 100% at the SQLAlchemy level, or at least in my Python code. Any help would be great, thank you!
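To narrow down where inside .execute() the time goes, it can help to time the phases separately — connection checkout, statement execution, and row fetching — instead of the call as a whole. A minimal stdlib-only timing helper (the bodies of the with-blocks below are placeholders for the real awaits in your code):

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record wall-clock time for a labelled block into `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

with timed("checkout"):
    pass  # e.g. conn = await engine.connect()  (pool checkout + pre_ping)
with timed("execute"):
    pass  # e.g. result = await session.execute(stmt)
with timed("fetch"):
    pass  # e.g. rows = result.all()  (row transfer + object construction)

print(sorted(timings, key=timings.get, reverse=True))  # slowest phase first
```

If "checkout" dominates, the pool (or pre-ping round trips) is the problem; if "fetch" dominates, it is row volume and Python-side mapping rather than the query itself.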

For reference:

  • Python 3.11
  • latest FastAPI
  • aiomysql
  • SQLAlchemy 1.4

2 Answers


  1. Try creating the session like this:

    from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, AsyncEngine
    from sqlalchemy.orm import sessionmaker
    from sqlalchemy import select, and_
    import functools
    
    # Engine creation with caching and potential performance-tuning parameters
    
    @functools.cache
    def get_master_db(db_url):
        master_db_engine: AsyncEngine = create_async_engine(
            db_url,
            echo=False,  # Turn off echo to reduce logging overhead
            pool_recycle=3600,
            pool_pre_ping=True,
            pool_size=10,  # Adjust pool size to your needs
            max_overflow=20  # Adjust max overflow to your needs
        )
        return master_db_engine
    
    # AsyncSession factory creation
    # expire_on_commit=False avoids automatic refresh of instances upon commit
    AsyncSessionFactory = sessionmaker(
        bind=get_master_db(db_url),  # pass your settings.master_db_url here
        class_=AsyncSession,
        expire_on_commit=False,
    )
    
    
  2. Overall, these optimizations can help improve the performance of your SQLAlchemy async query by reducing the amount of data processed and the number of database queries executed.

    from sqlalchemy import select
    from sqlalchemy.orm import selectinload
    
    async with AsyncSession(master_db) as session:
        # Correlated subquery: emails of the user's three most recent orders
        orders_subquery = (
            select(UserOrders.email)
            .where(UserOrders.email == User.email)
            .order_by(UserOrders.created_at.desc())
            .limit(3)
        )
    
        result = (
            await session.execute(
                select(User, UserMapping)
                .join(UserMapping, User.id == UserMapping.user_id)
                # selectinload here assumes User.orders / User.invoices
                # relationships are defined on the model
                .options(selectinload(User.orders), selectinload(User.invoices))
                .where(UserMapping.uuid == uuid)
                .where(User.email.in_(orders_subquery))
            )
        ).all()
    