
I am genuinely baffled by the behavior I’m seeing, so I’m hoping someone can shed some light on what’s going on here. Basically, I have a web app using FastAPI and SQLAlchemy in the back end that involves users posting responses to a survey. In the endpoint accepting the POST request for a response, a Response object is created in the database, then a number of Match objects are created in the database, each with a foreign key to the Response object that was created. While parsing the payload and creating the Match objects, I aggregate a number_correct_matches variable (which is a positive integer), and I then update the Response object with number_correct_matches via a SQLAlchemy Session in a CRUD class that has an update function.

The behavior that is baffling me is that when I execute the code that performs this update, either when running the full project locally on my Raspberry Pi 4 where I’m developing (using a local PostgreSQL db) or via a unit test (using an in-memory SQLite db), the Response object is not updated in the database. However, there are two scenarios where the update works:

  1. If I run the unit test in debug mode in VS Code and step through the for loop in the update function one line at a time, the Response object is updated correctly in the database.

  2. If I put a logger statement in the update function and the logging statement prints db_object (the object being updated), the Response object is updated correctly. If I put a logger statement that does not include db_object, the statement is output to the logs, but the object is not updated.

The expected behavior only happens in the two scenarios above; otherwise, the code is effectively skipped (see the logs below). Additionally, I have other models in the app that utilize the same update function successfully for different attributes (e.g. there is an update endpoint for user objects that uses the same function, and I can successfully update a user’s password through it).

I do not understand how the code seemingly only gets executed when directly stepped through in debug mode or when the db_object argument is printed. I know I could just loop through the Match objects in the payload before creating them in the database to get the number_correct_matches value and create the Response object with that value in the first place, but I really want to understand why it works only when I step through one line at a time or with one specific logger statement, and not when running the code normally without superfluous logs.

api/response.py

response = crud.response.update(
    session=session,
    db_object=response, # <- this is the already created Response object in the database
    object_in=schemas.ResponseUpdate(
        number_correct_matches=number_correct_matches
    ),
)

schemas/response.py

# properties to receive via API update
class ResponseUpdate(ResponseBase):
    number_correct_matches: int = 0

crud_base.py (extended by crud_response)

def update(
        self,
        session: Session,
        *,
        db_object: ModelType,
        object_in: Union[UpdateSchemaType, Dict[str, Any]]
    ) -> ModelType:
        """
        Updates an object in the database

        Args:
            session (Session): a SQLAlchemy Session object that is connected to the database.
            db_object (ModelType): An object from the database.
            object_in (Union[UpdateSchemaType, Dict[str, Any]]): A pydantic model object used to update the db model object.

        Returns:
            ModelType: The updated db model object.
        """
        # logger.info(f"updating {db_object}") <- with just this logging statement, it works correctly
        logger.info("updating 1")
        # update the db object
        update_data = ( # <- If breakpoint is set here and then I jump to below the for loop without stepping through, it does not work
            object_in
            if isinstance(object_in, dict)
            else object_in.model_dump(exclude_unset=True)
        )
        for field in jsonable_encoder(db_object): # <- It works if breakpoint is set here and then I walk through one line at a time to the end of the for loop
            logger.info("updating 2")
            if field in update_data.keys():
                logger.info("updating 3")
                setattr(db_object, field, update_data[field])
        logger.info("updating 4")
        # update the db object through the session
        session.add(db_object) # <- If breakpoint is set here, it does not work
        session.commit()
        session.refresh(db_object)
        return db_object

The above code when run as is (with the first logger statement commented out) results in the following logs:

2024-07-12 20:53:52.228 | INFO     | app.crud.crud_base:update:82 - updating 1
2024-07-12 20:53:52.229 | INFO     | app.crud.crud_base:update:94 - updating 4

I’ll also add that even though I posted the code from the response endpoint, I also call crud.response.update in conftest.py and I get the exact same behavior.

2 Answers


  1. Chosen as BEST ANSWER

    So this is a partial answer in that it fixes the issue, but I still don't fully understand all the behavior I described in the original question. Adding a session.refresh(response) right before calling the update on the response results in the correct behavior. I assume the issue is that I have a many-to-one relationship between the Match table and the Response table and when I create the Match objects before updating response, the Match objects have an attribute that is a foreign key to the Response object's id. My best guess as to what's happening is that when the Match objects are created, the database table for Response is updated with references to the newly created Match objects, but the response object in memory is then mismatched with what's in the database. I'm not sure why logger statements or walking through with the debugger would cause the object to get updated, but once the object is refreshed by the session, it's able to be updated in the code.


  2. Not a full answer, just a lead to debug further (that wouldn’t fit in a comment)

    What I believe is happening is some weird caching edge case:

    • if you call the object’s __repr__, it has side effects (some values get cached in db_object), which changes the output of jsonable_encoder
    • stepping with the debugger somehow has side effects? The debugger might access some of the fields of your object internally, though I find that a little hard to believe.

    Two leads for debugging:

    1. What happens if you add a breakpoint inside of jsonable_encoder? Does it work? If not, you could investigate why exactly it returns an empty sequence.

    2. What happens if you replace

            for field in jsonable_encoder(db_object): # <- It works if breakpoint is set here and then I walk through one line at a time to the end of the for loop
                logger.info("updating 2")
                if field in update_data.keys():
                    logger.info("updating 3")
                    setattr(db_object, field, update_data[field])
    

    with

            fields = jsonable_encoder(db_object)
            if not fields:
                logger.error(f"no fields found: {fields=}, {db_object=}")
            for field in fields: # <- It works if breakpoint is set here and then I walk through one line at a time to the end of the for loop
                logger.info("updating 2")
                if field in update_data.keys():
                    logger.info("updating 3")
                    setattr(db_object, field, update_data[field])
    