skip to Main Content

(Note – I’m aware of this apparently-duplicate question, but the answer provided there does not work)

I’m trying to convert a SqlAlchemy Join Query to JSON, such that the columns from both joined tables are reflected in the output. All the solutions I’ve been able to find are only able to reflect the columns from a single table, or to only reflect a single relationship rather than the full one-to-many relationship.

Minimal reproduction:

models.py

from dataclasses import dataclass
from typing import List

from sqlalchemy import create_engine, Column, ForeignKey, Integer, String

from sqlalchemy.orm import declarative_base, relationship, sessionmaker, Mapped

Base = declarative_base()

class BaseSerializable(Base):
    __abstract__ = True
    # https://stackoverflow.com/a/11884806/1040915
    def as_dict(self):
       return {c.name: getattr(self, c.name) for c in self.__table__.columns}

@dataclass
class Player(BaseSerializable):
    __tablename__ = "customers"

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    scores: Mapped[List["Score"]] = relationship()

@dataclass
class Score(BaseSerializable):
    __tablename__ = "scores"

    id = Column(Integer, primary_key=True)
    player_id: Mapped[int] = Column(Integer, ForeignKey("customers.id"))
    value = Column(Integer)

engine = create_engine('sqlite:///database.db')
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base.metadata.create_all(bind=engine)
db = SessionLocal()

main.py

from json import dumps, JSONEncoder
from fastapi.encoders import jsonable_encoder
from models import db, Player, Score
from sqlalchemy import Row
from sqlalchemy.orm import DeclarativeMeta

db.query(Player).delete()
db.query(Score).delete()
db.add(
    Player(
        name="Test Player 1"
    )
)

db.add(
    Score(
        player_id=1,
        value=9001
    )
)
db.add(
    Score(
        player_id=1,
        value=9002
    )
)
db.commit()


player: Player = (
    db.query(Player)
        .filter(Player.id == 1)
        .join(Score)
        .first()
)

# https://fastapi.tiangolo.com/tutorial/encoder/
def fastapi_json_encode(o):
    return dumps(jsonable_encoder(o))

# https://stackoverflow.com/questions/35952694/how-to-serialize-sqlalchemy-join-query-to-json
class SqlAlchemyEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, Row):
            data = {}
            for obj in o:
                data.update(self.parse_sqlalchemy_object(obj))
            return data
        if isinstance(o.__class__, DeclarativeMeta):
            return self.parse_sqlalchemy_object(o)
        return JSONEncoder.default(self, o)


    def parse_sqlalchemy_object(self, o):
        data = {}
        fields = o.__json__() if hasattr(o, '__json__') else dir(o)
        for field in [f for f in fields if not f.startswith('_') and f not in ['metadata', 'query', 'query_class', 'registry']]:
            value = o.__getattribute__(field)
            try:
                dumps(value)
                data[field] = value
            except TypeError:
                data[field] = None
        return data

print(f'{SqlAlchemyEncoder().default(player)=}')
print(f'{fastapi_json_encode(player)=}')
print(f'{player.as_dict()=}')

I would have expected/hoped all three print statements to print something like {"id": 1, "name": "Test Player 1", "scores": [{"id": 1, "value": 9001}, {"id": 2, "value": 9002}]}, but in fact none of them do:

$ python3 main.py
SqlAlchemyEncoder().default(player)={'as_dict': None, 'id': 1, 'name': 'Test Player 1', 'scores': None}
fastapi_json_encode(player)='{"scores": [{"player_id": 1}, {"player_id": 1}]}'
player.as_dict()={'id': 1, 'name': 'Test Player 1'}

Specify multiple classes in query

I see from this answer that "sqlalchemy put in the select part on the query whatever you pass to the query method, so if you want to get 2 classes you [must reference both classes in the query method]", which gets closer:

main.py

[...]
player: Player = (
    db.query(Player, Score) # <------- change here
        .filter(Player.id == 1)
        .join(Score)
        .first()
)

[...]

print(f'{SqlAlchemyEncoder().default(player)=}')
# The following two print statements error-out with the `.query(Player, Score)` change
# print(f'{fastapi_json_encode(player)=}')
# print(f'{player.as_dict()=}')

results in

$ python3 main.py
SqlAlchemyEncoder().default(player)={'as_dict': None, 'id': 1, 'name': 'Test Player 1', 'scores': None, 'player_id': 1, 'value': 9001}

This looks like it’s returning the serialization of the first row of the underlying join (which presumably has two rows, one for each Score), rather than the serialized first Player with its multiple Scores. I can see how this makes sense from a technical sense, but it’s not what I’m trying to achieve.

2

Answers


  1. Chosen as BEST ANSWER

    I was able to simulate the outcome that I wanted by doing the "join" logic in Python, rather than on the database:

    player: Player = (
        db.query(Player)
            .filter(Player.id == 1)
            .first()
    )
    
    ...
    
    base_data = player.as_dict()
    base_data["scores"] = [score.as_dict() for score in player.scores]
    print(f'{base_data=}')base_data = player.as_dict()
    base_data["scores"] = [score.as_dict() for score in player.scores]
    print(f'{base_data=}')
    
    # ...gives
    
    base_data={'id': 1, 'name': 'Test Player 1', 'scores': [{'id': 1, 'player_id': 1, 'value': 9001}, {'id': 2, 'player_id': 1, 'value': 9002}]}
    

    But it feels like I should be able to:

    • Join using a database (which is specialized for such things!)
    • Collate the resulting rows (using SQLAlchemy) into a single object

    rather than having to do the join in Python itself


  2. Just for example let’s assume you have table Price (id, product_id – fk) and Product (id, type). So, you can use following:

    result = (
        await session.execute(
            select(
                Price.c.id.label("price_id"),
                Price.c.product_id,
                Product.c.type,
                # Any other columns you need from joined tables
            )
            .outerjoin(models.Product, onclause=Price.c.product_id == Product.c.id)
        )
    ).one_or_none()
    

    Then just create Pydantic model with corresponding fields:

    class PriceProductDb(pydantic.BaseModel):
        model_config = ConfigDict(from_attributes=True)
    
        price_id: uuid.UUID
        product_id: uuid.UUID
        type: ProductType
    

    And convert your result to it:

    PriceProductDb.model_validate(result)
    

    This is clean and convenient approach I’d recommend you to use. Hope it’ll help!

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search