(Note – I’m aware of this apparently-duplicate question, but the answer provided there does not work)
I’m trying to convert a SqlAlchemy Join Query to JSON, such that the columns from both joined tables are reflected in the output. All the solutions I’ve been able to find are only able to reflect the columns from a single table, or to only reflect a single relationship rather than the full one-to-many relationship.
Minimal reproduction:
models.py
from dataclasses import dataclass
from typing import List
from sqlalchemy import create_engine, Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship, sessionmaker, Mapped
Base = declarative_base()
class BaseSerializable(Base):
__abstract__ = True
# https://stackoverflow.com/a/11884806/1040915
def as_dict(self):
return {c.name: getattr(self, c.name) for c in self.__table__.columns}
@dataclass
class Player(BaseSerializable):
__tablename__ = "customers"
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
scores: Mapped[List["Score"]] = relationship()
@dataclass
class Score(BaseSerializable):
__tablename__ = "scores"
id = Column(Integer, primary_key=True)
player_id: Mapped[int] = Column(Integer, ForeignKey("customers.id"))
value = Column(Integer)
engine = create_engine('sqlite:///database.db')
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base.metadata.create_all(bind=engine)
db = SessionLocal()
main.py
from json import dumps, JSONEncoder
from fastapi.encoders import jsonable_encoder
from models import db, Player, Score
from sqlalchemy import Row
from sqlalchemy.orm import DeclarativeMeta
db.query(Player).delete()
db.query(Score).delete()
db.add(
Player(
name="Test Player 1"
)
)
db.add(
Score(
player_id=1,
value=9001
)
)
db.add(
Score(
player_id=1,
value=9002
)
)
db.commit()
player: Player = (
db.query(Player)
.filter(Player.id == 1)
.join(Score)
.first()
)
# https://fastapi.tiangolo.com/tutorial/encoder/
def fastapi_json_encode(o):
return dumps(jsonable_encoder(o))
# https://stackoverflow.com/questions/35952694/how-to-serialize-sqlalchemy-join-query-to-json
class SqlAlchemyEncoder(JSONEncoder):
def default(self, o):
if isinstance(o, Row):
data = {}
for obj in o:
data.update(self.parse_sqlalchemy_object(obj))
return data
if isinstance(o.__class__, DeclarativeMeta):
return self.parse_sqlalchemy_object(o)
return JSONEncoder.default(self, o)
def parse_sqlalchemy_object(self, o):
data = {}
fields = o.__json__() if hasattr(o, '__json__') else dir(o)
for field in [f for f in fields if not f.startswith('_') and f not in ['metadata', 'query', 'query_class', 'registry']]:
value = o.__getattribute__(field)
try:
dumps(value)
data[field] = value
except TypeError:
data[field] = None
return data
print(f'{SqlAlchemyEncoder().default(player)=}')
print(f'{fastapi_json_encode(player)=}')
print(f'{player.as_dict()=}')
I would have expected/hoped all three print
statements to print something like {"id": 1, "name": "Test Player 1", "scores": [{"id": 1, "value": 9001}, {"id": 2, "value": 9002}]}
, but in fact none of them do:
$ python3 main.py
SqlAlchemyEncoder().default(player)={'as_dict': None, 'id': 1, 'name': 'Test Player 1', 'scores': None}
fastapi_json_encode(player)='{"scores": [{"player_id": 1}, {"player_id": 1}]}'
player.as_dict()={'id': 1, 'name': 'Test Player 1'}
Specify multiple classes in query
I see from this answer that "sqlalchemy put in the select
part on the query whatever you pass to the query
method, so if you want to get 2 classes you [must reference both classes in the query
method]", which gets closer:
main.py
[...]
player: Player = (
db.query(Player, Score) # <------- change here
.filter(Player.id == 1)
.join(Score)
.first()
)
[...]
print(f'{SqlAlchemyEncoder().default(player)=}')
# The following two print statements error-out with the `.query(Player, Score)` change
# print(f'{fastapi_json_encode(player)=}')
# print(f'{player.as_dict()=}')
results in
$ python3 main.py
SqlAlchemyEncoder().default(player)={'as_dict': None, 'id': 1, 'name': 'Test Player 1', 'scores': None, 'player_id': 1, 'value': 9001}
This looks like it’s returning the serialization of the first row of the underlying join (which presumably has two rows, one for each Score
), rather than the serialized first Player
with its multiple Score
s. I can see how this makes sense from a technical sense, but it’s not what I’m trying to achieve.
2
Answers
I was able to simulate the outcome that I wanted by doing the "join" logic in Python, rather than on the database:
But it feels like I should be able to:
rather than having to do the join in Python itself
Just for example let’s assume you have table Price (id, product_id – fk) and Product (id, type). So, you can use following:
Then just create Pydantic model with corresponding fields:
And convert your result to it:
This is clean and convenient approach I’d recommend you to use. Hope it’ll help!