skip to Main Content

Friends, hello everyone. We faced a problem at work that we couldn’t solve. It became clear that simple
there is no solution, so we decided to ask the advice of more experienced people on the forum.

  1. We have a PostgreSQL database and we work with it using the SqlAlchemy python framework using its ORM

  2. We have an entity that can be represented as:

    {   
        uuid: UUID
        data: {
            property_1: {
                value: str
            }
            property_2: {
                value: str
            }
            property_3: {
                value: str
            }
        }
    }

or as multiple nested entities

    Main
    {   
        uuid: UUID
        data: Data
    }


    Data
    {
        property_1: Property
        property_2: Property
        property_3: Property
    }


    Property
    {
        value: str
    }
  1. We have three tables that allow us to store such an entity
    main_table
    {
        id: int
        uuid: UUID  
    }


    data_table
    {
        id: int
        main_id: int = ForeignKey (main_table.id)
        property_1_id: ForeignKey (property_table.id)
        property_2_id: ForeignKey (property_table.id)
        property_3_id: ForeignKey (property_table.id)
    }


    property_table
    {
        id: int
        value: str
    }

It can be seen that the second table contains several columns of the same type (Property)

It turns out such a relationship by foreign keys between tables:

    main_table <--- data_table ---> property_table

If you poison an entity into the database, then it will correctly decompose into tables, while reading everything is also correctly assembled from tables into one object.

BUT there is a problem with the removal!
For ForeignKey in data_table ON DELETE CASCADE is set, so when we delete a record in main_table, the record in data_table is automatically deleted, and the records in property_table remain garbage, because they do not refer to the record in data_table, but vice versa, so there is a cascade deletion does not work.

Question. On the face of it, this is a fairly typical task, but we have not yet encountered this. I ask to prompt an approach or approaches in the decision.
We need to be able to delete the entire entity, that is, records in all tables associated with the entity. In other words, when deleting an entry in main_table, all entries associated with it from the data and property tables are deleted

Thank you in advance!

I tried to use ON DELETE CASCADE but figured out that it can’t work with such relation direction. It works only for deleting the entry that has a foreign key to another entry when deleting this another one

2

Answers


  1. To me it looks like a conceptual / design issue.

    Why are you storing values in separate table if they are not unique?

    If the property_table.value attribute does not make sense without corresponding record in data_table, it should not be stored in a separate table.

    Each object in your model should have a natural (aka "business") key, not only an artificial key.

    Maybe the model is "over-normalized"? Why not keep values in data_table?

    Login or Signup to reply.
  2. From what I understand, this is not possible with just SQLAlchemy. You can however use a trigger to achieve this. The basic idea is to create a trigger that deletes rows from tables you want. You can have SQLAlchemy create the trigger as soon as the tables are created. This is a complete example using SQLAlchemy 2.0.

    The end result of the code is that only one row that is not related to the other rows will remain, proving that the delete trigger works as intended.

    You may have to change the trigger as you see fit.

    Note that in this code I drop the tables and recreate them so that SQLAlchemy can publish the event that this code is listening. In this case "after create". So think twice before you run this on production database for testing purpose. You do not have to drop the tables if you are creating for the first time.

    from uuid import UUID, uuid4
    from sqlalchemy import create_engine, delete, text, ForeignKey
    from sqlalchemy.event import listens_for
    from sqlalchemy.orm import Mapped, DeclarativeBase, mapped_column, Session
    
    class Base(DeclarativeBase):
        pass
    
    class Main(Base):
        __tablename__ = "main"
        id: Mapped[int] = mapped_column(primary_key=True)
        uuid: Mapped[UUID]
    
    class Property(Base):
        __tablename__ = "property"
        id: Mapped[int] = mapped_column(primary_key=True)
        value: Mapped[str]
    
    class Data(Base):
        __tablename__ = "data"
        id: Mapped[int] = mapped_column(primary_key=True)
        main_id: Mapped[int] = mapped_column(ForeignKey(Main.id, ondelete="cascade"))
        property_1_id: Mapped[int] = mapped_column(
            ForeignKey(Property.id, ondelete="cascade")
        )
        property_2_id: Mapped[int] = mapped_column(
            ForeignKey(Property.id, ondelete="cascade")
        )
        property_3_id: Mapped[int] = mapped_column(
            ForeignKey(Property.id, ondelete="cascade")
        )
    
    @listens_for(Data.__table__, "after_create")
    def create_reverse_cascade_trigger(target, connection, **kw) -> None:
        query = """
        CREATE 
        OR REPLACE FUNCTION delete_rows() RETURNS TRIGGER AS $$ BEGIN 
        DELETE FROM 
          property 
        WHERE 
          property.id in (
            OLD.property_1_id, OLD.property_2_id, 
            OLD.property_3_id
          );
        RETURN OLD;
        END;
        $$ LANGUAGE plpgsql;
        """
    
        connection.execute(text(query))
    
        query = """
        CREATE TRIGGER reverse_cascade_trigger 
        AFTER 
          DELETE ON data FOR EACH ROW EXECUTE FUNCTION delete_rows();
        """
    
        connection.execute(text(query))
    
    connection_string = "your connection string"
    
    engine = create_engine(connection_string)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)
    
    with Session(engine) as session:
        # add some test data
        session.add(Main(id=1, uuid=uuid4()))
        session.add(Property(id=1, value="this will be deleted"))
        session.add(Property(id=2, value="this will be deleted"))
        session.add(Property(id=3, value="this will be deleted"))
        session.add(Property(id=4, value="this will not be deleted"))
        session.flush()
        session.add(Data(id=1, main_id=1, property_1_id=1, property_2_id=2, property_3_id=3))
        session.commit()
    
    with Session(engine) as session:
        # delete the added row from main, all tables must now be empty except one row in property as that is not related to the others
        session.execute(delete(Main).where(Main.id == 1))
        session.commit()
    

    If you already have the tables, I would suggest just running the two queries on the database directly and your code will continue to work as intended. The event listener part is just so that on a new database you can have SQLAlchemy run those queries for you when it creates the tables.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search