skip to Main Content

I have this table that has a ‘unique together’ index:

CREATE TABLE IF NOT EXISTS entity_properties
(
    entity_id INTEGER NOT NULL REFERENCES entities,
    property_id UUID NOT NULL REFERENCES properties,
    value VARCHAR,
    value_label VARCHAR,
    UNIQUE (entity_id, property_id)
);

I want to create an index on ‘value’ column to minimize search time:

CREATE INDEX index_property_value ON entity_properties (value)

I get this error:

index row requires 8296 bytes, maximum size is 8191

As the error clearly says creating this index would exceed the maximum limit size.

You can see this answer.

But I really need ‘value’ column to be indexed for efficiency reasons. In my database this table holds the largest part of data (millions of rows). Also it gets updated very frequently. As far as I know updating indexed columns has effects on performance. That is why I am concerned about performance

How can I achieve this?

PS: my other thought is that I can add the ‘value’ column to ‘unique together’ index.

CREATE TABLE IF NOT EXISTS entity_properties
(
    entity_id INTEGER NOT NULL REFERENCES entities,
    property_id UUID NOT NULL REFERENCES properties,
    value VARCHAR,
    value_label VARCHAR,
    UNIQUE (entity_id, property_id, value)
);

Can this be a solution? If so is it best approach? If not, what is the best approach

2

Answers


  1. It is an unusual requirement to search for long texts. To avoid the error and get efficient index access, use a hash of the column:

    CREATE INDEX ON entity_properties (hashtext(value));
    

    This can be used with a query like

    SELECT ...
    FROM entity_properties
    WHERE value = 'long string'
     AND hashtext(value) = hashtext('long string');
    

    The first condition is necessary to deal with hash collisions.

    Login or Signup to reply.
  2. PostgreSQL has a built-in hash index type which doesn’t suffer from this limitation, so you can just create one of those:

    CREATE INDEX index_property_value ON entity_properties using hash (value)
    

    This has the advantage (over using a functional index as Laurenz suggests) in that you don’t need to write your query in an unnatural way.

    But, is it sensible that the "value" column can contain values this large? Maybe the best solution would be to investigate the large data and clean it up if it is not sensible.

    Trying to add this as another column into an existing unique index would just make things worse. It would still need 8296 bytes, plus more for the other columns

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search