I have this table that has a ‘unique together’ index:
CREATE TABLE IF NOT EXISTS entity_properties
(
    entity_id   INTEGER NOT NULL REFERENCES entities,
    property_id UUID NOT NULL REFERENCES properties,
    value       VARCHAR,
    value_label VARCHAR,
    UNIQUE (entity_id, property_id)
);
I want to create an index on ‘value’ column to minimize search time:
CREATE INDEX index_property_value ON entity_properties (value);
I get this error:
index row requires 8296 bytes, maximum size is 8191
As the error clearly says, creating this index would exceed the maximum size limit.
But I really need the ‘value’ column to be indexed for efficiency reasons. In my database this table holds the largest share of the data (millions of rows), and it is updated very frequently. As far as I know, updating indexed columns affects performance, which is another reason I am concerned.
How can I achieve this?
PS: my other thought is that I could add the ‘value’ column to the ‘unique together’ index:
CREATE TABLE IF NOT EXISTS entity_properties
(
    entity_id   INTEGER NOT NULL REFERENCES entities,
    property_id UUID NOT NULL REFERENCES properties,
    value       VARCHAR,
    value_label VARCHAR,
    UNIQUE (entity_id, property_id, value)
);
Can this be a solution? If so, is it the best approach? If not, what is?
2 Answers
It is an unusual requirement to search for long texts. To avoid the error and get efficient index access, use a hash of the column:
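A minimal sketch of such a functional index, assuming PostgreSQL's built-in md5() function (the index name is illustrative; any stable hash expression would do):

CREATE INDEX index_property_value_md5 ON entity_properties (md5(value));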
This can be used with a query like
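For example (the selected columns and the search string are placeholders):

SELECT entity_id, property_id, value_label
FROM entity_properties
WHERE value = 'long string'
  AND md5(value) = md5('long string');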
The first condition (value = ...) is necessary to deal with hash collisions: two different strings can produce the same hash, so the index match alone is not conclusive.
PostgreSQL has a built-in hash index type which doesn’t suffer from this limitation, so you can just create one of those:
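A sketch (the index name is illustrative):

CREATE INDEX index_property_value_hash ON entity_properties USING hash (value);

A hash index stores only a fixed-size hash of each value rather than the value itself, so it is not subject to the per-row size limit; note that it supports only equality comparisons.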
This has the advantage (over using a functional index, as Laurenz suggests) that you don't need to write your query in an unnatural way.
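That is, a plain equality search can use the index directly (the search string is a placeholder):

SELECT entity_id, property_id, value_label
FROM entity_properties
WHERE value = 'long string';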
But is it sensible for the "value" column to contain values this large? Maybe the best solution would be to investigate the large data and clean it up if it is not sensible.
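For example, something like this would surface the oversized rows (the length threshold is arbitrary):

SELECT entity_id, property_id, length(value) AS value_length
FROM entity_properties
WHERE length(value) > 2000
ORDER BY length(value) DESC;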
Trying to add ‘value’ as another column to the existing unique index would just make things worse: the index row would still need 8296 bytes, plus more for the other columns.