I have a table like this:
CREATE TABLE "users"
(
id serial PRIMARY KEY,
town text NOT NULL,
street text NOT NULL,
building text
);
I want to be able to store unique entries by the three columns "town", "street" and "building". If the third column "building" is NULL, it should be impossible to store any other row with the same "town" and "street" columns, no matter what is in the column "building".
So this should work:
---- example 1:
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 8);
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 9);
---- example 2:
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', null);
But this should not:
---- example 1:
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 8);
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 9);
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', null);
---- example 2:
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 8);
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 8);
I tried to use two partial indexes for this:
CREATE UNIQUE INDEX "two_cols"
ON "users" ("town", "street")
WHERE "building" IS NULL;
CREATE UNIQUE INDEX "three_cols"
ON "users" ("town", "street", "building")
WHERE "building" IS NOT NULL;
but the problem is that a filter only applies to its own index, so the first index checks only the rows matching its own predicate. This still allows storing a row with a NULL "building" next to rows with the same first two columns and non-null values, which is not what I need. Removing the filter from the first index disallows storing two rows with the same first two columns but different third columns. Is there a way to solve this problem?
2 Answers
Your first INSERT with the same data twice fails, and because both statements are treated as one transaction, a rollback happens, which erases both inserts. You can see this because the second insert works perfectly afterwards: building 8 is not present. When you make every insert its own transaction, it will enter the row, and the second one is no problem, as it is in its own transaction; when it rolls back, the previously inserted row is already saved.
So, in short: use transactions.
fiddle
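A minimal sketch of that idea, with each INSERT as its own transaction (run against the partial unique indexes from the question):

BEGIN;
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 8);
COMMIT;
-- committed: this row stays, whatever happens next

BEGIN;
INSERT INTO "users" ("town", "street", "building") VALUES ('t', 's', 8); -- fails: duplicate
ROLLBACK;
-- only this failed statement's work is discarded; the first row is already saved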
You want NULL to conflict with all values (incl. NULL). But distinct not-null values shall not conflict with each other. Try as you might, you won't cover that with a UNIQUE constraint (or index).
This does exactly what you ask for:
fiddle
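The fiddle itself is not reproduced here, but with the function and constraints sketched in the sections below in place, the question's examples should behave as requested:

INSERT INTO users (town, street, building) VALUES ('t', 's', 8);    -- OK
INSERT INTO users (town, street, building) VALUES ('t', 's', 9);    -- OK: distinct not-null values
INSERT INTO users (town, street, building) VALUES ('t', 's', NULL); -- rejected: overlaps both rows above
-- Starting from an empty table, inserting the NULL row first would
-- likewise block any later row with the same town and street.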
Constraints and indexes
A UNIQUE constraint is based on a B-tree index using equality checks at its core. An exclusion constraint is based on other index types (as of Postgres 16, GiST or SP-GiST) and can use additional operators – in particular the "overlaps" operator &&. But that's not meant for text values. To get there, hash the text to int8 and build an int8range from it, which allows &&.
I use the hash function hashtextextended(). You could use hashtext() to build an int4range instead. Smaller and a bit faster, but with increased chances of a hash collision. Both are built-in Postgres functions – undocumented, but reliable. See:
The exclusion constraint enforces your special flavor of "uniqueness" completely; the added unique constraint is logically redundant. I kept it anyway. Its underlying B-tree index typically helps the performance of many operations on the table. I implemented it with NULLS NOT DISTINCT since that is closer to your case. See:
Might also just be a plain index with the default NULLS DISTINCT. The optimal set of constraints and indexes depends on details of your setup.
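As a sketch, those two variants might look like this (constraint and index names are my own):

-- unique constraint treating NULLs as equal (Postgres 15+)
ALTER TABLE users
ADD CONSTRAINT users_town_street_building_uni
UNIQUE NULLS NOT DISTINCT (town, street, building);

-- or instead: a plain unique index with the default NULLS DISTINCT
CREATE UNIQUE INDEX users_town_street_building_idx
ON users (town, street, building);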
Building int8range from text
I encapsulated that task in the (optional) auxiliary function texthash_int8range(text). First, the text is hashed to a bigint (int8) with hashtextextended(). text NULL is mapped to int8 NULL, which happens to be what we need.
An int8range is built with it, containing only the one value. In range types, NULL means "unbounded", so the range for NULL input becomes the unbounded range, overlapping with all. See:
Now everything falls into place for your flavor of uniqueness. The probability of a hash collision is practically zero for moderately large cardinalities, but still possible. If that's an issue, this solution is not for you.
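A sketch of what that function might look like, using the minimal standard-SQL function form mentioned in the next section (Postgres 14+; the exact body in the fiddle may differ):

CREATE FUNCTION texthash_int8range(text)
  RETURNS int8range
  LANGUAGE sql IMMUTABLE PARALLEL SAFE
RETURN int8range(hashtextextended($1, 0)
               , hashtextextended($1, 0), '[]');
-- NULL input hashes to NULL, so both bounds are NULL ("unbounded"),
-- and the result is the unbounded range (,), which overlaps everything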
Building a multicolumn exclusion constraint
Since queries will be supported by the additional (faster) B-tree index, the purpose of the exclusion constraint is mostly just to enforce your rules. So I optimized for write performance and storage footprint rather than its utility as a versatile index, and condensed the leading two columns into a single hash with hash_record_extended((town, street), 0). That makes more sense the longer typical values are. For very short text values it might not pay.
We need the equality operator for that, which is not typically useful for GiST indexes, as B-tree indexes are better at that. But for the special case of a multicolumn constraint, we need it anyway. Postgres provides the required operator class(es) with the additional module btree_gist. Install that first. See:
To optimize, I use the minimal form of an SQL-standard function. (It can be a plain SQL function, too.) See:
But it must be IMMUTABLE. And it should be PARALLEL SAFE.
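Put together, a sketch of the exclusion constraint on the question's table (the constraint name is mine; details may differ from the fiddle):

CREATE EXTENSION IF NOT EXISTS btree_gist;  -- operator class for = on the int8 hash in GiST

ALTER TABLE users
ADD CONSTRAINT users_town_street_building_excl
EXCLUDE USING gist (
  (hash_record_extended((town, street), 0)) WITH =  -- both leading columns, condensed to one hash
, (texthash_int8range(building)) WITH &&            -- NULL building -> unbounded range, overlaps all
);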