I have two branches where I am using the exactly same PostgreSQL query but the group by clause becomes invalid for some reason.
I have changed the primary key from id
to composite (tenant_id, id)
.
The old documents table
Documents
Table "public.documents"
Column | Type | Collation | Nullable | Default
--------------------------------+-----------------------------+-----------+----------+---------------------------------------
id | integer | | not null | nextval('documents_id_seq'::regclass)
user_id | integer | | |
Indexes:
"documents_pkey" PRIMARY KEY, btree (id)
"index_documents_on_user_id" btree (user_id)
The new documents table
Documents
Column | Type | Collation | Nullable | Default
--------------------------------+-----------------------------+-----------+----------+---------------------------------------
id | integer | | not null | nextval('documents_id_seq'::regclass)
user_id | integer | | |
tenant_id | bigint | | not null |
Indexes:
"documents_pkey" PRIMARY KEY, btree (tenant_id, id)
"index_documents_on_user_id" btree (user_id)
"index_documents_on_tenant_id_and_id" UNIQUE, btree (tenant_id, id)
Foreign-key constraints:
"fk_rails_5ca55da786" FOREIGN KEY (tenant_id) REFERENCES tenants(id)
Now, for some reason, my SQL query is no longer valid on the new branch – I have no idea why? How does the grouping work? Why can I no longer use this query as it was before?
Here is my SQL
SELECT
"documents".*
FROM
"documents"
GROUP BY
"documents"."id"
(the error I am getting on the new branch)
ERROR: column "documents.user_id" must appear in the GROUP BY clause or be used in an aggregate function
2
Answers
Comments are incorrect, the solution is to do this.
There is no reason to group by
user_id
, for some reason, all the columns from the composite key need to be present in the group by clause.Normally, all columns that appear in the
SELECT
list outside of aggregate functions must be in theGROUP BY
clause. But if the primary key is inGROUP BY
, all the other columns of that table are implied and don’t need to be listed. So as soon as the column was no longer the primary key, you had to list the other one too.This was introduced in PostgreSQL commit e49ae8d3bc588294d07ce1a1272b31718cfca5ef, see the discussion here.