Counting rows from multiple tables in a single query in PostgreSQL

vinnylinux
September 3, 2023
276 views
0 votes
2 Answers

I’ve been trying to make a query more efficient by bringing together counts from multiple tables. I have a schema similar to this:

create table "impressions" (
    id serial primary key,
    ip inet,
    ad_id integer references ad(id),
    user uuid references user(id),
    created_at timestamp default now()
)

create table "purchases" (
    id serial primary key,
    ip inet,
    ad_id integer references ad(id),
    user uuid references user(id),
    // ... few other columns
    created_at timestamp default now()
)

create table "clicks" (
    id serial primary key,
    ip inet,
    ad_id integer references ad(id),
    user uuid references user(id),
    created_at timestamp default now()
)

And I’ve been trying to get a count(*) of impressions, purchases and clicks grouped by user. And later by ad_id. I’m currently doing 3 separate queries to the database and aggregating in application code, but wondered if there is a more efficient way of doing this.

Answers

- ErwinBrandstetter
- September 2, 2023 at 3:13 am
- 0 votes
0
You can combine all three counts into a single query to make it a single round-trip to the DB server, which saves some overhead, and reduces I/O.

OTOH, joining rows adds some cost (depending on the number of distinct users). But if you are going to do that anyway, it’s typically much cheaper in Postgres than in the client. Postgres is optimized for the task and can work with original data types.
```
SELECT u.id, i.impressions_count, p.purchases_count, c.clicks_count
FROM   "user" u
LEFT   JOIN (
   SELECT "user" AS id, count(*) AS impressions_count
   FROM   impressions
   GROUP  BY 1
   ) i USING (id)
LEFT   JOIN (
   SELECT "user" AS id, count(*) AS purchases_count
   FROM   purchases
   GROUP  BY 1
   ) p USING (id)
LEFT   JOIN (
   SELECT "user" AS id, count(*) AS clicks_count
   FROM   clicks
   GROUP  BY 1
   ) c USING (id);
```
By basing the query on the "user" table, users without any entries in the three subsidiary tables are listed, too. If you don’t want those, combine the subqueries with FULL [OUTER] JOIN instead:
```
SELECT id, i.impressions_count, p.purchases_count, c.clicks_count
FROM  (
   SELECT "user" AS id, count(*) AS impressions_count
   FROM   impressions
   GROUP  BY 1
   ) i
FULL   JOIN (
   SELECT "user" AS id, count(*) AS purchases_count
   FROM   purchases
   GROUP  BY 1
   ) p USING (id)
FULL   JOIN (
   SELECT "user" AS id, count(*) AS clicks_count
   FROM   clicks
   GROUP  BY 1
   ) c USING (id)
```
Either query reports null where no rows are found. You may want to convert that to 0 with COALESCE(i.impressions_count, 0) AS impressions_count etc.

If table rows are wide (and your tables are vacuum’ed enough) you can gain performance from an index ("user") on each subsidiary table. You probably have that anyway.

Aside: "user" is a fully reserved word in SQL. I’d rather not use it as identifier, even if that’s allowed with double-quotes. Too error-prone.
Login or Signup to reply.

- V233race
- September 3, 2023 at 8:10 am
- 0 votes
0
Maybe I’ve misunderstood, but is this what you want?
```
WITH cte AS
(
  SELECT
    _user, COUNT(_user) AS cnt
  FROM impressions
  GROUP BY _user
  UNION ALL
  SELECT
    _user, COUNT(_user)
  FROM purchases  
  GROUP BY _user
  UNION ALL
  SELECT
    _user, COUNT(_user)
  FROM clicks
  GROUP BY _user
)
SELECT _user, SUM(cnt) 
FROM cte
GROUP BY _user;
```
See https://dbfiddle.uk/yGX0Pv0k
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.