Postgresql - Looking for a more performant query

user3758745
August 15, 2023
127 views
0 votes
2 Answers

I have 2 tables with flex one-to-many relationship, meaning the parent side may not have any child:

Parent_table

id   dim1   quantity    price
1    "abc"     1         1.00

child_table:

id       parent_id
1          1
2          1

The parent query is very straight forward:

select dim1, count(quantity), sum(price) from parent_table group by dim1

Now, I want to also add another measure call "child_count" to the above query. By doing this, I get a very good performant query but naturally, it double counts the parent columns due to the join:

select p.dim1, count(p.quantity), sum(p.price), count(distinct c.id) as child_count
from parent_table p
  left outer join child_table c on c.parent_id = p.id
group by dim1

I believe the below query is exactly what I want, but these tables are extremely large and I cannot finish the query to verify:

select p.dim1, count(p.quantity), sum(p.price), sum(child_count) as child_count
from parent_table p
  left outer join (select c.parent_id, count(c.id) child_count from child_table c) child on p.id = child.parent_id
group by dim1

Do you have or do you know if Postgres has any smart way to achieve the query I wanted without doing a nested query to count the child records for each parent?

Answers

- Bohemian
- August 15, 2023 at 1:56 am
- 0 votes
0
Delete distinct from your query. This yields the same result, but is faster:
```
select dim1, count(p.quantity), sum(p.price), count(c.id) as child_count
from parent_table p
left join child_table c on c.parent_id = p.id
group by dim1
```
Being the PK of child_table, child_table.id is already unique so applying distinct will slow your query down for no benefit.
Login or Signup to reply.

- ErwinBrandstetter
- August 15, 2023 at 2:54 am
- 0 votes
0
Going out on a limb, you probably want this query:
```
SELECT p.dim1
     , sum(p.quantity) AS sum_quantity
     , sum(p.price) AS sum_price
     , COALESCE(sum(child_count), 0) AS sum_child_count
FROM   parent_table p
LEFT   JOIN (
   SELECT parent_id AS id, count(*) AS child_count
   FROM   child_table
   GROUP  BY 1
   ) c USING (id)
GROUP  BY dim1;
```
Filling in experience with this kind of questions for missing information.

The point being: aggregate rows in the child table before joining to the parent table. See:
- Query with LEFT JOIN not returning rows for count of 0
Throwing in COALESCE to convert possible null values.

Also converted to count(quantity) to sum(p.quantity) on suspicion . If quantity is always 1, use count(*) instead. And drop the column quantity.

An index on child_table.parent_id would help performance substantially – if you get an index-only scan out of it.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Looking for a more performant query

Answers