Is there a more efficient approach other than using a co-related query for solving the below problem? - Postgresql

Saim
February 2, 2023
193 views
0 votes
2 Answers

I have three database tables:

product
product_manufacturer
product_manufacturer_warranties.

The product table has a one-to-one mapping with product_manufacturer and the product_id is stored in the product_manufacturer table. The product_manufactuer table has a one-to-many mapping with the product_manufacturer_warranties table.

I need to write a query that retrieves all columns from the product table and two other columns that can be used to determine if a valid join exists for product and product_manufacturer and product_manufacturer and product_manufactuer_warranties respectively.

I have written the following co-related query that can handle the above scenario:

select product.*, pm.web_id,
        (   SELECT count(*)
            FROM product_manufacturer_warranty pmw
            WHERE pm.web_id = pmw.product_manufacturer_id)
            AS total_warranties
from product
left join product_manufacturer pm on product.web_id = pm.product_id

I wonder if there is a better or more efficient way of achieving this using SQL on the PostgreSQL server.

Answers

Do the aggregation once, and join to the result:

select product.*, 
       pm.web_id,
       pmw.total_warranties
from product
  left join product_manufacturer pm on product.web_id = pm.product_id
  left join (   
     SELECT product_manufacturer_id, count(*) as total_warranties
     FROM product_manufacturer_warranty pmw
     group by product_manufacturer_id
  ) as pmw on pm.web_id = pmw.product_manufacturer_id)

- ErwinBrandstetter
- February 3, 2023 at 6:02 pm
- 0 votes
0
Since you are only interested in bare existence, don’t run a potentially much more expensive ~~count()~~.
And while paging through the (potentially big) table, don’t compute counts for all products, which would be much more expensive, yet.

This should give you optimal performance:

Paging up:
```
SELECT p.*
     , m.product_id IS NOT NULL AS has_maufacturer
     , EXISTS (
         SELECT FROM product_manufacturer_warranty w
         WHERE  w.product_manufacturer_id = m.web_id
         ) AS has_warranties
FROM  (
   SELECT *
   FROM   product
   WHERE  product_id > $max_product_id_of_last_page
   -- more filters here?
   ORDER  BY product_id
   LIMIT  $page_size
   ) p
LEFT   JOIN product_manufacturer m ON p.web_id = m.product_id
```
The query returns exactly what you asked for:

all columns from the product table and two other columns that can be used to determine if a valid join exists for product and product_manufacturer and product_manufacturer and product_manufactuer_warranties respectively.
- First select products of interest in the subquery p
- Then left-join to product_manufacturer. Cannot multiply rows, defined as one-to-one relationship!
- Then check for bare existence with EXISTS. Substantially cheaper if there can be many related rows. See:
  - PL/pgSQL checking if a row exists
Paging down accordingly:
```
SELECT p.*
     , m.product_id IS NOT NULL AS has_maufacturer
     , EXISTS (
         SELECT FROM product_manufacturer_warranty w
         WHERE  w.product_manufacturer_id = m.web_id
         ) AS has_warranties
FROM  (
   SELECT *
   FROM   product
   WHERE  product_id < $min_product_id_of_last_page
   -- more filters here?
   ORDER  BY product_id DESC
   LIMIT  $page_size
   ) p
LEFT   JOIN product_manufacturer m ON p.web_id = m.product_id
ORDER  BY product_id;
```
Adapt to your actual sort order. This is more tricky for multiple ORDER BY expressions. Either way, don’t fall for LIMIT / OFFSET unless your table is trivially small. See:
- Optimize query with OFFSET on large table
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Is there a more efficient approach other than using a co-related query for solving the below problem? – Postgresql

Answers