skip to Main Content

I have three database tables:

  • product
  • product_manufacturer
  • product_manufacturer_warranties.

The product table has a one-to-one mapping with product_manufacturer and the product_id is stored in the product_manufacturer table. The product_manufactuer table has a one-to-many mapping with the product_manufacturer_warranties table.

I need to write a query that retrieves all columns from the product table and two other columns that can be used to determine if a valid join exists for product and product_manufacturer and product_manufacturer and product_manufactuer_warranties respectively.

I have written the following co-related query that can handle the above scenario:

select product.*, pm.web_id,
        (   SELECT count(*)
            FROM product_manufacturer_warranty pmw
            WHERE pm.web_id = pmw.product_manufacturer_id)
            AS total_warranties
from product
left join product_manufacturer pm on product.web_id = pm.product_id

I wonder if there is a better or more efficient way of achieving this using SQL on the PostgreSQL server.

2

Answers


  1. Do the aggregation once, and join to the result:

    select product.*, 
           pm.web_id,
           pmw.total_warranties
    from product
      left join product_manufacturer pm on product.web_id = pm.product_id
      left join (   
         SELECT product_manufacturer_id, count(*) as total_warranties
         FROM product_manufacturer_warranty pmw
         group by product_manufacturer_id
      ) as pmw on pm.web_id = pmw.product_manufacturer_id)
    
    Login or Signup to reply.
  2. Since you are only interested in bare existence, don’t run a potentially much more expensive count().
    And while paging through the (potentially big) table, don’t compute counts for all products, which would be much more expensive, yet.

    This should give you optimal performance:

    Paging up:

    SELECT p.*
         , m.product_id IS NOT NULL AS has_maufacturer
         , EXISTS (
             SELECT FROM product_manufacturer_warranty w
             WHERE  w.product_manufacturer_id = m.web_id
             ) AS has_warranties
    FROM  (
       SELECT *
       FROM   product
       WHERE  product_id > $max_product_id_of_last_page
       -- more filters here?
       ORDER  BY product_id
       LIMIT  $page_size
       ) p
    LEFT   JOIN product_manufacturer m ON p.web_id = m.product_id
    

    The query returns exactly what you asked for:

    all columns from the product table and two other columns that can be used to determine if a valid join exists for product and product_manufacturer and product_manufacturer and product_manufactuer_warranties respectively.

    • First select products of interest in the subquery p
    • Then left-join to product_manufacturer. Cannot multiply rows, defined as one-to-one relationship!
    • Then check for bare existence with EXISTS. Substantially cheaper if there can be many related rows. See:

    Paging down accordingly:

    SELECT p.*
         , m.product_id IS NOT NULL AS has_maufacturer
         , EXISTS (
             SELECT FROM product_manufacturer_warranty w
             WHERE  w.product_manufacturer_id = m.web_id
             ) AS has_warranties
    FROM  (
       SELECT *
       FROM   product
       WHERE  product_id < $min_product_id_of_last_page
       -- more filters here?
       ORDER  BY product_id DESC
       LIMIT  $page_size
       ) p
    LEFT   JOIN product_manufacturer m ON p.web_id = m.product_id
    ORDER  BY product_id;
    

    Adapt to your actual sort order. This is more tricky for multiple ORDER BY expressions. Either way, don’t fall for LIMIT / OFFSET unless your table is trivially small. See:

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search