skip to Main Content

I’m trying to create a summary report of sales per year with a couple of tables like the following:

products
id (PK)
[… ]
Orders
id (PK)
date
product_id (FK to products

I’ve written a query where I select from a time series built from the oldest to the most recent orders and then left joined that with the two tables, grouping by product_id and year timestamp:

SELECT 
    date_trunc('year'::text, years) AS year,
    products.id as product_id,
    coalesce(count(o.id),0) AS total_orders
   FROM generate_series(
       (select MIN(orders.date) from orders),
       (select MAX(orders.date) from orders),
       '1 year'::interval
   ) years
     left JOIN orders o ON date_trunc('year'::text, years) = date_trunc('year'::text, o.date)
     join products on products.id = o.product_id
     
  GROUP BY year, products.id
  ORDER BY year, products.id

The result of the query looks like this:

year product_id total_orders
2012-01-01 1 4
2013-01-01 1 3
2013-01-01 2 1

Due to the left join, I was expecting to see another row with coalescing the null results to zero:

year product_id total_orders
2012-01-01 2 0

Any ideas as to why the left join on the time series is not being returned as a row pointing at zero total_orders when no rows exist in orders for the given year?

2

Answers


  1. If you want to get all years since the earliest one, and for each of them, all possible products with total_orders for each. Demo at db<>fiddle:

    SELECT 
     years.year  AS year,
     products.id AS product_id,
     count(o.id) AS total_orders
    FROM generate_series(
        (select date_trunc('year', MIN(orders.date)) from orders),
        (select MAX(orders.date) from orders),
        '1 year') AS years(year)
      JOIN (select distinct id from products) AS products ON true
      LEFT JOIN orders o ON years.year = date_trunc('year',o.date)
                         AND products.id = o.product_id
    GROUP BY year, products.id
    ORDER BY year, products.id
    
    1. There’s no need to coalesce(count(o.id),0).
    2. If you date_trunc() in generate_series(), you don’t need to repeat it elsewhere.

    I think you might be reading your joins wrong. For your expected output to appear, this:

    years LEFT JOIN orders ... INNER JOIN products ...
    

    Would have to be evaluated like this:

    years LEFT JOIN (orders ... INNER JOIN products ...)
    

    Meanwhile, it’s evaluated sequentially, left-to-right, like so:

    (years LEFT JOIN orders ...) INNER JOIN products ...
    

    And it’s outlined in the SELECT doc along with a hint on how you can force alternative behaviour:

    Use parentheses if necessary to determine the order of nesting. In the absence of parentheses, JOINs nest left-to-right.

    Login or Signup to reply.
  2. See example. I think, group orders by product_id and year can be do on table orders before join to other tables, assuming orders table is largest table and has indexes on product_id and orders.date.

    SELECT 
         years
        ,products.id as product_id
        ,coalesce(o.total_orders,0) AS total_orders
    FROM products
    LEFT JOIN generate_series
       (
           cast(date_part('year',(select MIN(orders.date) from orders)) as int),
           cast(date_part('year',(select MAX(orders.date) from orders)) as int),
           1
       ) years on 1=1
    LEFT JOIN 
      (
         select product_id,date_part('year', orders.date) as order_years
                   ,count(*) total_orders
         from orders
         group by product_id,date_part('year', orders.date)
      ) o
      ON years = o.order_years and o.product_id=products.id
    ORDER BY years, products.id
    

    Let’s take a small example of data

    create table products (id int,ProductName varchar(20));
    insert into products values
     (1,'apple')
    ,(2,'banana')
    ,(3,'lemon')
    ;
    create table orders (id int, date date,product_id int);
    insert  into orders values
     (1,'2012-02-28',1)
    ,(2,'2012-03-28',2)
    ,(3,'2013-03-28',3)
    ;
    

    The basic query is the report framework – joins years and products

    years product_id
    2012 1
    2012 2
    2012 3
    2013 1
    2013 2
    2013 3

    Subquery, grouping orders

    select product_id,date_part('year', orders.date) as order_years
                   ,count(*) total_orders
    from orders
    group by product_id,date_part('year', orders.date)
    

    has result

    product_id order_years total_orders
    1 2012 1
    2 2012 1
    3 2013 1

    Join for above queries has result

    years product_id total_orders
    2012 1 1
    2012 2 1
    2012 3 null
    2013 1 null
    2013 2 null
    2013 3 1

    as expected, we have null’s in some positions – there use coalesce

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search