
I’ve a Postgres table like this:

datetime          tenant_id  orders_today
2023-06-25 10:00  tenant2    2
2023-06-25 10:00  tenant1    1
2023-06-25 11:00  tenant1    5
2023-06-25 11:00  tenant2    2
2023-06-25 12:00  tenant1    5

Note that an orders_today value for tenant2 hasn’t been generated yet for 12:00.

I use a query like this to summarize today’s orders:

SELECT datetime, SUM(orders_today)
FROM orders
GROUP BY datetime

But this gives me this result:

datetime          sum
2023-06-25 10:00  3
2023-06-25 11:00  7
2023-06-25 12:00  5

How can I make it ignore the group for time 12 where a count for tenant 2 is missing? And, if possible, can I make it use the previous value for tenant 2 from time 11?

2 Answers


  1. If I understand correctly, you want to exclude any datetime group in which a tenant is missing. If so, you can use a having clause requiring that the number of distinct tenants in each datetime group equals the total number of distinct tenants in the table (mytable stands for your table here):

    select datetime, sum(orders_today)
    from mytable
    group by datetime
    having count(distinct tenant_id) = (select count(distinct tenant_id) from mytable)
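
To check this against the sample data from the question, here is a minimal sketch that loads the five rows into an in-memory SQLite database through Python's sqlite3 module and runs the same query. SQLite standing in for Postgres, the Python harness, and the added order by (there only to make the output order deterministic) are assumptions for illustration, not part of the answer.

```python
import sqlite3

# Hypothetical harness: in-memory SQLite as a stand-in for Postgres.
# The table is called mytable to match the query in the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (datetime TEXT, tenant_id TEXT, orders_today INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2023-06-25 10:00", "tenant2", 2),
        ("2023-06-25 10:00", "tenant1", 1),
        ("2023-06-25 11:00", "tenant1", 5),
        ("2023-06-25 11:00", "tenant2", 2),
        ("2023-06-25 12:00", "tenant1", 5),  # tenant2 has no 12:00 row yet
    ],
)

# Same query as in the answer, plus ORDER BY for a stable row order.
complete_hours = conn.execute(
    """
    select datetime, sum(orders_today)
    from mytable
    group by datetime
    having count(distinct tenant_id) = (select count(distinct tenant_id) from mytable)
    order by datetime
    """
).fetchall()

for row in complete_hours:
    print(row)
# ('2023-06-25 10:00', 3)
# ('2023-06-25 11:00', 7)
```

The 12:00 group is dropped because it holds only one of the two tenants found in the table. Since the subquery counts tenants over the whole table, any tenant with at least one row anywhere is expected in every group.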
    
  2. “Can I make it use the previous value for tenant 2 from time 11?”

    Assuming:

    • (datetime, tenant_id) is the PRIMARY KEY. So: UNIQUE and both columns NOT NULL.
    • You want results for a given time range.
    • You want results for a given set of tenants (‘tenant1’ and ‘tenant2’ in the example).
    • One row for every full hour in the time range is expected.
    • Substitute for missing rows with the value for the same tenant from the previous hour.
    • If we can’t get orders for each tenant this way, omit the row.

    Generate all combinations of interest from tenant & timestamp, LEFT JOIN to your table, LEFT JOIN to a possible substitute, then proceed:

    SELECT datetime, sum(orders) AS sum_orders_today
    FROM  (
       SELECT d.datetime, COALESCE(o.orders_today, o1.orders_today) AS orders
       FROM   generate_series(timestamp '2023-06-25 10:00'
                            , timestamp '2023-06-25 13:00'
                            , interval '1 hour') AS d(datetime)    -- given time range
       CROSS  JOIN (VALUES ('tenant1'), ('tenant2')) t(tenant_id)  -- given tenants
       LEFT   JOIN orders o USING (tenant_id, datetime)            -- regular data
       LEFT   JOIN orders o1 ON o1.tenant_id = t.tenant_id         -- substitutes
                            AND o1.datetime  = d.datetime - interval '1 hour'
       ) sub
    GROUP  BY 1
    HAVING count(orders) = 2  -- omit hours without full data set
    ORDER  BY 1;
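
As a rough check of the logic on the sample data, the sketch below ports the query to SQLite so it can run in memory from Python. The port is an assumption for illustration only: SQLite builds may not ship generate_series, so a recursive CTE (named hours here) produces the hourly series, strftime does the one-hour arithmetic on text timestamps, and the tenant list moves into a CTE named tenants. The Postgres query above remains the one to run against a real Postgres table.

```python
import sqlite3

# Hypothetical harness: in-memory SQLite as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        datetime     TEXT    NOT NULL,
        tenant_id    TEXT    NOT NULL,
        orders_today INTEGER NOT NULL,
        PRIMARY KEY (datetime, tenant_id)  -- matches the first assumption
    )
    """
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2023-06-25 10:00", "tenant2", 2),
        ("2023-06-25 10:00", "tenant1", 1),
        ("2023-06-25 11:00", "tenant1", 5),
        ("2023-06-25 11:00", "tenant2", 2),
        ("2023-06-25 12:00", "tenant1", 5),  # tenant2 has no 12:00 row yet
    ],
)

# SQLite port of the query: a recursive CTE replaces generate_series().
filled_hours = conn.execute(
    """
    WITH RECURSIVE hours(datetime) AS (          -- given time range
        SELECT '2023-06-25 10:00'
        UNION ALL
        SELECT strftime('%Y-%m-%d %H:%M', datetime, '+1 hour')
        FROM   hours
        WHERE  datetime < '2023-06-25 13:00'
    ),
    tenants(tenant_id) AS (                      -- given tenants
        VALUES ('tenant1'), ('tenant2')
    )
    SELECT h.datetime,
           SUM(COALESCE(o.orders_today, o1.orders_today)) AS sum_orders_today
    FROM   hours h
    CROSS  JOIN tenants t
    LEFT   JOIN orders o  ON o.tenant_id = t.tenant_id     -- regular data
                         AND o.datetime  = h.datetime
    LEFT   JOIN orders o1 ON o1.tenant_id = t.tenant_id    -- substitutes
                         AND o1.datetime  = strftime('%Y-%m-%d %H:%M',
                                                     h.datetime, '-1 hour')
    GROUP  BY h.datetime
    HAVING COUNT(COALESCE(o.orders_today, o1.orders_today)) = 2
    ORDER  BY h.datetime
    """
).fetchall()

for row in filled_hours:
    print(row)
# ('2023-06-25 10:00', 3)
# ('2023-06-25 11:00', 7)
# ('2023-06-25 12:00', 7)
```

At 12:00 the missing tenant2 row is filled with its 11:00 value (5 + 2 = 7). The 13:00 slot is omitted because tenant2 has neither a 13:00 row nor a 12:00 row to fall back on.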
    


    If my assumptions don’t hold, you have to be more specific.
