
Given the following table mileage, I need to compute two sums for each vehicle_id: one over all miles entries, and another over only the miles entries where is_processed = false.

mileage

vehicle_id | miles | is_processed | created_at
-----------+-------+--------------+---------------------------
       773 |   800 | f            | 2023-04-17 13:11:59.100846
       767 |   200 | f            | 2023-04-24 11:29:52.503024
       767 |   100 | t            | 2023-04-11 15:14:33.125348
       773 |   200 | t            | 2023-04-11 15:49:34.313639
       767 |   150 | f            | 2023-04-24 11:43:31.496871

This would be simple enough with one of two methods:

Method 1

select vehicle_id,
  sum(miles) as total,
  (select sum(miles) from mileage t2 where t1.vehicle_id = t2.vehicle_id and is_processed = false) extra
from mileage t1
group by vehicle_id;

Method 2

Fetch all records and compute with a single for loop in my language of choice.
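As a rough sketch of what Method 2 could look like, assuming Python as the language of choice and the rows already fetched as plain tuples (the `rows` list and the `summarize` helper are hypothetical names, not part of the question):

```python
from collections import defaultdict

# Sample rows copied from the mileage table: (vehicle_id, miles, is_processed).
# In practice these would come from the database driver; that part is assumed.
rows = [
    (773, 800, False),
    (767, 200, False),
    (767, 100, True),
    (773, 200, True),
    (767, 150, False),
]


def summarize(rows):
    """Return {vehicle_id: (total, extra)} using a single pass over rows."""
    sums = defaultdict(lambda: [0, 0])  # [total, extra] per vehicle
    for vehicle_id, miles, is_processed in rows:
        sums[vehicle_id][0] += miles          # every row counts toward total
        if not is_processed:
            sums[vehicle_id][1] += miles      # only unprocessed rows count toward extra
    return {vid: (total, extra) for vid, (total, extra) in sums.items()}


result = summarize(rows)
print(result)
```

This keeps the work to one loop, but it still means transferring every row to the client, which is part of why a database-side solution is preferable.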

Desired solution

However, I want to know whether all of this can be done in PostgreSQL itself. I already tried window functions and could not figure it out, and grouping would involve a second scan of the same table, which will hurt performance given that I expect to process hundreds of thousands of rows.

Any help is appreciated. Thanks!

2 Answers


  1. Try a "conditional aggregate", which is really just a case expression placed inside the aggregate function.

    SELECT vehicle_id
        ,sum(miles) AS total
        ,sum(case when is_processed = false then miles else 0 end) AS extra
    FROM mileage t1
    GROUP BY vehicle_id;
    

    Keep in mind the case expression evaluates to a value on each row, so the aggregate is still just summing values.

  2. You do not need a subquery. Use the aggregate with a filter clause:

    select
        vehicle_id,
        sum(miles) as total,
        sum(miles) filter (where not is_processed) as extra
    from mileage
    group by vehicle_id;
    

    Read in the docs: Aggregate Expressions

    Login or Signup to reply.
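To sanity-check the single-scan approach against the sample rows from the question, here is a small sketch using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL purely as an assumption for convenience, so is_processed is stored as 0/1 and the case expression compares against 0 instead of false. The case form is used because the filter clause only works on newer SQLite versions; on PostgreSQL either form should give the same per-vehicle sums.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mileage (vehicle_id INTEGER, miles INTEGER, is_processed INTEGER)"
)
conn.executemany(
    "INSERT INTO mileage VALUES (?, ?, ?)",
    [
        (773, 800, 0),
        (767, 200, 0),
        (767, 100, 1),
        (773, 200, 1),
        (767, 150, 0),
    ],
)

# Conditional aggregate: one GROUP BY pass, no correlated subquery.
query = """
SELECT vehicle_id,
       sum(miles) AS total,
       sum(CASE WHEN is_processed = 0 THEN miles ELSE 0 END) AS extra
FROM mileage
GROUP BY vehicle_id
ORDER BY vehicle_id
"""
totals = conn.execute(query).fetchall()
conn.close()

for vehicle_id, total, extra in totals:
    print(vehicle_id, total, extra)
```

For the sample data this gives a total of 450 and an extra of 350 for vehicle 767, and 1000 and 800 for vehicle 773.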