Given the following table mileage, I need to compute two sums for each vehicle_id: one over all entries for miles, and another over miles only where is_processed = false.
mileage
vehicle_id | miles | is_processed | created_at |
---|---|---|---|
773 | 800 | f | 2023-04-17 13:11:59.100846 |
767 | 200 | f | 2023-04-24 11:29:52.503024 |
767 | 100 | t | 2023-04-11 15:14:33.125348 |
773 | 200 | t | 2023-04-11 15:49:34.313639 |
767 | 150 | f | 2023-04-24 11:43:31.496871 |
This would be simple enough with one of two methods:
Method 1
select vehicle_id,
       sum(miles) as total,
       (select sum(miles)
        from mileage t2
        where t1.vehicle_id = t2.vehicle_id
          and is_processed = false) as extra
from mileage t1
group by vehicle_id;
Method 2
Fetch all records and compute both sums with a single for loop in my language of choice.
Desired solution
However, I want to know if there is a way to do all of this in PostgreSQL in a single pass. I already tried window functions and could not figure it out, and grouping with a subquery (as in Method 1) would involve a second scan of the same table, which will hurt performance given that I expect to process hundreds of thousands of rows.
Any help is appreciated. Thanks!
2 Answers
Try a "conditional aggregate", which is really just a case expression placed inside the aggregate function.
Keep in mind that the case expression evaluates to a value on each row, so the aggregate is still just summing values.
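The query for this answer did not survive in the text above, so here is a minimal sketch of what a conditional aggregate could look like against the mileage table from the question. The column aliases total and extra are borrowed from Method 1, and the order by is only added to make the output order predictable.

```sql
-- Conditional aggregate: one scan of mileage, no subquery.
select vehicle_id,
       sum(miles) as total,
       -- the case expression yields miles for unprocessed rows and 0 otherwise,
       -- so this sum only counts rows where is_processed = false
       sum(case when is_processed = false then miles else 0 end) as extra
from mileage
group by vehicle_id
order by vehicle_id;
```

With the sample rows from the question this should give total = 450, extra = 350 for vehicle 767 and total = 1000, extra = 800 for vehicle 773. Note that because of the else 0, a vehicle with no unprocessed rows gets 0 here, whereas the subquery in Method 1 would return null for it.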
You do not need a subquery. Use the aggregate with a FILTER clause.
Read more in the docs: Aggregate Expressions
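As a rough sketch of the filter approach, assuming the mileage table from the question and PostgreSQL 9.4 or later (where the FILTER clause is available), the query could look like this. The aliases total and extra follow Method 1.

```sql
-- Aggregate with FILTER: one scan of mileage, no subquery.
select vehicle_id,
       sum(miles) as total,
       -- only rows that satisfy the filter condition are fed to this sum
       sum(miles) filter (where not is_processed) as extra
from mileage
group by vehicle_id
order by vehicle_id;
```

For the sample data this should return 450 and 350 for vehicle 767, and 1000 and 800 for vehicle 773. If no row in a group passes the filter, extra comes back as null, the same as Method 1; wrapping the expression in coalesce(..., 0) would turn that into 0 if that is preferred.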