I have a table `items` and a table `batches`. A batch can have n items associated via `items.batch_id`.

I'd like to write a query that returns item counts in two groups, batched and unbatched:
- items WHERE batch_id IS NOT NULL (batched)
- items WHERE batch_id IS NULL (unbatched)
The result should look like this:

| batched | unbatched |
|---|---|
| 1200000 | 100 |
Any help appreciated, thank you!
EDIT: I got stuck using `GROUP BY`, which turned out to be the wrong tool for the job.
2 Answers
You can use `COUNT` with a `FILTER (WHERE ...)` clause; this is called a conditional count.
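A minimal sketch of that approach, assuming the `items` table and `batch_id` column from the question:

```sql
-- FILTER restricts which rows each aggregate sees, so both counts
-- come back in a single row from one scan of items.
SELECT
    count(*) FILTER (WHERE batch_id IS NOT NULL) AS batched,
    count(*) FILTER (WHERE batch_id IS NULL)     AS unbatched
FROM items;
```

Both aggregates are computed in the same pass over the table, which is usually cheaper than running two separate counting queries.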
The `count()` function seems to be the most likely basic tool here. Given an expression, it returns a count of the number of rows where that expression evaluates to non-null. Given the argument `*`, it counts all rows in the group.

To the extent that there is a trick, it is getting the batched and unbatched counts in the same result row. There are at least three ways to do that:
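One way is a pair of scalar subqueries in the select list; a sketch, assuming the question's `items` table:

```sql
-- Each scalar subquery is evaluated independently and yields one column.
SELECT
    (SELECT count(*) FROM items WHERE batch_id IS NOT NULL) AS batched,
    (SELECT count(*) FROM items WHERE batch_id IS NULL)     AS unbatched;
```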
That’s pretty straightforward. Each subquery is executed and produces one column of the result. Because no `FROM` clause is given in the outer query, there will be exactly one result row.
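Another way is to compute both counts as window aggregates across the whole table and keep only a single row; a sketch under the same assumptions:

```sql
-- count(batch_id) skips NULLs, so it counts the batched rows; the
-- difference from count(*) is the unbatched rows. OVER () repeats the
-- totals on every row of items, and LIMIT 1 keeps just one of them.
SELECT
    count(batch_id) OVER ()                     AS batched,
    count(*) OVER () - count(batch_id) OVER ()  AS unbatched
FROM items
LIMIT 1;
```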
That will compute the `batched` and `unbatched` results for the whole table on every result row, one per row of the `items` table, but then only one result row is actually returned. It is reasonable to hope (though you would definitely want to test) that postgres doesn’t actually compute those counts for all the rows that are culled by the `limit` clause. You might, for example, compare the performance of this option with that of the previous option.

The third way is `count()` with a `FILTER` clause, as described in detail in another answer.