I have a table `items` and a table `batches`. A batch can have n items associated via `items.batch_id`.

I'd like to write a query that returns item counts in two groups, batched and unbatched:
- items WHERE batch_id IS NOT NULL (batched)
- items WHERE batch_id IS NULL (unbatched)
The result should look like this:

| batched | unbatched |
|---|---|
| 1200000 | 100 |
Any help appreciated, thank you!
EDIT: I got stuck using `GROUP BY`, which turned out to be the wrong tool for the job.
2 Answers
You can use `COUNT` with a `FILTER (WHERE ...)` clause; this is called a conditional count.
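A minimal sketch of that approach, assuming the `items` table and `batch_id` column from the question:

```sql
-- FILTER restricts which rows each aggregate sees, so both counts
-- come back in a single row from one scan of items.
SELECT
    count(*) FILTER (WHERE batch_id IS NOT NULL) AS batched,
    count(*) FILTER (WHERE batch_id IS NULL)     AS unbatched
FROM items;
```

Both aggregates are computed in the same pass over the table, which is usually cheaper than running two separate counting queries.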
The `count()` function seems to be the most likely basic tool here. Given an expression, it returns a count of the number of rows where that expression evaluates to non-null. Given the argument `*`, it counts all rows in the group.

To the extent that there is a trick, it is getting the batched and unbatched counts in the same result row. There are at least three ways to do that:
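One way is a pair of scalar subqueries in the select list; a sketch, assuming the question's `items` table:

```sql
-- Each scalar subquery is evaluated independently and yields one column.
SELECT
    (SELECT count(*) FROM items WHERE batch_id IS NOT NULL) AS batched,
    (SELECT count(*) FROM items WHERE batch_id IS NULL)     AS unbatched;
```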
That’s pretty straightforward. Each subquery is executed and produces one column of the result. Because no `FROM` clause is given in the outer query, there will be exactly one result row.
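Another way is to compute both counts as window aggregates across the whole table and keep only a single row; a sketch under the same assumptions:

```sql
-- count(batch_id) skips NULLs, so it counts the batched rows; the
-- difference from count(*) is the unbatched rows. OVER () repeats the
-- totals on every row of items, and LIMIT 1 keeps just one of them.
SELECT
    count(batch_id) OVER ()                     AS batched,
    count(*) OVER () - count(batch_id) OVER ()  AS unbatched
FROM items
LIMIT 1;
```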
That will compute the `batched` and `unbatched` results for the whole table on every result row, one per row of the `items` table, but then only one result row is actually returned. It is reasonable to hope (though you would definitely want to test) that postgres doesn’t actually compute those counts for all the rows that are culled by the `limit` clause. You might, for example, compare the performance of this option with that of the previous option.

The third way is `count()` with a `FILTER` clause, as described in detail in another answer.