Postgresql - Conditions in aggregate functions - PhpOut

myfakeaccount
August 4, 2024
70 views
0 votes
2 Answers

I was solving some SQL tasks and got confused with the following thing:

CREATE TABLE data (
  len INT
);

INSERT INTO data (len)
VALUES 
  (NULL),
  (100), 
  (200),
  (300),
  (400);

suppose we’ve created such a table and we wanna count all values that are less than 200. I am using this online compiler to do it.

my first attempt gives 1 as expected

SELECT COUNT(*) FILTER (WHERE len < 200)
FROM data

then I thought whether I can simplify it and just do

SELECT COUNT(len < 200) FROM data

but it gives 4 as an answer.

So my question is: can we use logic expressions inside of aggregate functions? If the answer is NO, why?

Tags: postgresql sql

Answers

- JohnH
- August 3, 2024 at 7:39 pm
- 0 votes
0
COUNT(expression) computes the number of rows for which expression is not NULL. COUNT(*) computes the total number of rows because * is the row value, which is never NULL for any row, even if every column in a row is NULL. The < operator can return one of three results: TRUE, FALSE, or NULL. Keep in mind that NULL is not a value, it is a state. The expression COUNT(len < 200) only returns NULL when len is NULL; therefore, any valued len will contribute to the count. COUNT(NULL) will always return 0 regardless of the number of rows.

Login or Signup to reply.

- ErwinBrandstetter
- August 4, 2024 at 10:09 am
- 0 votes
0
count(*) counts all input rows regardless of their contents. Period. That’s according to the SQL standard, and does not depend on the meaning of the symbol * in other contexts.

Postgres has a separate (faster) implementation for count(*). The manual:

count ( * ) → bigint

Computes the number of input rows.

count ( "any" ) → bigint

Computes the number of input rows in which the input value is not
null.

The definition of the variant count("any") also answers why your 2nd query counts all rows with not-null len.

To actually simplify your query – and make it faster, or dramatically faster for big tables and selective criteria matching an index:
```
SELECT count(*) FROM data WHERE len < 200;
```
See:
- For absolute performance, is SUM faster or COUNT?
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.