I have two tables, lets call them t1 and t2. Fiddle
select setseed(.42);
create table t1(a,b,c,d)as
select (random()*9)::int
, (random()*9)::int
, (random()*9)::int
, (random()*9)::int
from generate_series(1,100);
create table t2(a,b,c,d)as
select (random()*9)::int
, (random()*9)::int
, (random()*9)::int
, (random()*9)::int
from generate_series(1,200);
I’m trying to filter t2 and count based on my results from a t1 selected query. This would be an inefficient working example (takes half an hour to finish):
SELECT *,
SELECT
(SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A) AS cnt1,
SELECT
(SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A AND t2.B = T1.B) AS cnt2,
SELECT
(SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A AND t2.B = T1.B AND t2.C = T1.C) AS cnt3,
SELECT
(SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A AND t2.B = T1.B AND t2.C = T1.C AND t2.D = T1.D) AS cnt4,
... and so on ...
FROM t1 WHERE A>B AND 1.5>C AND D-E>A
And this is what I’m trying to do using CTE:
SELECT *,
(WITH CTE1 AS (SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A),
CTE2 AS (SELECT COUNT (*) FROM CTE1 WHERE t2.B = T1.B),
CTE3 AS (SELECT COUNT (*) FROM CTE2 WHERE t2.C = T1.C),
CTE4 AS (SELECT COUNT (*) FROM CTE3 WHERE t2.D = T1.D)
SELECT
(SELECT COUNT (*) FROM CTE) AS cnt1,
SELECT
(SELECT COUNT (*) FROM CTE2) AS cnt2,
SELECT
(SELECT COUNT (*) FROM CTE3) AS cnt3,
SELECT
(SELECT COUNT (*) FROM CTE3) AS cnt4
)
FROM t1 WHERE A>B AND 1.5>C AND D-E>A
This only allows to return a single result, ex. cnt1. My guess is that maybe I can pass the result as an array like the example above, but I’m not sure how to achieve this.
SELECT *,
(WITH CTE1 AS (SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A),
CTE2 AS (SELECT COUNT (*) FROM CTE1 WHERE t2.B = T1.B),
CTE3 AS (SELECT COUNT (*) FROM CTE2 WHERE t2.C = T1.C),
CTE4 AS (SELECT COUNT (*) FROM CTE3 WHERE t2.D = T1.D)
ARRAY (SELECT
(SELECT COUNT (*) FROM CTE) ,
SELECT
(SELECT COUNT (*) FROM CTE2),
SELECT
(SELECT COUNT (*) FROM CTE3),
SELECT
(SELECT COUNT (*) FROM CTE3))
) AS CNT[]
FROM t1 WHERE A>B AND 1.5>C AND D-E>A
EDIT: This would be a simplified example of my tables and expected
result.create table t1(id,a,b,c) as values(1,2,1,1), (2,3,2,2), (3,1,3,3), (4,2,4,4), (5,3,5,5), (6,1,2,6), (7,2,2,7), (8,3,2,8), (9,4,2,9), (10,5,4,1); create table t2(id,a,b,c) as values(1,2,1,1), (2,3,2,2), (3,1,3,1), (4,2,4,4), (5,3,5,5), (6,1,2,6), (7,2,2,7), (8,3,2,8), (9,4,3,9), (10,5,4,1), (11,3,2,2), (12,4,4,4), (13,5,6,5), (14,3,2,3), (15,2,2,7), (16,1,2,8), (17,4,1,9), (18,5,2,5), (19,6,3,6), (20,7,4,7);
so updating my faulty query for this example:
SELECT *, (WITH CTE1 AS (SELECT COUNT (*) FROM t2 WHERE t2.A = T1.A), CTE2 AS (SELECT COUNT (*) FROM CTE1 WHERE t2.B = T1.B), CTE3 AS (SELECT COUNT (*) FROM CTE2 WHERE t2.C = T1.C) SELECT (SELECT COUNT (*) FROM CTE) AS cnt1, SELECT (SELECT COUNT (*) FROM CTE2) AS cnt2, SELECT (SELECT COUNT (*) FROM CTE3) AS cnt3 ) FROM t1 WHERE A>B AND C>1.5
2
Answers
The aggregate
filter
clause lets you do that in a single flat query, in a single pass.demo at db<>fiddle
If you’re fine with getting that in rows, not columns, you can make those matches/mismatches your
grouping sets
or arollup
.The use of
grouping()
helps if you want to skip/hide the individual columns.Having
is there to hide the counts of mismatches.Note that if the columns are nullable,
having
would have to change andgrouping()
would be necessary to tell the difference between a grouping set where the column isn’t involved and one where the comparison actually yieldednull
.Refering to your last sample data with expected result (the result seems to be wrong) maybe you could use something like below:
1st – filter t1 to just rows of interest (id 2, 8 and 9)
… then cross join t2 with filtered data and use conditional aggregation to fetch the counts …
R e s u l t :
fiddle
… with randomly generated data it would be like in the fiddle