The following SQL works well; it gives me the result I want for the counts and percentage: (please use EXTRACT (YEAR|MONTH|DAY …) where required):
SELECT YEAR(evt_date) AS yr, MONTH(evt_date) AS mth, DAY(evt_date) AS dy, COUNT(*) AS cnt,
ROUND(COUNT(*)*100 / (SUM(COUNT(*)) OVER (PARTITION BY YEAR(evt_date), MONTH(evt_date))), 2) AS prct
FROM my_data
WHERE evt_date >= '2023-11-01'
GROUP BY yr, mth, dy
ORDER BY 1,2,3
;
But if I introduce the ROLLUP in, I get the cnt totals fine but not the totals for percentages where the window expression is used:
SELECT YEAR(evt_date) AS yr, MONTH(evt_date) AS mth, DAY(evt_date) AS dy, COUNT(*) AS cnt,
ROUND(COUNT(*)*100 / (SUM(COUNT(*)) OVER (PARTITION BY YEAR(evt_date), MONTH(evt_date))), 2) AS prct
FROM my_data
WHERE evt_date >= '2023-11-01'
GROUP BY yr, ROLLUP (mth, dy) -- ROLLUP does not give the the result I expect to see for the percentage column
ORDER BY 1,2,3
;
Does anybody knows why? I am using Postgresql v12.
I have this alternative solution with tricks, but it is weird the one above doesn’t work:
WITH my_data(evt_date, foo) AS
(SELECT '2023-11-01'::DATE, 'X' UNION
SELECT '2023-11-01', 'Y' UNION
SELECT '2023-11-02', 'X' UNION
SELECT '2023-11-02', 'Y' UNION
SELECT '2024-01-01', 'X' UNION
SELECT '2024-01-01', 'Y' UNION
SELECT '2024-01-01', 'Z' UNION
SELECT '2024-01-02', 'X' UNION
SELECT '2024-01-02', 'Y' UNION
SELECT '2024-01-03', 'X' UNION
SELECT '2024-01-03', 'Y' UNION
SELECT '2024-01-03', 'Z' UNION
SELECT '2024-01-03', 'W'
),
totals AS
(SELECT YEAR(evt_date) AS yr, MONTH(evt_date) AS mth, DAY(evt_date) AS dy, COUNT(*) AS cnt,
ROUND(COUNT(*)*100 / (SUM(COUNT(*)) OVER (PARTITION BY YEAR(evt_date), MONTH(evt_date))), 2) AS prct
FROM my_data
WHERE evt_date >= '2023-11-01'
GROUP BY yr, mth, dy
ORDER BY 1,2,3
)
SELECT yr, mth, dy, SUM(cnt) AS cnt, SUM(prct) AS prct --<-- Fake SUM() to be able to use ROLLUP
FROM totals
GROUP BY yr, ROLLUP (mth, dy) --- Only way? for ROLLUP to handle the percentage column well
ORDER BY 1,2,3
2
Answers
The behaviour of ROLLUP is to generate subtotals along with grand totals for the column on which it is specified, but it doesn’t work well with the window functions because Rollup introduces additional rows to the result which can affect the calculation of the window functions.
Possible Solution : Try using a subquery to calculate the total count for each group before applying the window function
See example.
First,
prct
calculation can be beforeROLLUP
.And, You calculate sum()over(partition by
yr,mth
), therefore ROLLUP can be (dy
).Output
With partition by
yr
and rollup(mth,dy
)output
I do not know what the point is here.