I understand that this query will produce numbers and cumulative tallies:
SELECT number
, SUM(number) OVER (ORDER BY number) cumulative_number
FROM (SELECT 1 number UNION SELECT 2 UNION SELECT 3) inside
GROUP BY number
Result:
number | cumulative_number |
---|---|
1 | 1 |
2 | 3 |
3 | 6 |
And I sort of get how this function works. There are partitions constructed by my GROUP BY
clause "GROUP BY number
". My partitions are number
= 1, number
= 2 and number
= 3. And for example in the number
= 2 case, somehow the SUM()
function is magically accessing rows outside its partition which is somehow designated by my window_spec
order_clause
"ORBER BY number
".
👀 But what I don’t understand is how the MySQL documentation of ORDER BY in this context has anything to do with anything.
order_clause: An ORDER BY clause indicates how to sort rows in each partition. Partition rows that are equal according to the ORDER BY clause are considered peers. If ORDER BY is omitted, partition rows are unordered, with no processing order implied, and all partition rows are peers.
A partition is a set. Sets are unsorted according to first-grade math. So sorting something inside a partition is already tripping my BS detector. Even if you tell me that sets are ordered, the sum function I learned in first grade certainly does not care what order its operands are sorted.
Extending this misunderstanding to another concrete example:
SELECT ROUND(number, -1) tens
, SUM(COUNT(*)) OVER (ORDER BY number) cumulative_number
FROM (SELECT 1 number UNION SELECT 2 UNION SELECT 11 UNION SELECT 12) inside
GROUP BY tens
Assuming that MySQL documentation means something useful and is not just wrong words. Let’s consider how ORDER BY
applies here by number
s. The 11 does not equal the 12, they are different and so an ORDER BY
differentiates them. They are both in the tens
= 10 partition. So magically somehow the SUM()
function needs to pick the sum of 3 or the sum of 4 expression result from the 11 or 12 row from inside the tens
= 10 partition.
👀 How does it pick the result it does (spoiler: it picks the sum of 4)?
And also…
👀 What fundamental wisdom in the MySQL documentation am I failing to appreciate, or how is that documentation deficient?
2
Answers
In the query:
Since you didn’t use a
PARTITION BY
clause in the window function, the window function has a single partition. That is, it considers all the resulting rows of the aggregation as a single partition.The aggregation returns three rows:
1
2
3
Then, the window function
SUM()
computes the running sum of it (1, 3, 6). There’s no magic in it.Window functions allow to show individual values and aggregate values at same time. Maybe it is the
GROUP BY number
that is confusing you.The query:
Produces the same result as the query:
The wndow function treats all rows as one partition (unless you use
PARTITION BY
). It does aggregation for each row. TheSUM(number) OVER (ORDER BY number)
applies to all rows. As the query has only one normal column and a window function, theGROUP BY
will do nothing.