In the application I will be processing data from a very large table in MySQL 5.7. I will be retrieving data about 2000 rows at a time.
The rows are linked with each other using group_id
column – up to 3 rows in a group. So when I page over data I don’t want to break the grouping, because the data does not make sense unless I have all the rows from the group.
id group_id data
---------------------------
1 25 item data
2 26 item data
3 26 item data
4 27 item data
5 27 item data
6 27 item data
7 29 item data
8 33 item data
9 33 item data
And I would retrieve data using a query similar to this.
select id, grouping_id, data
from table
limit <start>, 2000
Any thoughts on how I can page over data while retrieving whole groups?
2
Answers
What I ended up doing is paging over distinct
group_id
values and doing self join to get rows in each selected group.This way I am paging over
group_id
s instead of just rows.Given that each group can contain up to 3 rows, in above example the number of rows retrieved per page would range from 1000 (if each group contains 1 row) up to maximum of 3000 rows (if each group contains 3 rows).
Let see how we can solve this :
ROW_NUMBER()
we can find row/ count row number by group idPARTITION BY
used for to reset above row number and count for new oneWITH
, common table expression we will select form group dataRowWithChunk