skip to Main Content

In the application I will be processing data from a very large table in MySQL 5.7. I will be retrieving data about 2000 rows at a time.

The rows are linked with each other using group_id column – up to 3 rows in a group. So when I page over data I don’t want to break the grouping, because the data does not make sense unless I have all the rows from the group.

id   group_id   data
---------------------------
1    25         item data
2    26         item data
3    26         item data
4    27         item data
5    27         item data
6    27         item data
7    29         item data
8    33         item data
9    33         item data

And I would retrieve data using a query similar to this.

select id, grouping_id, data
from table
limit <start>, 2000

Any thoughts on how I can page over data while retrieving whole groups?

2

Answers


  1. Chosen as BEST ANSWER

    What I ended up doing is paging over distinct group_id values and doing self join to get rows in each selected group.

    select t.id, t.group_id, t.data from table t
    right join (
        select group_id 
        from table
        group by group_id -- or select distinct
        limit <start>, 1000
    ) t_
    on t.group_id = t_.group_id;
    

    This way I am paging over group_ids instead of just rows.

    Given that each group can contain up to 3 rows, in above example the number of rows retrieved per page would range from 1000 (if each group contains 1 row) up to maximum of 3000 rows (if each group contains 3 rows).


  2. Let see how we can solve this :

    • ROW_NUMBER() we can find row/ count row number by group id
    • PARTITION BY used for to reset above row number and count for new one
    • WITH, common table expression we will select form group data RowWithChunk
    
    WITH RowWithChunk AS (
        SELECT
            group_id,
            data,
            ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY group_id) AS row_num
        FROM
            table
    )
    SELECT
        group_id,
        data,
        row_num
    FROM
        RowWithChunk
    WHERE
        row_num <= 3
    
    order by group_id
    
    limit 0, pageSize;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search