skip to Main Content

I am making a MySQL query of a table with thousands of records. What I’m really trying to do is find the next and previous rows that surround a particular ID. The issue is that when sorting the table in a specific way, there is no correlation between IDs (I can’t just search for id > $current_id LIMIT 1, for example, because the needed ID in the next row might or might not actually be higher. Here is an example:

ID     Name     Date
4      Fred     1999-01-04
6      Bill     2002-04-02
7      John     2002-04-02
3      Sara     2002-04-02
24     Beth     2007-09-18
1      Dawn     2007-09-18

Say I know I want the records that come directly before and after John (ID = 7). In this case, the ID of the record after that row is actually a lower number. The table is sorted by date first and then by name, but there are many entires with the same date – so I can’t just look for the next date, either. What is the best approach to find, in this case, the row before and (separately) the row after ID 7?

Thank you for any help.

3

Answers


  1. You can use a window function called ROW_NUM in this way. ROW_NUM() OVER(). This will number every row in the table consecutively. Now you search for your I’d and you also get the Row number. For example, you search for ID=7 and you get row number 35. Now you can search for row number from 34 to 36 to get rows below and above the one with I’d 7.

    Login or Signup to reply.
  2. This is what comes to mind:

    SELECT *
    FROM (
        SELECT *, ROW_NUMBER() OVER (ORDER BY `date`, `name`) AS row_num
        FROM people
    ) p1
    WHERE row_num > (SELECT row_num FROM (
        SELECT *, ROW_NUMBER() OVER (ORDER BY `date`, `name`) AS row_num
        FROM people
    ) p2 WHERE p2.id = 7)
    LIMIT 1;
    

    Using the row number window function, you can compare two view of the table with id = 7 and get the row you need. You can change the condition in the subquery to suit your needs, e.g., p2.name = 'John' and p2.date = '2002-04-02'.

    Here’s a dbfiddle demonstrating: https://www.db-fiddle.com/f/mpQBcijLFRWBBUcWa3UcFY/2

    Alternately, you can simplify the syntax a bit and avoid the redundancy using a CTE like this:

    WITH p AS (
        SELECT *, ROW_NUMBER() OVER (ORDER BY `date`, `name`) AS row_num
        FROM people
    )
    SELECT *
    FROM p
    WHERE row_num > (SELECT row_num FROM p WHERE p.id = 7)
    LIMIT 1;
    
    Login or Signup to reply.
  3. As others have suggested you can use window functions for this, but I would use LEAD() and LAG() instead of ROW_NUMBER().

    SELECT *
    FROM (
        SELECT
            *,
            LAG(ID) OVER (ORDER BY `Date` ASC, `Name` ASC) `prev`,
            LEAD(ID) OVER (ORDER BY `Date` ASC, `Name` ASC) `next`
        FROM `tbl`
    ) t
    WHERE `ID` = 7;
    

    With thousands of records (very small) this should be very fast but if you expect it to grow to hundreds of thousands, or even millions of rows you should try to limit the amount of work being done in the inner query. Sorting millions of rows and assigning prev and next values to all of them, just to use one row would be excessive.

    Assuming your example of John (ID = 7) you could use the Date to constrain the inner query. If the adjacent records would always be within one month then you could do something like –

    SELECT *
    FROM (
        SELECT
            *,
            LAG(ID) OVER (ORDER BY `Date` ASC, `Name` ASC) `prev`,
            LEAD(ID) OVER (ORDER BY `Date` ASC, `Name` ASC) `next`
        FROM `tbl`
        WHERE `Date` BETWEEN '2002-04-02' - INTERVAL 1 MONTH AND '2002-04-02' + INTERVAL 1 MONTH
    ) t
    WHERE `ID` = 7;
    

    Without knowing more detail about the distribution of your data, I am only guessing but hopefully you get the idea.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search