There are numerous SQL questions and answers about finding gaps in sequences, but I could not find any that have the wrinkle my situation has: I need to group the sequence by another column.
My table (simplified)
id | parent_id | list_index
1 | 987 | 0
2 | 987 | 1
3 | 987 | 2
4 | 654 | 0
5 | 654 | 2
7 | 321 | 1
8 | 321 | 2
9 | 321 | 4
10 | 321 | 5
I added the empty lines to visualize the grouping.
The business rule is that for any parent_id
there should be a set of rows with list_index
values in a contiguous sequence starting at 0
. However, the data has gaps and it’s those gaps I need to find. Specifically, I need to identify the parent_id
s that have a gap.
In the sample data above, I would need the query to identify parent ids 654
and 321
because 654
is missing a row with list_index
1 and parent 321
is missing rows with list_index
0 and 3. Parent id 987
should not be included because it’s rows have a contiguous sequence.
I do not care what the gap looks like (eg, where it starts or ends), just which parentid
s are missing 1 or more rows.
3
Answers
Gaps in your case can be considered separately for each group (parent_id).
Then the task becomes a regular gap-finding task again.
See examples.
Output is
We can use math and partitioning to solve this.
DBFiddleUk Demo
Premise: use a window function Row Number() and order by your list index and identify records where the row_number assigned-1 isn’t equal to the list index.
CTE- your data
CTE2- your data with a row_numer assigned
Final query against CTE2 gives you the group and a count of records which don’t match.
Giving us: