I have a table that contains students publications like this
id | student |
---|---|
1 | john |
2 | anthony |
3 | steven |
4 | lucille |
5 | anthony |
6 | steven |
7 | john |
8 | lucille |
9 | john |
10 | anthony |
11 | steven |
12 | lucille |
13 | john |
so the idea is about to have a query that fetchs all ordered occurences of a determinated student names
context :
answer to the question : how many times John is publishing just after Anthony (who is publishing just after Steven …) and get id of each occurence
example :
If I look for all occurences of [john, anthony]
I’ll get (note that the ids must be successive for each occurence)
id | student |
---|---|
1 | john |
2 | anthony |
9 | john |
10 | anthony |
Or :
id | — comment |
---|---|
1 | (id of first occurence of john, anthony) |
9 | (id of second occurence of john, anthony) |
If I look for [anthony, steven, lucille]
i’ll get
id | student |
---|---|
2 | anthony |
3 | steven |
4 | lucille |
10 | anthony |
11 | steven |
12 | lucille |
Or :
id | — comment |
---|---|
2 | (id of first occurence of anthony, steven, lucille) |
10 | (id of second occurence of anthony, steven, lucille) |
Any ideas or leads to help me move forward?
2
Answers
Start with this and if it explodes we’ll do some performance improvements, with the price of making the code a little bit more complicated.
Fiddle
That should do the trick, performance wise.
The main idea is to split the data by the first student that is in our search list, but not in all places –
Since the same student can appear multiple times in our search list, we need to make sure that we’re not breaking the pattern in the middle.
We’re doing that by verifying that each occurrence of the first student is far enough from its previous occurrence, that is, the distance between the two occurrences is bigger than the search list length (the number of non-unique students’ names within the search list)
This is the result for an extended data sample:
Fiddle