I have a series of bird species names that are sorted taxonomically (by ID). I would like to cluster sub-species (which currently have a clust value of 0) with the species (clust <> 0) that falls above them in the table.
Here is a subset of the data and below is the desired query output to update the clust with newclust
Table Data
ID | Clust | code |
---|---|---|
2 | 2 | ostric2 |
3 | 0 | comost1 |
4 | 0 | comost2 |
5 | 0 | comost3 |
6 | 0 | comost4 |
7 | 3 | ostric3 |
8 | 0 | y00934 |
10 | 4 | grerhe1 |
11 | 0 | grerhe2 |
12 | 0 | grerhe3 |
13 | 0 | grerhe4 |
14 | 0 | grerhe5 |
15 | 0 | grerhe6 |
16 | 5 | lesrhe2 |
17 | 6 | lesrhe4 |
18 | 0 | lesrhe1 |
19 | 0 | lesrhe5 |
20 | 7 | lesrhe3 |
I’d like to fill in the 0 values in clust with the number (ID of the species) directly above them in the table.
Desired query output
ID | Clust | code | NewClust |
---|---|---|---|
2 | 2 | ostric2 | 2 |
3 | 0 | comost1 | 2 |
4 | 0 | comost2 | 2 |
5 | 0 | comost3 | 2 |
6 | 0 | comost4 | 2 |
7 | 3 | ostric3 | 3 |
8 | 0 | y00934 | 3 |
10 | 4 | grerhe1 | 4 |
11 | 0 | grerhe2 | 4 |
12 | 0 | grerhe3 | 4 |
13 | 0 | grerhe4 | 4 |
14 | 0 | grerhe5 | 4 |
15 | 0 | grerhe6 | 4 |
16 | 5 | lesrhe2 | 5 |
17 | 6 | lesrhe4 | 6 |
18 | 0 | lesrhe1 | 6 |
19 | 0 | lesrhe5 | 6 |
20 | 7 | lesrhe3 | 7 |
I tried a left join thinking that it would give me access to all of the records in the a table to find that record where a.id is < b.id and a.clust is non-zero.
select b.id, b.clust, b.code, (select max(a.id) where a.clust<>0 and a.id<b.id)
from db as a
left join db as b on a.id=b.id;
The query gives me a null value for Newclust whether I use max or not. I suspect it is due to forcing a.id=b.id. I can see how to do this with a procedure-oriented programming language but would like to use a mysql query.
Thanks
2
Answers
See example
fiddle