I’m trying to update rows in a scores
table based on the following logic:
- Get the
feat_sum
for ids that don’t have ascore
. - For each
feat_sum
that has a NULLscore
, get the row with the nearestfeat_sum
andscore
and then update thescore
field to that score. - If
feat_sum
difference is identical, chose the smaller score - id is the PK of the table
Initial table:
scores
| id | feat_sum | score |
| --- | --- | --- |
| 1 | 1.234 | 341 |
| 2 | 5.678 | 758 |
| 3 | 2.234 | NULL |
| 4 | 8.678 | NULL |
Expected output after query:
scores
| id | feat_sum | score |
| --- | --- | --- |
| 1 | 1.234 | 341 |
| 2 | 5.678 | 758 |
| 3 | 2.234 | 341 |
| 4 | 8.678 | 758 |
e.g. 1.234 is closer to 2.234 than 5.678 is to 2.234, therefore, the score
for 2.234 should be 341.
I think I’ve got the base query here, but I’m struggling to put the last bit together.
SELECT
id,
feat_sum,
CASE
WHEN score IS NULL
THEN (SELECT score FROM scores WHERE feat_sum - some_other_feat_sum /* has smallest difference */
END AS score
FROM scores;
3
Answers
My thought process is as follows,
There may be a scope for query optimisation but I guess this works for now.
Try it out here
I SQL Server, I’d use an APPLY operation, like this:
Other databases call this a lateral join. I know MySQL supports this, but the documentation is not clear to me (it only shows the old bad A,B syntax), so this might not be quite right:
Most lateral joins that also use a
LIMIT 1
can be re-written to run even faster using a windowing function instead. I haven’t looked that far ahead yet on this query.In the
UPDATE
statement join to the table a CTE that returns for eachid
with ascore
that isnull
the closestscore
by utilizing the window functionFIRST_VALUE()
:See the demo.