skip to Main Content

I’m trying to update rows in a scores table based on the following logic:

  • Get the feat_sum for ids that don’t have a score.
  • For each feat_sum that has a NULL score, get the row with the nearest feat_sum and score and then update the score field to that score.
  • If feat_sum difference is identical, chose the smaller score
  • id is the PK of the table

Initial table:

scores

| id  | feat_sum | score |
| --- | ---      | ---   |
| 1   | 1.234    | 341   |
| 2   | 5.678    | 758   |
| 3   | 2.234    |  NULL |
| 4   | 8.678    |  NULL |

Expected output after query:

scores

| id  | feat_sum | score |
| --- | ---      | ---   |
| 1   | 1.234    | 341   |
| 2   | 5.678    | 758   |
| 3   | 2.234    | 341   | 
| 4   | 8.678    | 758   |

e.g. 1.234 is closer to 2.234 than 5.678 is to 2.234, therefore, the score for 2.234 should be 341.

I think I’ve got the base query here, but I’m struggling to put the last bit together.

SELECT 
  id,
  feat_sum,
  CASE 
    WHEN score IS NULL
    THEN (SELECT score FROM scores WHERE feat_sum - some_other_feat_sum /* has smallest difference */
  END AS score
FROM scores;

3

Answers


  1. My thought process is as follows,

    1. get the rows with score = NULL
    2. calculate the difference of the rows above with all other rows and rank them based on difference, score
    3. join the rows with least difference with original table to get the score
    4. and finally a left join which conditionally shows the scores.
    with noscore as 
    (
        select * 
        from Table1 
        where score is null
    ),
    alldiff as 
    (
        select t1.id, t2.id as diffid, abs(t2.feat_sum-t1.feat_sum) as diff, t2.score 
        from noscore t1 
        inner join Table1 t2 on  
        t1.id != t2.id and t2.score is not null 
        order by diff,score asc
    ),
    diff as 
    (
        select *, row_number() over (partition by id order by diff asc) as nr 
        from alldiff
    ),
    mindiff as 
    (
        select df.id, t1.feat_sum, t1.score 
        from diff df 
        inner join Table1 t1 on df.diffid = t1.id
        where df.nr = 1
    ),
    updt as 
    (
        select t1.id, t1.feat_sum, if(t1.score is null,md.score,t1.score) as score 
        from Table1 t1 
        left join mindiff md on t1.id = md.id
    )
    update Table1 t1 inner join updt u on t1.id = u.id set t1.score = u.score;
    

    There may be a scope for query optimisation but I guess this works for now.

    Try it out here

    Login or Signup to reply.
  2. I SQL Server, I’d use an APPLY operation, like this:

    SELECT s.id, s.feat_sum, COALESCE(s.score, alt.score) as score
    FROM scores s
    CROSS APPLY (
        SELECT TOP 1 score 
        FROM scores s0
        WHERE s0.score IS NOT NULL
        ORDER BY abs(s0.feat_sum - s.feat_sum)
    ) alt
    

    Other databases call this a lateral join. I know MySQL supports this, but the documentation is not clear to me (it only shows the old bad A,B syntax), so this might not be quite right:

    SELECT s.id, s.feat_sum, COALESCE(s.score, alt.score) as score
    FROM scores s
    JOIN LATERAL (
        SELECT score 
        FROM scores s0
        WHERE s0.score IS NOT NULL
        ORDER BY abs(s0.feat_sum - s.feat_sum)
        LIMIT 1
    ) alt
    

    Most lateral joins that also use a LIMIT 1 can be re-written to run even faster using a windowing function instead. I haven’t looked that far ahead yet on this query.

    Login or Signup to reply.
  3. In the UPDATE statement join to the table a CTE that returns for each id with a score that is null the closest score by utilizing the window function FIRST_VALUE():

    WITH cte AS (
      SELECT DISTINCT s1.id, 
             FIRST_VALUE(s2.score) OVER (PARTITION BY s1.id ORDER BY ABS(s2.feat_sum - s1.feat_sum), s2.score) AS score
      FROM scores s1 JOIN scores s2
      WHERE s1.score IS NULL AND s2.score IS NOT NULL
    )
    UPDATE scores s 
    INNER JOIN cte c ON c.id = s.id
    SET s.score = c.score;
    

    See the demo.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search