skip to Main Content

The following is the setup code:

DROP TEMPORARY TABLE IF EXISTS temp_table1;
DROP TEMPORARY TABLE IF EXISTS temp_table2;

CREATE TEMPORARY TABLE temp_table1 (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `doc_no` varchar(25) NOT NULL,
  `other_table_id` int DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1;

CREATE TEMPORARY TABLE temp_table2 (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `doc_no` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1;

INSERT INTO temp_table1 (doc_no)
VALUES ('100/1/23-24'), ('100-1-2324');

INSERT INTO temp_table2 (id, doc_no)
VALUES (101, '10012324'), (102, '100/1/2324'), (103, '100/1-23/24');


SET SQL_SAFE_UPDATES = 0;

Here, I have deliberately given similar (but not exactly equal) values for doc_no in all the rows of both the tables.

Now, coming to the question:

I want to UPDATE each row of temp_table1 with the id of each row of temp_table2 based on the row sequence and based on the other matching conditions in the WHERE clause.

For example, update row 1 of temp_table1 with the id of row 1 of temp_table2 i.e. 101, update row 2 of temp_table1 with the id of row 2 of temp_table2 i.e. 102, and so on wherever the doc_no matches after removing the special characters.

As you can see, temp_table2 has 1 extra row, that row’s id should not go anywhere in temp_table1.

If I run:

UPDATE temp_table1 T0
INNER JOIN temp_table2 T1
ON RemoveSpecialCharacters(T0.doc_no) = RemoveSpecialCharacters(T1.doc_no)
SET T0.other_table_id = T1.id;

then, the output in temp_table1 is:

id  doc_no      other_table_id
1   100/1/23-24 101
2   100-1-2324  101

whereas, required output is:

id  doc_no      other_table_id
1   100/1/23-24 101
2   100-1-2324  102

In short, when multiple rows of temp_table1 match with multiple rows of temp_table2, I want to match them based on row sequence.

What is the way to do this?

2

Answers


  1. A quick and dirty solution using ROW_NUMBER to check the order of the row. Unfortunately we can’t utilize Window Function directly inside Update statement hence we need a "reference list." The following query creates reference list called temp_joined for the role.

    WITH temp2_numbered AS
       (
         SELECT *,
          ROW_NUMBER()
             OVER (PARTITION BY
                   RemoveSpecialCharacters(T.doc_no)
                   ORDER BY T.id
       ) AS rc
        FROM temp_table2 AS T),
       temp1_numbered AS
       (
         SELECT *,
          ROW_NUMBER()
             OVER (PARTITION BY
                   RemoveSpecialCharacters(T.doc_no)
                   ORDER BY T.id
        ) AS rc
        FROM temp_table1 AS T),
        temp_joined AS
        (
          SELECT T0.id AS T0id, T1.id AS T1id
          FROM temp1_numbered AS T0
            LEFT JOIN temp2_numbered AS T1
            ON RemoveSpecialCharacters(T0.doc_no)
               = RemoveSpecialCharacters(T1.doc_no)
            AND T0.rc = T1.rc
        )
    UPDATE temp_table1 AS T0
    LEFT JOIN temp_joined AS T1
       ON T0.id
          = T1.T0id
    SET T0.other_table_id = T1.T1id;
    

    This will output:

    id doc_no other_table_id
    1 100/1/23-24 101
    2 100-1-2324 102

    There are some gotchas to the solution. For one, if temp_table1 has more row than temp_table2 then it gives you null. The solution also only works on 8+. There might be a better way to solve the problem.

    See fiddle: https://www.db-fiddle.com/f/9BFhvFou6cbKzdNZRbbWZU/0

    Login or Signup to reply.
  2. Because you want to compare stuff in row_number, you might want to add that column to temp_table1

    ALTER TABLE temp_table1 ADD COLUMN R INT NULL;
    
    -- The `old` (before 8.0) way to initialize the new column:
    SET @i:=0; 
    UPDATE temp_table1  SET R=(@i:=@i+1);
    
    CREATE FUNCTION RemoveSpecialCharacters(s VARCHAR(45)) RETURNS VARCHAR(45) DETERMINISTIC
    BEGIN
       RETURN REPLACE(REPLACE(s,'/',''),'-','');
    END;
    
    SELECT * FROM temp_table1;
      
    UPDATE temp_table1 t1
    INNER JOIN (select *, row_number() over (order by id) as R 
                   from temp_table2) t2 on t1.R=t2.R
                                       and RemoveSpecialCharacters(t1.doc_no) = RemoveSpecialCharacters(t2.doc_no)
    SET other_table_id = t2.id
    

    The result is:

    id doc_no other_table_id
    1 100/1/23-24 101
    2 100-1-2324 102

    see: DBFIDDLE

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search