skip to Main Content

I have two MySQL tables.

One is a large list of entries which is basically the main table. The second is a table of pending entries, that will ultimately get added to the large list of entries if they’re approved (or just get discarded without any change to the main table if not).

Both tables have an auto increment primary key and the exact same structure for all the other columns.

I use the primary key on the pending entries table for edits of the pending entries, but it doesn’t matter beyond that point once it’s approved.

I’d like to do an insert similar to this:

INSERT INTO `main_table` SELECT * FROM `pending_table`

It’s super simple and easy! However, of course, there is an error:

Error! Duplicate entry ‘1’ for key ‘PRIMARY’

At the moment, I’m using the following query:

INSERT INTO `main_table` (`col1`, `col2`, ..., `coln`)
SELECT `col1`, `col2`, ..., `coln` FROM `pending_table`

This can also work and is a bit shorter:

INSERT INTO `main_table`
SELECT NULL, `col1`, `col2`, ..., `coln` FROM `pending_table`

col1, col2, …, coln represent all the columns except the primary key column, and I have to list them out one by one.

Of course, that query immediately breaks if any of the columns in the table ever change, and there’s a really good chance I will want to add new columns in the future. It’s pretty easy to add the same column to both tables. What I’d like to avoid is having to go into the PHP file(s) and manually add the column there as well in every place that it’s used.

What I would like instead is to set so the primary keys on the pending_table will be discarded and replaced with unique primary keys that are unused in main_table. All the other columns will copy across. I haven’t been able to find any option to do this.

Is there any simpler solution, or am I already doing the simplest solution?

Thanks for any help/ideas.

2

Answers


  1. Chosen as BEST ANSWER

    I did find one potential solution.

    Rather than deleting the primary key column, I can remove the auto increment and index as follows. id needs to be the name of the primary key column and the type has to match:

    ALTER TABLE `pending_table` DROP PRIMARY KEY,
    CHANGE `id` `id` int(11)
    

    From here, the primary key field needs to be set to NULL for the insert.

    UPDATE `pending_table` SET `id` = NULL
    

    Now, the insert will work without any error from duplicate primary keys.

    INSERT INTO `main_table` SELECT * FROM `pending_table`
    

    Then, we just need to restore the primary key in the pending table with this query:

    ALTER TABLE `pending_table` ADD PRIMARY KEY (`id`),
    CHANGE `id` `id` int(11) AUTO_INCREMENT
    

    It's not the simplest or most elegant solution in the world but it's all SQL, works regardless of the column structure, and it doesn't require any extra PHP to be maintained or updated.

    Note that the pending table keys could be changed in the process.


  2. It’s super simple and easy!

    No, as you’ve discovered it is not.

    Leaving aside the primary key issue, this method (INSERT INTO table SELECT *...) is dependent on both tables having the SAME attributes, in the same order AND honouring the order of attributes within the table (which IIRC is not part of standard SQL). So while having to enumerate the attributes from both tables could result in undesirable consequences if the table structure changes, in the absence of a primary key there are several other things which can go wrong.

    I did find one potential solution.

    It avoids the specific issue you encountered. But is not a good solution.

    OTOH as presented it is an accident waiting to happen. There are no concurrency controls in your code to manage the safe transition of data from one table to the other. With concurrent access, there are race conditions where you will lose data.

    As usual, the RIGHT way to fix this is to normalize your data – as per @P.Salmon’s comment above.

    IMHO the issues arising from assuming SELECT * are worse than those from enumerating attributes. Using SELECT * also has a performance cost (cache flushing, no covering indexes). So as a general rule, I always avoid SELECT *.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search