I have two MySQL tables.
One is a large list of entries which is basically the main table. The second is a table of pending entries, that will ultimately get added to the large list of entries if they’re approved (or just get discarded without any change to the main table if not).
Both tables have an auto increment primary key and the exact same structure for all the other columns.
I use the primary key on the pending entries table for edits of the pending entries, but it doesn’t matter beyond that point once it’s approved.
I’d like to do an insert similar to this:
INSERT INTO `main_table` SELECT * FROM `pending_table`
It’s super simple and easy! However, of course, there is an error:
Error! Duplicate entry ‘1’ for key ‘PRIMARY’
At the moment, I’m using the following query:
INSERT INTO `main_table` (`col1`, `col2`, ..., `coln`)
SELECT `col1`, `col2`, ..., `coln` FROM `pending_table`
This can also work and is a bit shorter:
INSERT INTO `main_table`
SELECT NULL, `col1`, `col2`, ..., `coln` FROM `pending_table`
col1
, col2
, …, coln
represent all the columns except the primary key column, and I have to list them out one by one.
Of course, that query immediately breaks if any of the columns in the table ever change, and there’s a really good chance I will want to add new columns in the future. It’s pretty easy to add the same column to both tables. What I’d like to avoid is having to go into the PHP file(s) and manually add the column there as well in every place that it’s used.
What I would like instead is to set so the primary keys on the pending_table
will be discarded and replaced with unique primary keys that are unused in main_table
. All the other columns will copy across. I haven’t been able to find any option to do this.
- The REPLACE command doesn’t help because that’s going to overwrite all the entries that already exist. I also couldn’t find anything on the DUPLICATE KEY UPDATE option of INSERT.
- I found a question here that might be asking something similar but I can’t understand the solution proposed and I’m hoping for something simple. This one is just a generic explanation of the problem potentially. This just seems completely different as does this one. This guy just made a mistake and he’s already doing my "solution" above. I haven’t been able to find any other questions that seem related.
- I can make code to check the table structure and produce a dynamic column list. However, that just seems like a large kludge and I’m hoping to find a simpler and more elegant solution without adding/maintaining/running a large block of extra PHP code.
- I also thought of dropping the primary key column from the pending table and then reconstructing the whole table afterward, but that also feels kind of kludgey. I’m also not sure it would work because the column count would be different.
- Similarly, manually updating the primary key column by reading the main table ahead of time could work but also feels really kludgy. And it could run into problems (race condition) if some other process adds a row to the main table while those keys are being assigned.
Is there any simpler solution, or am I already doing the simplest solution?
Thanks for any help/ideas.
2
Answers
I did find one potential solution.
Rather than deleting the primary key column, I can remove the auto increment and index as follows.
id
needs to be the name of the primary key column and the type has to match:From here, the primary key field needs to be set to NULL for the insert.
Now, the insert will work without any error from duplicate primary keys.
Then, we just need to restore the primary key in the pending table with this query:
It's not the simplest or most elegant solution in the world but it's all SQL, works regardless of the column structure, and it doesn't require any extra PHP to be maintained or updated.
Note that the pending table keys could be changed in the process.
No, as you’ve discovered it is not.
Leaving aside the primary key issue, this method (
INSERT INTO table SELECT *...
) is dependent on both tables having the SAME attributes, in the same order AND honouring the order of attributes within the table (which IIRC is not part of standard SQL). So while having to enumerate the attributes from both tables could result in undesirable consequences if the table structure changes, in the absence of a primary key there are several other things which can go wrong.It avoids the specific issue you encountered. But is not a good solution.
OTOH as presented it is an accident waiting to happen. There are no concurrency controls in your code to manage the safe transition of data from one table to the other. With concurrent access, there are race conditions where you will lose data.
As usual, the RIGHT way to fix this is to normalize your data – as per @P.Salmon’s comment above.
IMHO the issues arising from assuming
SELECT *
are worse than those from enumerating attributes. UsingSELECT *
also has a performance cost (cache flushing, no covering indexes). So as a general rule, I always avoidSELECT *
.