I have a table which links the IDs of war memorials with the IDs of records of soldiers. It has a unique index based on the combination of the two IDs (sample data below).
I have recently started storing the source of these connections, typically a URL. What I want to do is remove all lines that have no source except those where I still have no source stored.
By way of example
id | memorialId | soldierId | source
1 | 1 | 1 | Source 1
2 | 1 | 1 | Source 2
3 | 1 | 1 |
4 | 2 | 1 | Source 3
5 | 2 | 1 |
6 | 3 | 2 | Source 4
7 | 4 | 3 |
8 | 5 | 4 | Source 1
9 | 5 | 4 | Source 2
This should end up as
memorialId | soldierId | source
id | memorialId | soldierId | source
1 | 1 | 1 | Source 1
2 | 1 | 1 | Source 2
4 | 2 | 1 | Source 3
6 | 3 | 2 | Source 4
7 | 4 | 3 |
8 | 5 | 4 | Source 1
9 | 5 | 4 | Source 2
Line 3 is deleted because there is no source, and lines 1 & 2 have the same connection but with sources; similarly Line 5 because Line 4 has a source for that connection.
Line 6 remains as it is the only line defining that connection.
Line 7 remains for the same reason, even though it has no source.
Lines 8 & 9 are both kept as it’s two distinct sources for the same connection – I only want to delete lines where the source is empty.
The closest I have got so far is a query that identifies which lines need to be deleted
SELECT id,memorialId,soldierId,source FROM myTable
GROUP BY memorialId,soldierId
HAVING COUNT(*) > 1 AND min(source) = '';
This identifies the two lines for deletion, but if I then wrap that in a delete statement like
Delete from
myTable where exists (select 1 FROM myTable
GROUP BY memorialId,soldierId
HAVING COUNT(*) > 1 AND min(source) = '');
I get a message saying that I can’t update the same table that’s in the FROM clause.
I’m probably missing something obvious as I only have basic knwowledge of mysql I’m afraid.
2
Answers
https://dev.mysql.com/doc/refman/en/subquery-restrictions.html says:
But you can do joins in
UPDATE
andDELETE
, and reference the table more than once.I’d do it this way:
The join is satisfied only if
t1
has a blank source, and at least one rowt2
does have the same id’s and does have a non-blank source.Demo: https://www.db-fiddle.com/f/gyyTPJiRmhzWh4gSFgbqA4/0
You will need to use it as an implicit temp table try: