So my colleague created this query which will run every hour on a table with 500K+ records.
Delete from table where timestamp> now() - interval 24 hour
I am having a feeling that this would be slower as it is computing time at each row, am I right? How can I optimize it?
Update
With 2.8 Million records it took around 12 seconds to delete the matched rows.
2
Answers
No, the time calculation is done once at the start of the query. It is a constant value for the duration of the query.
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_now says:
https://dev.mysql.com/doc/refman/8.0/en/where-optimization.html says:
You also asked:
The easiest thing to do is make sure there is an index on the
timestamp
column.A different solution is to use partitioning by the
timestamp
column, and drop 1 partition per day. This blog has a description of this solution: http://mysql.rjweb.org/doc.php/partitionmaintPARTITION BY RANGE
and useDROP PARTITION
; suggest by hour; see Partition