I have a log table with timestamped (DATETIME) rows in the millions which I occasionally need to search by date or date range. I have an index on the timestamp column at present for the purpose.
However, I am wondering if it would be possibly to reduce the granularity of this index to, say, minutes or hours rather than seconds as a way of reducing the overall size of the index.
I have investigated prefix indexes, but these do not apply to DATETIME columns, so I am wondering if there is another way to do this. I am happy to trade a little in overall query performance if it can substantially reduce the size of my index.
2
Answers
The size of a
DATETIME
is minimally 5 bytes, plus a variable amount for fractional seconds storage. Ref: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html#data-types-storage-reqs-date-timeMySQL has no data type that represents a "rounded off" datetime value as you describe.
You could store
DATE
values with no time component, for 3 bytes per value.Or you could transform a
DATETIME
to anINT
value, for example 2024082217 (based on 2024-08-22 17:56:00 as I write this). This would use 4 bytes per value.Honestly, I wouldn’t worry about it. A
DATETIME
is already pretty compact as data storage goes.If you find this is causing a problem, then either increase storage to fit the data you need, or else store fewer rows of data.
It’s usually not worth spending a lot of effort for micro-optimizations, because they take time, make your project more complex, and risk creating more bugs.
It’s not a good tradeoff. Even a few hours of a software developer’s time to implement and maintain a custom solution is far more expensive than upgrading your storage.
A range query on an indexed
DATETIME
orTIMESTAMP
is reasonably optimal.For example
INDEX(foo, dt)
will be much less efficient thanINDEX(dt)
.WHERE DATE(dt)
= ‘2024-03-04’is terribly inefficient (even with
INDEX(dt)`) because it is not sargable .If neither of these examples fits your situation, let’s see your query and
SHOW CREATE TABLE
.