skip to Main Content

I have a log table with timestamped (DATETIME) rows in the millions which I occasionally need to search by date or date range. I have an index on the timestamp column at present for the purpose.

However, I am wondering if it would be possibly to reduce the granularity of this index to, say, minutes or hours rather than seconds as a way of reducing the overall size of the index.

I have investigated prefix indexes, but these do not apply to DATETIME columns, so I am wondering if there is another way to do this. I am happy to trade a little in overall query performance if it can substantially reduce the size of my index.

2

Answers


  1. The size of a DATETIME is minimally 5 bytes, plus a variable amount for fractional seconds storage. Ref: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html#data-types-storage-reqs-date-time

    MySQL has no data type that represents a "rounded off" datetime value as you describe.

    You could store DATE values with no time component, for 3 bytes per value.

    Or you could transform a DATETIME to an INT value, for example 2024082217 (based on 2024-08-22 17:56:00 as I write this). This would use 4 bytes per value.

    Honestly, I wouldn’t worry about it. A DATETIME is already pretty compact as data storage goes.

    If you find this is causing a problem, then either increase storage to fit the data you need, or else store fewer rows of data.

    It’s usually not worth spending a lot of effort for micro-optimizations, because they take time, make your project more complex, and risk creating more bugs.

    It’s not a good tradeoff. Even a few hours of a software developer’s time to implement and maintain a custom solution is far more expensive than upgrading your storage.

    Login or Signup to reply.
  2. A range query on an indexed DATETIME or TIMESTAMP is reasonably optimal.

    For example

    WHERE foo = 123
      AND dt >= CURDATE() - INTERVAL 7 DAY
      AND dt  < CURDATE()
    

    INDEX(foo, dt) will be much less efficient than INDEX(dt).

    WHERE DATE(dt) = ‘2024-03-04’is terribly inefficient (even withINDEX(dt)`) because it is not sargable .

    If neither of these examples fits your situation, let’s see your query and SHOW CREATE TABLE.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search