
MediaWiki (the free software behind Wikipedia) stores timestamps in some of its database fields in a distinctive binary(14) format. This is described further in its timestamp documentation:

The format of timestamps used in MediaWiki URLs and in some of the
MediaWiki database fields is yyyymmddhhmmss. For example, the
timestamp for 2023-01-20 17:12:22 (UTC) is 20230120171222. The
timezone for these timestamps is UTC.

I have also seen a similar timestamp format in other places, such as URLs for the Internet Archive. I regularly need to compare these timestamps against timestamps stored in the standard Unix format (seconds since the Unix epoch). I believe this is a common format, so it surprises me that I can’t find a ready-made solution for converting from the MediaWiki format to a Unix timestamp.

What I’m most interested in is the best way to do this conversion. That is:

  • Relatively short/simple to understand code.
  • Most efficient algorithm.
  • Detects errors in the input format.

MediaWiki apparently includes a conversion function named "wfTimestamp". However, I haven’t been able to locate the function or its source code online, and I understand it has many features beyond the simple conversion I need. One potential solution would be to strip the other parts out of that function, but I still don’t know whether it is the optimal solution or whether there is a better way.

There are lots of questions about converting to timestamps in general, but I’m hoping for something specific to this format. I’ve thought of several ways to solve it, such as a regular expression, mktime after splitting the string, strtotime, and so on, but I’m not sure which will be fastest for this particular format if it has to be done many times. Since this format exists in at least two places, an optimal solution for this specific conversion could be useful to others as well. Thanks.

3

Answers


  1. You can use the DateTime::createFromFormat function with an explicit format string and time zone.

    $date = DateTime::createFromFormat("YmdHis", "20230120171222", new DateTimeZone('UTC'));
    $timestamp = $date->getTimestamp();
    

    I doubt you will find a significantly more optimised way: even if you parse the string manually, you still have to handle calendar rules such as leap years (and, outside UTC, days that are not exactly 24 hours long). PHP does this for you.
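
    One caveat for the error-detection requirement in the question: createFromFormat does not reject out-of-range values such as 31 February; it rolls them over into the following month. The following is a minimal sketch of a stricter wrapper, not a definitive implementation. The function name mwTimestampToUnix is hypothetical, and returning null on bad input is an assumed convention.

    ```php
    /**
     * Hypothetical helper: converts a 14-digit MediaWiki timestamp (UTC)
     * to a Unix timestamp, or returns null if the input is malformed.
     */
    function mwTimestampToUnix(string $mw): ?int
    {
        // Reject anything that is not exactly 14 ASCII digits.
        if (strlen($mw) !== 14 || !ctype_digit($mw)) {
            return null;
        }

        // "!" resets all fields that the format does not set explicitly.
        $date = DateTimeImmutable::createFromFormat('!YmdHis', $mw, new DateTimeZone('UTC'));
        if ($date === false) {
            return null;
        }

        // Out-of-range values (e.g. 31 February) are rolled over rather than
        // rejected, so a round trip through format() is used to catch them.
        if ($date->format('YmdHis') !== $mw) {
            return null;
        }

        return $date->getTimestamp();
    }

    var_dump(mwTimestampToUnix('20230120171222')); // int(1674234742)
    var_dump(mwTimestampToUnix('20230231000000')); // NULL (31 February does not exist)
    ```

    The round-trip comparison is one of several possible checks; it avoids relying on the exact shape of the parser's warning list.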

  2. I think this is probably what you’re looking for:

    $timestamp = strtotime("20230120171222");
    // 1674234742 (only when PHP's default time zone is UTC)
    

    The Unix timestamp that this function returns does not contain information about time zones. In order to do calculations with date/time information, you should use the more capable DateTimeImmutable.

    Please see here: https://www.php.net/manual/en/function.strtotime.php
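
    As a rough illustration of the DateTimeImmutable suggestion, the sketch below compares a MediaWiki timestamp with a Unix timestamp as objects and then does simple date arithmetic. The variable names are made up for this example, and the sample values are the ones from the question.

    ```php
    $utc = new DateTimeZone('UTC');

    // MediaWiki-style timestamp, parsed explicitly as UTC.
    $mwTime = DateTimeImmutable::createFromFormat('!YmdHis', '20230120171222', $utc);

    // A Unix timestamp; the "@" prefix tells the parser it is seconds since the epoch.
    $unixTime = new DateTimeImmutable('@1674234742');

    // DateTime objects compare by the instant they represent.
    var_dump($mwTime == $unixTime); // bool(true)

    // Arithmetic returns a new object and leaves $mwTime unchanged.
    $nextDay = $mwTime->modify('+1 day');
    echo $nextDay->format('YmdHis'), "\n"; // 20230121171222
    var_dump($nextDay > $unixTime);        // bool(true)
    ```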

  3. In order to interpret the string "20230120171222" as UTC time, either the time zone must be included in the string passed to strtotime or the default time zone must be set to UTC.

    $dateStr = "20230120171222"; 
    $timestamp = strtotime($dateStr.' UTC');
    var_dump($timestamp); //int(1674234742)
    

    See this example for comparison.
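
    To make the difference concrete, here is a small sketch that assumes, purely for illustration, a server whose default time zone is America/New_York. The same string then yields two different timestamps depending on whether " UTC" is appended.

    ```php
    $dateStr = "20230120171222";

    // Assumption for illustration: the default time zone is not UTC.
    date_default_timezone_set('America/New_York');

    $local = strtotime($dateStr);          // read as New York local time (UTC-5 in January)
    $utc   = strtotime($dateStr . ' UTC'); // read as UTC

    var_dump($utc);          // int(1674234742)
    var_dump($local - $utc); // int(18000), i.e. five hours
    ```

    Calling date_default_timezone_set('UTC') once at startup would make both calls return the same value.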
