
I've run into a strange bug.

I have two identical servers, both on Ubuntu 22.04 and both running Python 3.10.6.

On the first server, my code runs fine:

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> date_time_str = 'Tue 28 Feb 2023 11:27:38 AM CET'
>>> date_time_obj = datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
>>> print ("The type of the date is now",  type(date_time_obj))
The type of the date is now <class 'datetime.datetime'>
>>> print ("The date is", date_time_obj)
The date is 2023-02-28 11:27:38
>>>

On the second server, I do the same and it fails:

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> date_time_str = 'Tue 28 Feb 2023 11:27:38 AM CET'
>>> date_time_obj = datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.10/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data 'Tue 28 Feb 2023 11:27:38 AM CET' does not match format '%a %d %b %Y %I:%M:%S %p %Z'
>>>

What could be causing this issue? It's clearly not down to the format, as it's correct.

2 Answers


  1. Chosen as BEST ANSWER

    @FObersteiner, your remark about %Z matching the time zone on the machine seems to be the explanation.

    The server that raises the ValueError:

    server2:~$ timedatectl
                   Local time: Tue 2023-02-28 15:06:10 UTC
               Universal time: Tue 2023-02-28 15:06:10 UTC
                     RTC time: Tue 2023-02-28 15:06:10
                    Time zone: Etc/UTC (UTC, +0000)
    System clock synchronized: yes
                  NTP service: active
              RTC in local TZ: no
    

    The server without the ValueError:

    server1:~$ timedatectl
                   Local time: Tue 2023-02-28 16:05:07 CET
               Universal time: Tue 2023-02-28 15:05:07 UTC
                     RTC time: Tue 2023-02-28 15:05:07
                    Time zone: Europe/Amsterdam (CET, +0100)
    System clock synchronized: yes
                  NTP service: active
    

    Since the code causing the issue is not mine, I changed the system time zone, which made the application run without errors.
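Where changing the system-wide time zone is not desirable, a per-process override may be enough. The sketch below assumes a Unix-like system (`time.tzset()` is not available on Windows) and an installed tz database; the helper name `parses_under` is made up for illustration.

```python
import os
import time
from datetime import datetime

FMT = "%a %d %b %Y %I:%M:%S %p %Z"
SAMPLE = "Tue 28 Feb 2023 11:27:38 AM CET"


def parses_under(tz_name):
    """Return True if SAMPLE parses with the process time zone set to tz_name."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()  # make the C library re-read TZ; Unix only
    try:
        datetime.strptime(SAMPLE, FMT)
        return True
    except ValueError:
        return False
    finally:
        # restore the original setting so the rest of the process is unaffected
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


if __name__ == "__main__":
    for name in ("Etc/UTC", "Europe/Amsterdam"):
        print(name, parses_under(name))
```

On machines configured like the two servers above, one would expect `Etc/UTC` to fail and `Europe/Amsterdam` to succeed. Starting the application with `TZ=Europe/Amsterdam` in its environment should have a similar effect without touching `timedatectl`.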


  2. The Python strptime/strftime documentation is a bit secretive about %Z: it does not parse arbitrary time zone abbreviations [1]. If you scroll down to the technical detail section, you can find:

    1. […]
      %Z […]
      strptime() only accepts certain values for %Z:

      • any value in time.tzname for your machine’s locale
      • the hard-coded values UTC and GMT

    The first point explains why your attempt works on some systems but not on others.
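A quick way to see what %Z will accept on a given machine is to look at `time.tzname`. The helper below is a hypothetical sketch that approximates the rule quoted from the documentation; `z_accepts` is not a standard-library function.

```python
import time


def z_accepts(abbreviation):
    """Approximate the documented rule for what strptime's %Z accepts."""
    accepted = {name.lower() for name in time.tzname} | {"utc", "gmt"}
    return abbreviation.lower() in accepted


print(time.tzname)       # the abbreviations for this machine's time zone
print(z_accepts("CET"))  # depends on the machine's time zone
print(z_accepts("UTC"))  # hard-coded value, accepted regardless of time zone
```

Running this on both servers should show "CET" in `time.tzname` only on the one set to Europe/Amsterdam.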


    How to parse reliably

    "CET" is an abbreviated tz name. Many of those are ambiguous, so parsers likely refuse to parse them [2]. A way around this is to define which abbreviation maps to which IANA time zone name and pass that mapping to dateutil's parser:

    # import the submodules explicitly; a bare "import dateutil" may not load them
    from dateutil import parser, tz  # pip install python-dateutil
    
    # map the ambiguous abbreviation to an explicit IANA time zone
    tzmapping = {"CET": tz.gettz("Europe/Berlin")}
    
    print(parser.parse('Tue 28 Feb 2023 11:27:38 AM CET', tzinfos=tzmapping))
    # 2023-02-28 11:27:38+01:00
    

    If you want to have more control over the parsing process, you can implement something similar yourself, e.g.

    from datetime import datetime
    from zoneinfo import ZoneInfo # Python 3.9+ standard library
    
    tzmapping = {"CET": ZoneInfo("Europe/Berlin")}
    
    date_time_str = 'Tue 28 Feb 2023 11:27:38 AM CET'
    
    # separate datetime part and timezone part:
    dt, tz = date_time_str.rsplit(" ", maxsplit=1)
    
    # now parse datetime part and set timezone.
    date_time_obj = datetime.strptime(dt, '%a %d %b %Y %I:%M:%S %p').replace(tzinfo=tzmapping[tz])
    
    print(date_time_obj)
    # 2023-02-28 11:27:38+01:00
    
    print(repr(date_time_obj))
    # datetime.datetime(2023, 2, 28, 11, 27, 38, tzinfo=zoneinfo.ZoneInfo(key='Europe/Berlin'))
    

    [1] In fact, %Z doesn’t parse anything in a strict sense; it just makes the parser ignore strings like "GMT" or "UTC". The resulting datetime object will still be naive!

    [2] Besides, CET specifies a UTC offset, not a time zone in a geographical sense. For instance "Europe/Berlin" and "Europe/Paris" both experience CET but are different time zones.
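The first footnote can be checked directly. This minimal sketch uses "UTC", one of the hard-coded values, so it should not depend on the machine's time zone; the sample string is made up for illustration.

```python
from datetime import datetime, timezone

# %Z matches "UTC" here, but the result carries no tzinfo
parsed = datetime.strptime("2023-02-28 11:27:38 UTC", "%Y-%m-%d %H:%M:%S %Z")
print(parsed.tzinfo)  # None

# attach the time zone explicitly if an aware datetime is needed
aware = parsed.replace(tzinfo=timezone.utc)
print(aware.isoformat())  # 2023-02-28T11:27:38+00:00
```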
