
I have a problem sorting data. When I run json_dic.sort(key=lambda x: x['lastLogonTimestamp'], reverse=False), I get TypeError: '<' not supported between instances of 'str' and 'list'. Checking the values with type() showed that the field comes back as two different types: <class 'list'> for entries with no value ("lastLogonTimestamp": []) and <class 'str'> for entries that hold a date ("lastLogonTimestamp": "2021-02-15 06:35:34.363626+00:00").

My input:

json_dic = [
    {
        "lastLogonTimestamp": [],
        "sAMAccountName": "batman"
    },
    {
        "lastLogonTimestamp": "2021-02-15 06:35:34.363626+00:00",
        "sAMAccountName": "superman"
    },
    {
        "lastLogonTimestamp": "2022-09-12 04:08:01.311201+00:00",
        "sAMAccountName": "green-lantern"
    },
    {
        "lastLogonTimestamp": "2022-09-13 04:48:43.275908+00:00",
        "sAMAccountName": "wonder-woman"
    },
    {
        "lastLogonTimestamp": [],
        "sAMAccountName": "hulk"
    }
]
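
The error can be reproduced without the full data set. This is only a minimal sketch: the list name mixed is hypothetical, and it assumes the key function returns exactly the two kinds of values shown above.

```python
# The two kinds of values the key function returns for the data above:
# an empty list and a timestamp string.
mixed = [[], "2021-02-15 06:35:34.363626+00:00"]

try:
    mixed.sort()  # sorting has to compare a str with a list
    error = None
except TypeError as exc:
    error = exc

# Python 3 refuses to order values of unrelated types.
print(type(error).__name__)
```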

EDIT: After following the advice in the answers, I used this and it works fine:

def lastLogonTimestamp_sort(value):
    timestamp = value['lastLogonTimestamp']
    is_str = isinstance(timestamp, str)
    return is_str, str(timestamp)

json_dic.sort(key=lastLogonTimestamp_sort, reverse=False)

4 Answers


  1. Convert the sort key to a str. This only affects the comparison; the stored values keep their original type:

    >>> json_dic.sort(key= lambda x: str(x['lastLogonTimestamp']))
    >>> json_dic
    [
      {
        "lastLogonTimestamp": "2021-02-15 06:35:34.363626+00:00",
        "sAMAccountName": "superman"
      },
      {
        "lastLogonTimestamp": "2022-09-12 04:08:01.311201+00:00",
        "sAMAccountName": "green-lantern"
      },
      {
        "lastLogonTimestamp": "2022-09-13 04:48:43.275908+00:00",
        "sAMAccountName": "wonder-woman"
      },
      {
        "lastLogonTimestamp": [],
        "sAMAccountName": "batman"
      },
      {
        "lastLogonTimestamp": [],
        "sAMAccountName": "hulk"
      }
    ]
    

    If you want empty lists to come first, you could use:

    >>> json_dic.sort(key= lambda x: x['lastLogonTimestamp'] or "")
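
    Why the two keys place the empty lists at opposite ends can be checked on a smaller sample. This is a sketch only; the names records, by_str and by_or are made up for the illustration.

```python
# Two records in the shape used in the question.
records = [
    {"lastLogonTimestamp": [], "sAMAccountName": "batman"},
    {"lastLogonTimestamp": "2021-02-15 06:35:34.363626+00:00", "sAMAccountName": "superman"},
]

# str([]) is the two-character string "[]". "[" has code point 91, which is
# greater than any digit (48..57), so "[]" sorts after every timestamp string.
by_str = sorted(records, key=lambda x: str(x["lastLogonTimestamp"]))

# An empty list is falsy, so `[] or ""` evaluates to "", and the empty string
# sorts before any non-empty string.
by_or = sorted(records, key=lambda x: x["lastLogonTimestamp"] or "")

print([r["sAMAccountName"] for r in by_str])  # ['superman', 'batman']
print([r["sAMAccountName"] for r in by_or])   # ['batman', 'superman']
```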
    
  2. You need to make sure that the values returned by the key function have a uniform type. One option would be to convert to a string in all cases.

    However, if you simply convert every item to a string for the sort, non-string values may end up sorted in between the actual strings, depending on what str() produces for them.
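
    A small illustration of that pitfall, using made-up values rather than the question's data (None stands in for any non-string value):

```python
# Hypothetical values: three strings and one non-string (None).
values = ["Alpha", None, "Zulu", "beta"]

# str(None) is "None", which compares like any other string:
# "Alpha" < "None" < "Zulu" < "beta" (uppercase letters sort before lowercase).
by_str = sorted(values, key=str)

print(by_str)  # ['Alpha', None, 'Zulu', 'beta']
```

    The non-string value ends up in the middle of the result instead of at either end.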

    To make sorting more predictable you can make a key function that returns a tuple of items to compare on, one per condition you want to check.

    import json

    def timestamp_sort(value):
        timestamp = value['lastLogonTimestamp']
        # False sorts before True, so non-string values are grouped first.
        is_str = isinstance(timestamp, str)
        return is_str, str(timestamp)
    
    
    json_dic.sort(key=timestamp_sort, reverse=False)
    print(json.dumps(json_dic, indent=2))
    
    [
      {
        "lastLogonTimestamp": [],
        "sAMAccountName": "batman"
      },
      {
        "lastLogonTimestamp": [],
        "sAMAccountName": "hulk"
      },
      {
        "lastLogonTimestamp": "2021-02-15 06:35:34.363626+00:00",
        "sAMAccountName": "superman"
      },
      {
        "lastLogonTimestamp": "2022-09-12 04:08:01.311201+00:00",
        "sAMAccountName": "green-lantern"
      },
      {
        "lastLogonTimestamp": "2022-09-13 04:48:43.275908+00:00",
        "sAMAccountName": "wonder-woman"
      }
    ]
    

    By changing return is_str, str(timestamp) to return not is_str, str(timestamp), you can flip whether the non-string values appear at the start or at the end of the list; either way, they never land in between.
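
    Both variants can be folded into one key function. The sketch below is one possible way to do it; the function name timestamp_sort_key and the missing_last parameter are hypothetical and not part of the code above.

```python
def timestamp_sort_key(item, missing_last=False):
    """Return a (group, text) tuple; tuples compare element by element."""
    timestamp = item["lastLogonTimestamp"]
    is_str = isinstance(timestamp, str)
    # False sorts before True, so the first element decides which group leads.
    group = (not is_str) if missing_last else is_str
    return group, str(timestamp)


records = [
    {"lastLogonTimestamp": [], "sAMAccountName": "batman"},
    {"lastLogonTimestamp": "2021-02-15 06:35:34.363626+00:00", "sAMAccountName": "superman"},
    {"lastLogonTimestamp": [], "sAMAccountName": "hulk"},
    {"lastLogonTimestamp": "2022-09-13 04:48:43.275908+00:00", "sAMAccountName": "wonder-woman"},
]

missing_first = sorted(records, key=timestamp_sort_key)
missing_at_end = sorted(records, key=lambda item: timestamp_sort_key(item, missing_last=True))

print([r["sAMAccountName"] for r in missing_first])
# ['batman', 'hulk', 'superman', 'wonder-woman']
print([r["sAMAccountName"] for r in missing_at_end])
# ['superman', 'wonder-woman', 'batman', 'hulk']
```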

  3. When there’s an empty list in the lastLogonTimestamp field, use an empty string as the key. Otherwise, use lastLogonTimestamp.

    Using a helper function instead of a lambda function makes it easier to see what’s happening.

    def use_timestamp_key(list_item):
        # An empty list means there is no timestamp; "" sorts before any date string.
        if isinstance(list_item["lastLogonTimestamp"], list):
            return ""
        return list_item["lastLogonTimestamp"]
    
    ...
    
    json_dic.sort(key=use_timestamp_key, reverse=False)
    
  4. The issue with sorting timestamps as strings (as proposed in the other answers) is that the UTC offset is not taken into account: 2022-09-13 04:48:43.275908+00:00 will sort before 2022-09-13 04:48:43.275908+01:00 even though it is actually the later time (the second value is 03:48 UTC). It would be better to sort using datetime objects instead. For example:

    from datetime import datetime
    
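    def ts(item):
        try:
            return datetime.strptime(item['lastLogonTimestamp'], '%Y-%m-%d %H:%M:%S.%f%z').timestamp()
        except (TypeError, ValueError):
            # A non-string value (such as []) or an unparsable string ends up here.
            return 0    # errors will sort first
            # return float('inf') if you want errors to sort last instead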
    
    sorted(json_dic, key=ts)
    

    Output:

    [
     {'lastLogonTimestamp': [], 'sAMAccountName': 'batman'},
     {'lastLogonTimestamp': [], 'sAMAccountName': 'hulk'},
     {'lastLogonTimestamp': '2021-02-15 06:35:34.363626+00:00', 'sAMAccountName': 'superman'},
     {'lastLogonTimestamp': '2022-09-12 04:08:01.311201+00:00', 'sAMAccountName': 'green-lantern'},
     {'lastLogonTimestamp': '2022-09-13 04:48:43.275908+00:00', 'sAMAccountName': 'wonder-woman'}
    ]
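
    The difference between the two approaches shows up as soon as the offsets differ. A minimal sketch, assuming the same timestamp format as above; the names FORMAT, stamps, as_text and as_time are only for illustration:

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"

stamps = [
    "2022-09-13 04:48:43.275908+00:00",  # 04:48 UTC
    "2022-09-13 04:48:43.275908+01:00",  # 03:48 UTC, one hour earlier
]

# Plain string comparison looks at the characters only, so "+00:00" < "+01:00".
as_text = sorted(stamps)

# Timezone-aware datetime objects compare by the actual instant in time.
as_time = sorted(stamps, key=lambda s: datetime.strptime(s, FORMAT))

print(as_text[0])  # 2022-09-13 04:48:43.275908+00:00
print(as_time[0])  # 2022-09-13 04:48:43.275908+01:00
```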
    