
I am trying to output a dictionary as JSON using Python 2.7 (this cannot be upgraded).
The keys in the data are strings that contain numbers, like 'item_10', and have an arbitrary order. For example, this code generates some test data:

import random

data = {}
numbers = list(range(1, 12))
random.shuffle(numbers)
for value in numbers:
    data['item_{}'.format(value)] = 'data{}'.format(value)

I tried using:

print(json.dumps(data, sort_keys=True, indent=2))

However, I want the keys to be sorted naturally, like:

{
  "item_1": "data1",
  "item_2": "data2",
  ...
  "item_10": "data10",
  "item_11": "data11"
}

Instead, I get keys sorted by Python’s default sort order:

{
  "item_1": "data1",
  "item_10": "data10",
  "item_11": "data11",
  ...
  "item_2": "data2"
}

How can I get this result?

2 Answers


  1. By making the keys "naturally comparable"

    Supposing that we have a key function that implements the natural-sort comparison, as in Claudiu’s answer for the related question:

    import re
    
    def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
        return [int(text) if text.isdigit() else text.lower()
                for text in _nsre.split(s)]
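
To make the behaviour of this key function concrete, here is a minimal sketch comparing it with the default string ordering. The sample list `keys` and the names `plain_order` and `natural_order` are made up for illustration; the key function itself is the one quoted above.

```python
import re

def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
    # Split into alternating non-digit and digit runs; digit runs become ints
    # so that 10 compares greater than 2 instead of '10' < '2'.
    return [int(text) if text.isdigit() else text.lower()
            for text in _nsre.split(s)]

# Hypothetical sample keys, in arbitrary order
keys = ['item_10', 'item_2', 'item_1', 'item_11']

plain_order = sorted(keys)                          # character-by-character
natural_order = sorted(keys, key=natural_sort_key)  # numeric runs compared as ints

print(natural_sort_key('item_10'))  # ['item_', 10, '']
print(plain_order)                  # ['item_1', 'item_10', 'item_11', 'item_2']
print(natural_order)                # ['item_1', 'item_2', 'item_10', 'item_11']
```

Because the regular expression has a capturing group, the digit runs are kept in the split result, which is what lets them be converted to integers.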
    

    Then we can create a wrapper class for strings which is compared using that function, transform the keys of the dict, and proceed as before:

    import json
    from functools import total_ordering
    
    @total_ordering
    class NaturalSortingStr(str):
        def __lt__(self, other):
            return natural_sort_key(self) < natural_sort_key(other)
    
    fixed = {NaturalSortingStr(k): v for k, v in data.items()}
    
    print(json.dumps(fixed, sort_keys=True, indent=2))
    

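    Note that functools.total_ordering is available from Python 2.7 and 3.2 onward, so it can be used here. However, it only fills in comparison methods that the class does not already have; since str already provides __gt__, __le__ and __ge__, the decorator leaves the inherited versions in place, and they should be defined explicitly, in corresponding ways, if consistent behaviour is wanted. (Python’s sort algorithm uses only __lt__, so the sorting works either way, but it is a good idea to include consistent definitions for correctness.) Of course, the base str‘s implementations of __eq__ and __ne__ do not need to be replaced.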

    (In 2.7 we could also instead implement a corresponding __cmp__, but this will break in 3.x.)
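
As a rough sketch of the "define the comparisons explicitly" variant, the class below spells out all four ordering methods instead of relying on a decorator. It is only one way to write it, not necessarily how the answer's author would; the sample `data` dict and the `output` name are made up for illustration.

```python
import json
import re

def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
    return [int(text) if text.isdigit() else text.lower()
            for text in _nsre.split(s)]

class NaturalSortingStr(str):
    """str subclass whose ordering comparisons use natural_sort_key."""

    def __lt__(self, other):
        return natural_sort_key(self) < natural_sort_key(other)

    def __le__(self, other):
        return natural_sort_key(self) <= natural_sort_key(other)

    def __gt__(self, other):
        return natural_sort_key(self) > natural_sort_key(other)

    def __ge__(self, other):
        return natural_sort_key(self) >= natural_sort_key(other)

# Hypothetical sample data, inserted in a non-sorted order
data = {'item_{}'.format(n): 'data{}'.format(n) for n in (10, 2, 11, 1)}

# Wrap every key so that sort_keys=True compares them naturally
fixed = {NaturalSortingStr(k): v for k, v in data.items()}
output = json.dumps(fixed, sort_keys=True, indent=2)
print(output)
```

Equality and hashing are still inherited from str, so the wrapped keys behave like ordinary strings as dictionary keys.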

    By putting the keys in order first

    In Python 3.7 and up, dictionaries are guaranteed to preserve insertion order; in earlier versions (including 2.7), we can use collections.OrderedDict to get that property. Note that this does not sort keys, but maintains the order of insertion.

    Thus, we can determine the necessary order for keys, and create a new dict by inserting keys in that order:

    import json
    import sys
    if sys.version_info < (3, 7):
        # In newer versions this would also work, but is unnecessary
        from collections import OrderedDict as odict
    else:
        odict = dict
    
    sorted_keys = sorted(data.keys(), key=natural_sort_key)
    sorted_data = odict((k, data[k]) for k in sorted_keys)
    print(json.dumps(sorted_data, indent=2))
    

    Since the data was sorted ahead of time, sort_keys=True is no longer necessary. In modern versions, since we are using the built-in dict, we could also write a dict comprehension (rather than passing a generator to the constructor).
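
For reference, the dict-comprehension variant might look like the sketch below. It assumes Python 3.7 or newer, where the built-in dict keeps insertion order, so it does not apply to the 2.7 setup in the question (use the OrderedDict version there). The sample `data` dict is made up for illustration.

```python
import json
import re

def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
    return [int(text) if text.isdigit() else text.lower()
            for text in _nsre.split(s)]

# Hypothetical sample data, inserted in a non-sorted order
data = {'item_{}'.format(n): 'data{}'.format(n) for n in (10, 2, 11, 1)}

# Assumes Python 3.7+: the new dict remembers the order in which
# the naturally sorted keys are inserted.
sorted_data = {k: data[k] for k in sorted(data, key=natural_sort_key)}

# No sort_keys=True here; it would re-sort the keys in plain string order.
print(json.dumps(sorted_data, indent=2))
```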

  2. Using simplejson instead of the standard library

    The third-party simplejson library is the original basis of Python’s standard library JSON support. It is still actively maintained by the original developer, whereas the standard library is based on a much older version. For example, in Python 3.8, the standard library appears to be based on simplejson 2.0.9; as of posting, the latest version of simplejson is 3.18.3.

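    Using an up-to-date version of simplejson, we can specify a sort key as item_sort_key. Note that this callable is applied to each (key, value) pair rather than to the key alone, so the natural-sort function has to be given the first element of the pair:

    import simplejson as json
    import re
    
    # Again using Claudiu's implementation
    def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
        return [int(text) if text.isdigit() else text.lower()
                for text in _nsre.split(s)]
    
    # item_sort_key receives (key, value) tuples, so sort on the key part
    print(json.dumps(data, item_sort_key=lambda item: natural_sort_key(item[0]), indent=2))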
    