I have a large script that parses JS with a DataFrame entry, but to keep the question short, I have put the part I need into a separate variable.
The variable contains the following value:

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

I apply the following script and get this output:

import json
import re

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def parse_json(value):
    # Split into one "{...}" chunk per object and restore the closing brace
    arr = value.split("},")
    arr = [x+"}" for x in arr]
    arr[-1] = arr[-1][:-1]

    return json.dumps({str(i):add_quotation_marks(x) for i, x in enumerate(arr)})

def add_quotation_marks(value):
    # Find every "key:" and wrap the key in double quotes so json.loads accepts it
    words = re.findall(r'(\w+:)', value)
    for word in words:
        value = value.replace(word[:-1], f'"{word[:-1]}"')
    return json.loads(value)

print(parse_json(value))
{"0": {"from": [3, 4], "to": [7, 4], "color": 2}, "1": {"from": [3, 6], "to": [10, 6], "color": 3}}

The script runs correctly, but I need a slightly different result.
This is the result I want to get:

{
  "0": {
    "from": {
      "0": "3",
      "1": "4"
    },
    "to": {
      "0": "7",
      "1": "4"
    },
    "color": "2"
  },
  "1": {
    "from": {
      "0": "3",
      "1": "6"
    },
    "to": {
      "0": "10",
      "1": "6"
    },
    "color": "3"
  }
}

This is valid JSON and valid YAML. How can I do this?

2 Answers


  1. I’d suggest a regex approach in this case:

    import re

    res = []

    # iterates over each "{from:...,to:...,color:...}" group separately
    for obj in re.findall(r'{([^}]+)}', value):
        item = {}

        # iterates over each "...:..." key-value separately
        for k, v in re.findall(r'(\w+):(\[[^\]]+\]|\d+)', obj):
            if v.startswith('['):
                v = v.strip('[]').split(',')

            item[k] = v

        res.append(item)
    

    This leaves the following value in res:

    [{'from': ['3', '4'], 'to': ['7', '4'], 'color': '2'}, {'from': ['3', '6'], 'to': ['10', '6'], 'color': '3'}]
    

    Because your values can themselves contain commas, splitting on commas or other markers is fragile; matching the values you want with these regexes is more robust.
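
The list in res still has to be turned into the index-keyed shape the question asks for. A minimal sketch of that last step, assuming the input always looks like the sample string (integer values and flat lists only); the function name `parse_to_indexed` is hypothetical and not part of the answer above:

```python
import json
import re

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def parse_to_indexed(text):
    """Parse the JS-like string and key every object and list item by its index."""
    result = {}
    # Each "{...}" group becomes one entry keyed by its position.
    for i, obj in enumerate(re.findall(r'\{([^}]+)\}', text)):
        item = {}
        # Each key is followed by either a bracketed list or a bare integer.
        for k, v in re.findall(r'(\w+):(\[[^\]]+\]|\d+)', obj):
            if v.startswith('['):
                parts = v.strip('[]').split(',')
                # Replace the list with a dict keyed by string indices.
                v = {str(j): p for j, p in enumerate(parts)}
            item[k] = v
        result[str(i)] = item
    return result

indexed = parse_to_indexed(value)
print(json.dumps(indexed, indent=2))
```

Because the regex captures text, every leaf value is already a string, which is what the desired output shows ("3" rather than 3).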

  2. Here’s code that converts the value to your desired output.

    import json5  # pip install json5
    
    value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"
    
    def convert(str_value):
        str_value = f"[{str_value}]"  # wrap in [] so the objects form a single JSON5 array
        parsed_value = json5.loads(str_value)  # convert to python object
        output = {}  # create empty dict
    
        # Loop through the list of dicts. For each item, create a new dict
        # with the index as the key and the value as the value. If the value
        # is a list, convert it to a dict with the index as the key and the
        # value as the value. If the value is not a list, just add it to the dict.
        for i, d in enumerate(parsed_value):
            output[i] = {}
            for k, v in d.items():
                output[i][k] = {j: v[j] for j in range(len(v))} if isinstance(v, list) else v
    
        return output
    
    print(json5.dumps(convert(value)))
    

    Output

    {
      "0": {
        "from": {
          "0": 3,
          "1": 4
        },
        "to": {
          "0": 7,
          "1": 4
        },
        "color": 2
      },
      "1": {
        "from": {
          "0": 3,
          "1": 6
        },
        "to": {
          "0": 10,
          "1": 6
        },
        "color": 3
      }
    }
    
    • The json5 package parses JavaScript-style object literals with unquoted keys into Python objects, so you don't have to do split("},{").
    • Wrapping the string in [ and ] turns the comma-separated objects into a single array that json5 can parse.
    • json5.loads() then loads the string as a list of dictionaries.
    • Finally, you can loop through that list and convert it to the desired output format.
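
The desired output in the question shows every leaf as a string ("3", not 3), while the conversion above keeps integers. One possible way to cover both points is a small recursive helper, sketched here with the standard json module only. The names `indexify` and `parsed` are hypothetical, and `parsed` is a hand-written stand-in for what json5.loads() is expected to return for the question's string wrapped in [ and ]:

```python
import json

def indexify(node):
    """Recursively turn lists into index-keyed dicts and leaf values into strings."""
    if isinstance(node, list):
        return {str(i): indexify(x) for i, x in enumerate(node)}
    if isinstance(node, dict):
        return {str(k): indexify(v) for k, v in node.items()}
    # Leaf value: str() is enough for the integers in this data.
    return str(node)

# Assumed stand-in for the parsed form of the question's string.
parsed = [
    {"from": [3, 4], "to": [7, 4], "color": 2},
    {"from": [3, 6], "to": [10, 6], "color": 3},
]

converted = indexify(parsed)
print(json.dumps(converted, indent=2))
```

The recursion means nested lists at any depth are handled the same way. Note that str() would render booleans and None in Python spelling, so this sketch only fits numeric or string leaves like the ones in the question.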