skip to Main Content

I have this nested json data:

    {
  "result": [
    {
    "deviceid": 33,
    "devicename": "server101",
    "objectName": "CPU",
    "data": [
    {
      "value":0.59,
      "rvalue":null
    },
    {
      "value":90,
      "rvalue":null
    },
    {
      "value": 85,
      "rvalue":null
    }
          ]
  },
  {
  "deviceid": 30,
    "devicename": "server10",
    "objectName": "CPU",
    "data": [
    {
      "value":0.30,
      "rvalue":null
    },
    {
      "value":60,
      "rvalue":null
    },
    {
      "value": 79,
      "rvalue":null
    }
    ]
  },
  {
  "deviceid": 0,
    "devicename": "server300",
    "objectName": "CPU",
    "data": [
    {
      "value":0.10,
      "rvalue":null
    },
    {
      "value":0.20,
      "rvalue":null
    },
    {
      "value": 0.25,
      "rvalue:":null
    }]
  }
  ],
  "timeRanges": [
    {
      "name":"1st Month",
      "startTime":1680000000,
      "endTime": 1689000000
    },
    {
      "name":"2nd Month",
      "startTime": 1680000000,
      "endTime": 1689000000
    },
    {
      "name":"3rd Month",
      "startTime": 1680000000,
      "endTime": 1689000000 
    }
  ]
}

I need to extract data from this json and append to a data frame.

The output should be like this:

deviceid deviceName objectName 1stMonth 2ndMonth 3rdMonth. startTime. endTime
33      server101  CPU      0.59     90      85       1680000000  1689000000
30      server10   CPU      0.30     60      79      1680000000  1689000000
0       server300  CPU      0.10     0.20    0.25     1680000000  1689000000

I am very new to this and would appreciate any guidance.

2

Answers


  1. This code below seems to produce the desired result. I had to convert null to "null" as Python didn’t recognise the null value.

    import pandas as pd
    
    adict =    {
      "result": [
        {
        "deviceid": 33,
        "devicename": "server101",
        "objectName": "CPU",
        "data": [
        {
          "value":0.59,
          "rvalue":"null"
        },
        {
          "value":90,
          "rvalue":"null"
        },
        {
          "value": 85,
          "rvalue":"null"
        }
              ]
      },
      {
      "deviceid": 30,
        "devicename": "server10",
        "objectName": "CPU",
        "data": [
        {
          "value":0.30,
          "rvalue":"null"
        },
        {
          "value":60,
          "rvalue":"null"
        },
        {
          "value": 79,
          "rvalue":"null"
        }
        ]
      },
      {
      "deviceid": 0,
        "devicename": "server300",
        "objectName": "CPU",
        "data": [
        {
          "value":0.10,
          "rvalue":"null"
        },
        {
          "value":0.20,
          "rvalue":"null"
        },
        {
          "value": 0.25,
          "rvalue:":"null"
        }]
      }
      ],
      "timeRanges": [
        {
          "name":"1st Month",
          "startTime":1680000000,
          "endTime": 1689000000
        },
        {
          "name":"2nd Month",
          "startTime": 1680000000,
          "endTime": 1689000000
        },
        {
          "name":"3rd Month",
          "startTime": 1680000000,
          "endTime": 1689000000 
        }
      ]
    }
    
    rows = []
    for ind in range(len(adict["result"])):
        row = [adict["result"][ind]["deviceid"] , adict["result"][ind]["devicename"], 
               adict["result"][ind]["objectName"],  
               adict["result"][ind]["data"][0]["value"],  adict["result"][ind]["data"][1]["value"], 
               adict["result"][ind]["data"][2]["value"], 
               adict["timeRanges"][ind]["startTime"],  adict["timeRanges"][ind]["endTime"]
              ]
        rows.append(row)
    
    column_names = ["deviceid", "deviceName", "objectName", "1stMonth",
                    "2ndMonth", "3rdMonth", "startTime", "endTime"]
    df = pd.DataFrame(data = rows, columns=column_names)
    print(df)
    

    Output:

       deviceid deviceName objectName  1stMonth  2ndMonth  3rdMonth   startTime     endTime
    0        33  server101        CPU      0.59      90.0     85.00  1680000000  1689000000
    1        30   server10        CPU      0.30      60.0     79.00  1680000000  1689000000
    2         0  server300        CPU      0.10       0.2      0.25  1680000000  1689000000
    
    Login or Signup to reply.
  2. After fixing some typos in your json file, you can use :

    with open("file.json") as file:
        obj = json.load(file)
    
    df = (
        pd.json_normalize(
            obj["result"], "data", ["deviceid", "devicename", "objectName"])
            .join(pd.concat([pd.DataFrame(obj["timeRanges"])]*len(obj["result"]),
                 ignore_index=True)).pivot(
            index=["deviceid", "devicename", "objectName", "startTime", "endTime"],
            columns="name", values="value").reset_index().rename_axis(None, axis=1)
    )
    

    Output :

    print(df.to_string(index=False))
    
     deviceid devicename objectName  startTime    endTime  1st Month  2nd Month  3rd Month
            0  server300        CPU 1680000000 1689000000       0.10       0.20       0.25
           30   server10        CPU 1680000000 1689000000       0.30      60.00      79.00
           33  server101        CPU 1680000000 1689000000       0.59      90.00      85.00
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search