skip to Main Content

I’ve used most of methods to load a json and the results I got didn’t meet my needs. When I load the file with pandas read_json, I get a dataframe with the index and all fields in a single line, as a string. Here’s the two calls I do that doesn’t work for me:

import json
import pandas as pd
#Open and read the JSON file

with open('faturamento.json') as f:
    d = json.load(f)
    print(d)


fat = pd.read_json("faturamento.json")

And faturamento.json is:

{
    "faturamento": [
        {
            "day": "2024-09-01",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-02",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-03",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-04",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-05",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-06",
            "fat": "1000",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-07",
            "fat": "   0",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-08",
            "fat": "1500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-09",
            "fat": "1500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-10",
            "fat": "1500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-11",
            "fat": "1500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-12",
            "fat": "1500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-13",
            "fat": "   0",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-14",
            "fat": "   0",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-15",
            "fat": "2500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-16",
            "fat": "2500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-17",
            "fat": "2500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-18",
            "fat": "2500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-19",
            "fat": "2500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-20",
            "fat": "   0",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-21",
            "fat": "   0",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-22",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-23",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-24",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-25",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-26",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-27",
            "fat": " 500",
            "business_day": "TRUE"
        },
        {
            "day": "2024-09-28",
            "fat": "1500",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-29",
            "fat": "   0",
            "business_day": "FALSE"
        },
        {
            "day": "2024-09-30",
            "fat": "3500",
            "business_day": "TRUE"
        }
    ]
}

I thought that read_json would do the trick. How can I really have a dataframe with three columns: day, fat, business_day?

2

Answers


  1. you need to use pd.json_normalize:

    with open(file) as f:
        data = json.load(f)
        
    df = pd.json_normalize(data['faturamento'])
    

    Output:

               day   fat    business_day
    0   2024-09-01   500         TRUE
    1   2024-09-02   500         TRUE
    2   2024-09-03   500         TRUE
    3   2024-09-04   500         TRUE
    4   2024-09-05   500         TRUE
    5   2024-09-06  1000        FALSE
    6   2024-09-07     0        FALSE
    7   2024-09-08  1500         TRUE
    8   2024-09-09  1500         TRUE
    9   2024-09-10  1500         TRUE
    10  2024-09-11  1500         TRUE
    11  2024-09-12  1500         TRUE
    12  2024-09-13     0        FALSE
    13  2024-09-14     0        FALSE
    14  2024-09-15  2500         TRUE
    15  2024-09-16  2500         TRUE
    16  2024-09-17  2500         TRUE
    17  2024-09-18  2500         TRUE
    18  2024-09-19  2500         TRUE
    19  2024-09-20     0        FALSE
    20  2024-09-21     0        FALSE
    21  2024-09-22   500         TRUE
    22  2024-09-23   500         TRUE
    23  2024-09-24   500         TRUE
    24  2024-09-25   500         TRUE
    25  2024-09-26   500         TRUE
    26  2024-09-27   500         TRUE
    27  2024-09-28  1500        FALSE
    28  2024-09-29     0        FALSE
    29  2024-09-30  3500         TRUE
    
    Login or Signup to reply.
  2. Two options:

    df = pd.json_normalize(pd.read_json("faturamento.json")["faturamento"])
    

    or

    with open("faturamento.json") as f:
        d = json.load(f)
    df = pd.DataFrame(d["faturamento"])
    

    Both will output:

               day   fat business_day
    0   2024-09-01   500         TRUE
    1   2024-09-02   500         TRUE
    2   2024-09-03   500         TRUE
    3   2024-09-04   500         TRUE
    4   2024-09-05   500         TRUE
    5   2024-09-06  1000        FALSE
    6   2024-09-07     0        FALSE
    7   2024-09-08  1500         TRUE
    8   2024-09-09  1500         TRUE
    9   2024-09-10  1500         TRUE
    10  2024-09-11  1500         TRUE
    11  2024-09-12  1500         TRUE
    12  2024-09-13     0        FALSE
    13  2024-09-14     0        FALSE
    14  2024-09-15  2500         TRUE
    15  2024-09-16  2500         TRUE
    16  2024-09-17  2500         TRUE
    17  2024-09-18  2500         TRUE
    18  2024-09-19  2500         TRUE
    19  2024-09-20     0        FALSE
    20  2024-09-21     0        FALSE
    21  2024-09-22   500         TRUE
    22  2024-09-23   500         TRUE
    23  2024-09-24   500         TRUE
    24  2024-09-25   500         TRUE
    25  2024-09-26   500         TRUE
    26  2024-09-27   500         TRUE
    27  2024-09-28  1500        FALSE
    28  2024-09-29     0        FALSE
    29  2024-09-30  3500         TRUE
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search