
I am trying to extract data from a JSON file based on the quarter each date belongs to. My goal is to assign the data to variables, so that I end up with Q1, Q2, Q3 and Q4 variables each holding the data for that quarter. Below is the JSON:

{
    "lastDate":{
        "0":"2022Q4",
        "1":"2022Q4",
        "2":"2022Q4",
        "7":"2022Q4",
        "8":"2022Q4",
        "9":"2022Q4",
        "18":"2022Q3",
        "19":"2022Q3",
        "22":"2022Q3",
        "24":"2022Q2"
    },
    "transactionType":{
        "0":"Sell",
        "1":"Automatic Sell",
        "2":"Automatic Sell",
        "7":"Automatic Sell",
        "8":"Sell",
        "9":"Automatic Sell",
        "18":"Automatic Sell",
        "19":"Automatic Sell",
        "22":"Automatic Sell",
        "24":"Automatic Sell"
    },
    "sharesTraded":{
        "0":"20,200",
        "1":"176,299",
        "2":"8,053",
        "7":"167,889",
        "8":"13,250",
        "9":"176,299",
        "18":"96,735",
        "19":"15,366",
        "22":"25,000",
        "24":"25,000"
    }
}

I tried the following code:

import json

data = json.load(open("AAPL22data.json"))

Q2data = [item for item in data if '2022Q2' in data['lastDate']]
print(Q2data)

My ideal output should be:

{
    "lastDate":{
        "24":"2022Q2"
    },
    "transactionType":{
        "24":"Automatic Sell"
    },
    "sharesTraded":{
        "24":"25,000"
    }
}

The other quarters should follow the same structure. However, my current output is an empty list, "[ ]".

3 Answers


  1. Chosen as BEST ANSWER

    Thanks to @FrancoMilanese for the pointer to pandas groupby. Here is the answer:

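    import json
    import pandas as pd

    # Load the JSON file; the with-block closes the file afterwards
    with open("AAPL22data.json") as f:
        data = json.load(f)

    df = pd.DataFrame.from_dict(data)

    # Group the rows by quarter
    grouped = df.groupby('lastDate')

    # Change '2022Q2' for the other quarters and assign each result to its own variable
    q2df = grouped.get_group('2022Q2')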
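
To avoid calling get_group once per quarter, the groups can also be collected into a dictionary in one pass. This is only a minimal sketch, assuming the file has the same column layout as the question; the inline data is a shortened subset of it, and the names quarters and q2 are illustrative.

```python
import pandas as pd

# Shortened subset of the question's data (assumption: the real file has the same shape).
data = {
    "lastDate": {"0": "2022Q4", "18": "2022Q3", "19": "2022Q3", "24": "2022Q2"},
    "transactionType": {"0": "Sell", "18": "Automatic Sell",
                        "19": "Automatic Sell", "24": "Automatic Sell"},
    "sharesTraded": {"0": "20,200", "18": "96,735", "19": "15,366", "24": "25,000"},
}

df = pd.DataFrame.from_dict(data)

# One entry per quarter; to_dict() restores the {column: {row_key: value}} nesting.
quarters = {quarter: group.to_dict() for quarter, group in df.groupby("lastDate")}

q2 = quarters["2022Q2"]
print(q2)
```

Keeping the quarters in one dictionary rather than four separate variables also copes with quarters that are missing from the data, such as Q1 here.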
    

  2. With pandas you can read this nested dictionary and transform it into a table representation. The aggregation you require then becomes quite natural.

    import pandas as pd 
    
    sample_dict = {
        "lastDate":{
            "0":"2022Q4",
            "1":"2022Q4",
            "2":"2022Q4",
            "7":"2022Q4",
            "8":"2022Q4",
            "9":"2022Q4",
            "18":"2022Q3",
            "19":"2022Q3",
            "22":"2022Q3",
            "24":"2022Q2"
        },
        "transactionType":{
            "0":"Sell",
            "1":"Automatic Sell",
            "2":"Automatic Sell",
            "7":"Automatic Sell",
            "8":"Sell",
            "9":"Automatic Sell",
            "18":"Automatic Sell",
            "19":"Automatic Sell",
            "22":"Automatic Sell",
            "24":"Automatic Sell"
        },
        "sharesTraded":{
            "0":"20,200",
            "1":"176,299",
            "2":"8,053",
            "7":"167,889",
            "8":"13,250",
            "9":"176,299",
            "18":"96,735",
            "19":"15,366",
            "22":"25,000",
            "24":"25,000"
        }
    }
    
    print(pd.DataFrame.from_dict(sample_dict))
    

    Output:
    
       lastDate transactionType sharesTraded
    0    2022Q4            Sell       20,200
    1    2022Q4  Automatic Sell      176,299
    2    2022Q4  Automatic Sell        8,053
    7    2022Q4  Automatic Sell      167,889
    8    2022Q4            Sell       13,250
    9    2022Q4  Automatic Sell      176,299
    18   2022Q3  Automatic Sell       96,735
    19   2022Q3  Automatic Sell       15,366
    22   2022Q3  Automatic Sell       25,000
    24   2022Q2  Automatic Sell       25,000
    

    Then a simple groupby on the lastDate column should do the trick.
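
As one possible illustration of such an aggregation, the sketch below totals the shares traded per quarter. It assumes that sharesTraded always holds comma-separated integer strings as in the sample; the shares column, the totals name and the shortened data are illustrative and not part of the original answer.

```python
import pandas as pd

# Shortened subset of the sample data, limited to the two columns needed here.
sample = {
    "lastDate": {"0": "2022Q4", "8": "2022Q4", "18": "2022Q3", "24": "2022Q2"},
    "sharesTraded": {"0": "20,200", "8": "13,250", "18": "96,735", "24": "25,000"},
}
df = pd.DataFrame.from_dict(sample)

# Strip the thousands separators so the values can be summed, e.g. "20,200" -> 20200.
df["shares"] = df["sharesTraded"].str.replace(",", "", regex=False).astype(int)

# Sum the numeric column within each quarter.
totals = df.groupby("lastDate")["shares"].sum()
print(totals)
```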

  3. Use a dictionary comprehension:

    import json
    
    my_json = """{
        "lastDate":{
            "0":"2022Q4",
            "1":"2022Q4",
            "2":"2022Q4",
            "7":"2022Q4",
            "8":"2022Q4",
            "9":"2022Q4",
            "18":"2022Q3",
            "19":"2022Q3",
            "22":"2022Q3",
            "24":"2022Q2"
        },
        "transactionType":{
            "0":"Sell",
            "1":"Automatic Sell",
            "2":"Automatic Sell",
            "7":"Automatic Sell",
            "8":"Sell",
            "9":"Automatic Sell",
            "18":"Automatic Sell",
            "19":"Automatic Sell",
            "22":"Automatic Sell",
            "24":"Automatic Sell"
        },
        "sharesTraded":{
            "0":"20,200",
            "1":"176,299",
            "2":"8,053",
            "7":"167,889",
            "8":"13,250",
            "9":"176,299",
            "18":"96,735",
            "19":"15,366",
            "22":"25,000",
            "24":"25,000"
        }
    }"""
    
    data = json.loads(my_json)
    
    var = "24" #This corresponds to 2022 Q2 in your example
    
    data = {k:{var: v[var]} for k, v in data.items()}
    data = json.dumps(data, indent = 2)
    
    print(data)
    

    Output:

    {
      "lastDate": {
        "24": "2022Q2"
      },
      "transactionType": {
        "24": "Automatic Sell"
      },
      "sharesTraded": {
        "24": "25,000"
      }
    }
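
The snippet above hard-codes the row key "24". The keys can instead be looked up by quarter, which also shows why the attempt in the question returned an empty list: the test '2022Q2' in data['lastDate'] checks the dictionary's keys ("0", "1", ...), not its values, so it is never true. The following is a minimal sketch using only the standard library; rows_for_quarter and by_quarter are hypothetical names, and the data is a shortened subset of the question's.

```python
import json

def rows_for_quarter(data, quarter):
    """Return the same nested structure, restricted to rows whose lastDate equals quarter."""
    # Row keys whose date value matches the requested quarter.
    keys = [key for key, value in data["lastDate"].items() if value == quarter]
    # Keep only those row keys in every column.
    return {column: {key: values[key] for key in keys} for column, values in data.items()}

# Shortened subset of the question's data.
data = {
    "lastDate": {"0": "2022Q4", "18": "2022Q3", "19": "2022Q3", "24": "2022Q2"},
    "transactionType": {"0": "Sell", "18": "Automatic Sell",
                        "19": "Automatic Sell", "24": "Automatic Sell"},
    "sharesTraded": {"0": "20,200", "18": "96,735", "19": "15,366", "24": "25,000"},
}

# Build one entry per quarter that actually occurs in the data.
by_quarter = {q: rows_for_quarter(data, q) for q in sorted(set(data["lastDate"].values()))}

print(json.dumps(by_quarter["2022Q2"], indent=2))
```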
    