skip to Main Content

everyone!

I am trying to convert a Pandas Dataframe (data1) into specific json format string (data2).
How do I do that?

data1 = pd.DataFrame(
    {
        "country": ["USA", "USA", "USA", "Canada", "Canada", "Canada", "Japan", "Japan", "Germany", "Germany", "Germany", "Germany"],
        "city": ["Boston", "NY", "LA", "Montreal", "Vancouver", "Calgary", "Osaka", "Kyoto","Berlin", "Frankfurt", "Hamburg", "Dresden"]
    }
)

……..

data2 = [
                {"country": "USA", "city": ["Boston", "NY", "LA"]},
                {"country": "Canada", "city":  ["Montreal", "Vancouver", "Calgary'"]},
                {"country": "Japan", "city": ["Kyoto", "Osaka"]},
                {"country": "Germany", "city": ["Berlin", "Frankfurt", "Hamburg", "Dresden"]}
        ]

Originally data1 is coming from SQL. After converting it to to data2 I will pass it further to JS (via data2_json = json.dumps(data, indent=4))and other Python file to be displayed in a WebApp.

Thanks for any help in advance.

3

Answers


  1. Use groupby.agg and to_dict:

    out = (data1.groupby('country', as_index=False, sort=False).agg(list)
                .to_dict('records')
          )
    

    Output:

    [{'country': 'USA', 'city': ['Boston', 'NY', 'LA']},
     {'country': 'Canada', 'city': ['Montreal', 'Vancouver', 'Calgary']},
     {'country': 'Japan', 'city': ['Osaka', 'Kyoto']},
     {'country': 'Germany', 'city': ['Berlin', 'Frankfurt', 'Hamburg', 'Dresden']}]
    

    Or to_json for a json string:

    out = (data1.groupby('country', as_index=False, sort=False).agg(list)
                .to_json(orient='records')
          )
    

    Output:

    '[{"country":"USA","city":["Boston","NY","LA"]},{"country":"Canada","city":["Montreal","Vancouver","Calgary"]},{"country":"Japan","city":["Osaka","Kyoto"]},{"country":"Germany","city":["Berlin","Frankfurt","Hamburg","Dresden"]}]'
    
    Login or Signup to reply.
  2. This should do it for you. They key here is groupby using country

    import json
    result = data1.groupby('country', as_index=False, sort=False).agg(list).to_json(orient="records")
    parsed = json.loads(result)
    print(json.dumps(parsed, indent=4) )
    
    Login or Signup to reply.
  3. You can group the pandas dataframe by the "country" column and then convert the grouped data into JSON:

    import pandas as pd
    import json
    
    data1 = pd.DataFrame(
        {
            "country": ["USA", "USA", "USA", "Canada", "Canada", "Canada", "Japan", "Japan", "Germany", "Germany", "Germany", "Germany"],
            "city": ["Boston", "NY", "LA", "Montreal", "Vancouver", "Calgary", "Osaka", "Kyoto","Berlin", "Frankfurt", "Hamburg", "Dresden"]
        }
    )
    
    # Group the dataframe by "country" and aggregate "city" into a list
    grouped_data = data1.groupby('country')['city'].apply(list).reset_index()
    
    # Create the JSON in a pretty format
    data2 = grouped_data.to_dict(orient='records')
    data2_json = json.dumps(data2, indent=4)
    print(data2_json)
    

    You can remove the indent argument if you prefer your JSON unformatted.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search