
Custom conversion of a DataFrame to JSON

VegDork
April 27, 2023
259 views
1 vote
2 Answers

I have a dataframe that comes from an API. Its string version looks like this:

0

2023-04-17 4.82

2023-04-18 4.82

2023-04-19 4.82

2023-04-20 4.82

2023-04-21 4.81

When I call df.to_json(orient='records'), it looks like this (no good at all):

[{"0":4.82},{"0":4.82},{"0":4.82},{"0":4.82},{"0":4.81}]

What I’d really like is for it to look like this:

[{"Date":"2023-04-17", "Rate":4.82},{"Date":"2023-04-18", "Rate":4.82},{"Date":"2023-04-19", "Rate":4.82},{"Date":"2023-04-20", "Rate":4.82},{"Date":"2023-04-21", "Rate":4.81}]

There could potentially be many rows of data, so I need the conversion to perform well.

Answers

- YevhenKuzmych
- April 27, 2023 at 6:36 pm
- 0 votes
You can achieve the desired output by first converting the dataframe to a list of dictionaries and then using the json module to convert it to JSON format with the desired structure.

Here’s an example code snippet that should work:
```
import json

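# assuming your dataframe is named `df`, with the dates in the index
# and the rates in its only column
data = df.reset_index().to_dict(orient='records')

# iterate over the records and format the date and rate values
formatted_data = []
for record in data:
    # each record holds the former index value (the date) followed by the rate
    date, rate = record.values()
    # str(...)[:10] keeps "YYYY-MM-DD" for both string and Timestamp dates
    formatted_data.append({"Date": str(date)[:10], "Rate": float(rate)})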

# convert the formatted data to JSON
json_data = json.dumps(formatted_data)

# print the resulting JSON string
print(json_data)
```
This code first converts the dataframe to a list of dictionaries using the to_dict method with the orient parameter set to 'records'. It then iterates over each record in the list and builds a new dictionary with "Date" and "Rate" keys. Finally, it uses the json.dumps method to convert the formatted data to a JSON string.

Note that this approach loops over the records in Python, so on very large dataframes it will be slower than a vectorized pandas conversion. The built-in json module itself serializes the resulting list efficiently.
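As a rough sketch of a loop-free alternative, the same result can be built with pandas operations only. It assumes the dates live in the index (as strings or timestamps) and the rates in the single column; the sample `df` below is a hypothetical stand-in for the API result, not the asker's actual data.

```python
import json

import pandas as pd

# Hypothetical stand-in for the API result: dates in the index, rates in column 0.
df = pd.DataFrame(
    {0: [4.82, 4.82, 4.82, 4.82, 4.81]},
    index=pd.to_datetime(
        ["2023-04-17", "2023-04-18", "2023-04-19", "2023-04-20", "2023-04-21"]
    ),
)

# Build a two-column frame with the desired names, without a Python-level loop.
out = pd.DataFrame(
    {
        # Format the index as "YYYY-MM-DD" strings (works for string or datetime indexes).
        "Date": pd.to_datetime(df.index).strftime("%Y-%m-%d"),
        # Take the only column positionally, so its label (0 or "0") does not matter.
        "Rate": df.iloc[:, 0].to_numpy(),
    }
)

json_data = out.to_json(orient="records")
print(json_data)
```

Formatting the dates as strings before calling to_json avoids the default behaviour of serializing datetime values as epoch timestamps.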

- MichaelRuth
- April 27, 2023 at 7:03 pm
- 0 votes
Do the conversion while parsing the API response. This allows you to take advantage of any vectorization which pandas provides, particularly if you end up doing some more interesting things with the data.
```
import io
import pandas as pd
data = '''0

2023-04-17 4.82

2023-04-18 4.82

2023-04-19 4.82

2023-04-20 4.82

2023-04-21 4.81'''
df = pd.read_csv(
    io.StringIO(data),
    sep=' ',
    names=['Date', 'Rate'],
    dtype={'Date': str, 'Rate': float},
    skiprows=[0]
)
df.to_json(orient='records')
```
```
'[{"Date":"2023-04-17","Rate":4.82},{"Date":"2023-04-18","Rate":4.82},{"Date":"2023-04-19","Rate":4.82},{"Date":"2023-04-20","Rate":4.82},{"Date":"2023-04-21","Rate":4.81}]'
```
Explanation of arguments to pd.read_csv():
1. io.StringIO(data): wraps the API response string in a file-like object that read_csv can read
2. sep=' ': the separator between cells, here a single space
3. names=['Date', 'Rate']: column headers
4. dtype={'Date': str, 'Rate': float}: types to cast each column's values to
5. skiprows=[0]: which rows to omit; this omits the first row, since it's just a 0 and we don't want it in the result
Empty rows are skipped by default (skip_blank_lines=True).
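The arguments above can be wrapped in a small helper, sketched here under some assumptions. The function name `rates_text_to_json` is hypothetical, and the separator is swapped to the regular expression `\s+` so that runs of several spaces (common when a dataframe is printed with aligned columns) are still treated as one delimiter.

```python
import io
import json

import pandas as pd


def rates_text_to_json(text: str) -> str:
    """Parse 'date rate' lines (after a one-line header) into a JSON records string."""
    df = pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",              # one or more whitespace characters between cells
        names=["Date", "Rate"],
        dtype={"Date": str, "Rate": float},
        skiprows=[0],            # drop the leading "0" header line
        skip_blank_lines=True,   # the default, spelled out for clarity
    )
    return df.to_json(orient="records")


# Hypothetical sample with uneven spacing and blank lines between rows.
sample = "0\n\n2023-04-17    4.82\n\n2023-04-18 4.82\n"
result = rates_text_to_json(sample)
print(json.loads(result))
```

Round-tripping the result through json.loads is a quick way to confirm that the output is valid JSON with the expected keys.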