Write each row of CSV to its own JSON; rows are formatted for JSON already

longhorn
December 19, 2024
124 views
0 votes
2 Answers

I have a CSV file in which each row is a string that is already formatted for JSON. Some of the elements are nested within a given key. Here is a simplified sample row:

{"key1":"string_value","key2":["string_value"],"key4":integer,"key5":[{"nested_key1":"string_value","nested_key2":boolean}],"key6":integer}

I need to take each row and write it to its own JSON file, named row_#.json.

The closest I’ve come is with this code:


```
import csv

csv_file = 'path/to/my/file.csv'

with open(csv_file, mode='r') as infile:
    for i, line in enumerate(infile):
        with open(f"row_{i}.json", "w") as outfile:
            outfile.write(line)
```

In the output files, every key and every string value is wrapped in an additional set of double quotes, and the entire file contents are wrapped in quotes as well. Using the example from above, I end up with:

"{""key1"":""string_value"",""key2"":[""string_value""],""key4"":integer,""key5"":[{""nested_key1"":""string_value"",""nested_key2"":boolean}],""key6"":integer}"

How can I output the original row contents without this extra formatting? Can I do it without reading in the outputted JSON files and trying to replace the strings? Note that I’m not able to go back to the source and format as a proper CSV before reading the file into Python.

Tags: json python

Answers

- GoPackGo
- December 18, 2024 at 11:35 pm
- 0 votes
Reading a file line by line does not add quotes by itself; Python writes back exactly the text it read. If each line of the raw file holds unquoted JSON, treat the line as a raw string, strip the surrounding whitespace, and write it out without any CSV parsing. If the doubled quotes are already present in the raw file, this approach copies them through unchanged.
```
# Path to your CSV file
csv_file = 'path/to/my/file.csv'

# Open the CSV file
with open(csv_file, mode='r') as infile:
    # Use enumerate to keep track of row numbers
    for i, line in enumerate(infile):
        # Strip any leading/trailing whitespace or newline characters
        json_content = line.strip()
        
        # Write each row as its own JSON file
        with open(f"row_{i}.json", "w") as outfile:
            # Write the raw JSON string without adding additional quotes
            outfile.write(json_content)
```

- Barmar
- December 18, 2024 at 11:44 pm
- 0 votes
Since the original file is supposed to be a CSV, I suspect those extra quotes are in the file; they’re needed to handle commas and quotes nested in the field value. I’m guessing you don’t see them because you’re viewing the file with a spreadsheet application, not looking at the raw file contents.
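
To illustrate the quoting described above, here is a minimal sketch that round-trips a JSON string through Python's `csv` module in memory. The sample row is hypothetical (concrete values stand in for the question's `integer` and `boolean` placeholders), and it assumes the file was produced by a standard CSV exporter with default quoting.

```python
import csv
import io

# Hypothetical sample row; real values replace the question's placeholders.
json_text = '{"key1":"string_value","key2":["string_value"],"key4":1}'

# Write the JSON text as a single CSV field, as a CSV exporter would.
buffer = io.StringIO()
csv.writer(buffer).writerow([json_text])
raw_line = buffer.getvalue().strip()
print(raw_line)  # the field is wrapped in quotes and inner quotes are doubled

# Parsing the raw line with csv.reader restores the original JSON text.
parsed_field = next(csv.reader([raw_line]))[0]
print(parsed_field == json_text)
```

The printed raw line has the same shape as the unwanted output in the question, which suggests the quotes come from the CSV encoding rather than from Python's file writing.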

So you need to use a CSV reader to parse the file, then write the raw text to the output files.
```
import csv

csv_file = 'path/to/my/file.csv'

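# newline='' lets the csv module handle line endings itself
with open(csv_file, mode='r', newline='') as infile:
    in_csv = csv.reader(infile)
    for i, line in enumerate(in_csv):
        with open(f"row_{i}.json", "w") as outfile:
            # line[0] is the unescaped JSON text of the row's only field
            outfile.write(line[0] + "\n")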
```
line[0] is needed because csv.reader() parses each row into a list of fields.
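
A slightly more defensive variant of the same idea might look like the sketch below. The function name `split_rows`, the `out_dir` parameter, the UTF-8 encoding, and the `json.loads` validation step are all assumptions added for illustration, not part of the original answer. It also skips blank lines, for which `csv.reader` yields an empty list and `line[0]` would raise an `IndexError`.

```python
import csv
import json
from pathlib import Path


def split_rows(csv_path, out_dir):
    """Write the first field of each CSV row to out_dir/row_<i>.json."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # newline='' lets the csv module handle line endings itself.
    with open(csv_path, mode="r", newline="", encoding="utf-8") as infile:
        for i, row in enumerate(csv.reader(infile)):
            if not row:
                continue  # csv.reader yields [] for blank lines
            text = row[0]
            json.loads(text)  # raises json.JSONDecodeError if not valid JSON
            target = out_dir / f"row_{i}.json"
            target.write_text(text + "\n", encoding="utf-8")
            written.append(target)
    return written
```

Calling `split_rows('path/to/my/file.csv', 'out')` would write `out/row_0.json`, `out/row_1.json`, and so on. The validation line can be dropped if the rows are trusted; it is there only to fail early on a row that did not unescape into valid JSON.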
