I’m trying to read a CSV file with multiple columns. One of them is named ‘answer’, and its values are essentially Python dictionaries stored as strings.
Values include:
{'number': '2', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': []}
and
{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ['4-yard', '31-yard']}
When printed to the console, one particular row’s value in the ‘answer’ column is displayed as
'{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run', 'by rookie RB Joseph Addai.']}'
In the csv, it looks like
{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run', 'by rookie RB Joseph Addai.']}
After my attempt to convert it to a valid JSON string, printing it to the console gives
'{"number": "", "date": {"day": "", "month": "", "year": ""}, "spans": ["Barlow"s 1-yard touchdown run", "2-yard touchdown run", "by rookie RB Joseph Addai."]}'
To convert the strings to dictionaries, I tried using the json.loads(string) method.
Here is what I did:
for i in range(df.shape[0]):
    dict = df.iloc[i]['answer']
    # To convert it to a valid JSON string
    dict = dict.replace("'", '"')
    dict = json.loads(dict)
    ans[i] = dict['number']
The following error appears for the third example given above, but not the other two:
JSONDecodeError: Expecting ',' delimiter: line 1 column 80 (char 79)
It fails to convert the string into a dictionary for reasons unknown to me.
What can I do to rectify this error?
Is there any method to read the ‘answer’ column as a dictionary, instead of having to read it as a string and then convert said string to a dictionary?
2 Answers
I found that the strings, although they look like JSON, are actually in the exact format of a Python dictionary literal and can be parsed directly with the built-in eval() function.
Using the template of Serge above,
Which printed
with absolutely no errors.
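A minimal sketch of this approach, assuming the column values are plain Python literals like the ones in the question. It swaps in ast.literal_eval from the standard library in place of eval(): literal_eval accepts only literals, so it will not execute arbitrary code that happens to be in the file. The sample strings are copied from the question; the names rows, parsed and numbers are illustrative.

```python
import ast

# Sample values copied from the question's 'answer' column.
rows = [
    "{'number': '2', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': []}",
    "{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ['4-yard', '31-yard']}",
    "{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, "
    "'spans': [\"Barlow's 1-yard touchdown run\", '2-yard touchdown run', "
    "'by rookie RB Joseph Addai.']}",
]

# Parse each string as a Python literal; no quote replacement is needed,
# so the apostrophe in "Barlow's" survives untouched.
parsed = [ast.literal_eval(row) for row in rows]

numbers = [d['number'] for d in parsed]
print(numbers)                 # ['2', '', '']
print(parsed[2]['spans'][0])   # Barlow's 1-yard touchdown run
```

With pandas, the same function can be applied to the column with df['answer'].apply(ast.literal_eval), or passed at read time as converters={'answer': ast.literal_eval} to pd.read_csv, so the column arrives as dictionaries rather than strings.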
As I mentioned in my comment, your error
JSONDecodeError: Expecting ',' delimiter: line 1 column 80 (char 79)
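comes from the "Barlow"s" fragment: the blanket replace turned the apostrophe into a double quote, which ends the JSON string early. To avoid these errors you’ll need to replace any instance of " that occurs between letters. Here is a function that will handle these cases.
This will give you: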
Here is the print-out of the changes
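The function itself is not reproduced in this thread, so what follows is only a sketch of the idea under stated assumptions, not the original code. It assumes that a double quote with a letter directly on each side was an apostrophe before the replace, and restores it. The name to_json_string is hypothetical.

```python
import json
import re

def to_json_string(raw):
    """Hypothetical helper: turn a Python-dict-style string into JSON text."""
    # Swap every single quote for a double quote, as in the question.
    swapped = raw.replace("'", '"')
    # Assumption: a double quote with a letter on both sides was an
    # apostrophe inside a word (e.g. Barlow"s), so restore it.
    return re.sub(r'(?<=[A-Za-z])"(?=[A-Za-z])', "'", swapped)

raw = (
    "{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, "
    "'spans': [\"Barlow's 1-yard touchdown run\", '2-yard touchdown run', "
    "'by rookie RB Joseph Addai.']}"
)

fixed = to_json_string(raw)
answer = json.loads(fixed)
print(answer['spans'][0])  # Barlow's 1-yard touchdown run
```

This only covers apostrophes between two letters. An apostrophe at the end of a word, or a value that already contains a double quote, would still break json.loads.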