I’m very new to Python and currently working on a personal project using JSON files downloaded from Facebook, all from one Messenger chat. I keep getting an error saying df is not defined, even though I define df.
I have multiple JSON files that I am trying to read into one dataframe. I created a loop to do this, and I define df within that loop. When I then call df to see it, it says it's not defined. My code is below:
import pandas as pd
import json, glob, os
import numpy as np

file_path = "work/Desktop/fb data/jsonmssg/message_1.json"
file_dir = "work/Desktop/fb data/jsonmssg"
json_pattern = os.path.join(file_dir, '*.json')
file_list = glob.glob(json_pattern)

dfs = []
for f in file_list:
    with open(file) as file:
        chat_history = json.loads(file.read())
        json_data = pd.json_normalize(chat_history['messages'])
    dfs.append(json_data)
    df = pd.concat(dfs)

df
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-74-00cf07b74dcd> in <module>
----> 1 df
NameError: name 'df' is not defined
Does anyone know how I can fix this?
I tried creating a loop that would look through all JSON files in the same directory, expecting it to concatenate them into one dataframe. It didn't work.
2 Answers
First of all, when you need a list of path names for files that are in different directories, including directories inside directories, you should use the recursive flag of glob.glob(). I’m unsure if that is your case, though.

Secondly, because of the way you define your path, the pattern you pass to glob.glob() could be invalid. That’s because different operating systems have different path name conventions, which affect os.path.join(). For example, on my OS os.path.join('work/Desktop/fb data/jsonmssg', '*.json') results in 'work/Desktop/fb data/jsonmssg\*.json'. Note the inconsistent '/' and '\' separators.

You could rely on os.path.join() to build a proper path name for your OS, for example by passing each directory name as a separate argument.
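A minimal sketch of that idea, assuming the same directory names as in the question: each path component is passed to os.path.join() separately so the OS picks the separator, and a "**" pattern with recursive=True is used only if the files sit in nested folders. The helper name build_json_pattern is hypothetical and only here for illustration.

```python
import glob
import os


def build_json_pattern(*parts, recursive=False):
    """Join directory names with the OS separator and append a *.json pattern.

    Hypothetical helper for illustration; not part of the original question.
    """
    base = os.path.join(*parts)
    if recursive:
        # '**' matches any depth of subdirectories, but only when
        # recursive=True is also passed to glob.glob()
        return os.path.join(base, "**", "*.json")
    return os.path.join(base, "*.json")


# Directory names taken from the question; adjust them to your own layout.
json_pattern = build_json_pattern("work", "Desktop", "fb data", "jsonmssg")
file_list = glob.glob(json_pattern)

# Variant for JSON files spread over nested folders.
nested_pattern = build_json_pattern(
    "work", "Desktop", "fb data", "jsonmssg", recursive=True
)
nested_files = glob.glob(nested_pattern, recursive=True)

print(json_pattern)
print(len(file_list), "files found")
```

If the printed count is 0, the loop in the question never runs, which would explain why df is never assigned.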
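Before proceeding, be sure to check that the directory exists with os.path.exists(file_dir). Checking json_pattern itself would normally return False, because os.path.exists() does not expand the '*' wildcard.

First of all, I’m not sure glob is your best bet here. At the very least, it’s not what I would have done. You only need glob when you want to match nested directories with "**".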
How about:
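One possible shape for this, as a sketch rather than a definitive implementation: list the directory with os.listdir(), keep the .json files, and concatenate once after the loop. It assumes every file has a top-level 'messages' list, as in the question's code. The function name load_messages is hypothetical.

```python
import json
import os

import pandas as pd


def load_messages(file_dir):
    """Read every .json file in file_dir and stack their 'messages' into one DataFrame.

    Hypothetical helper; assumes each file has a top-level 'messages' list,
    as in the question's code.
    """
    frames = []
    for name in sorted(os.listdir(file_dir)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(file_dir, name), encoding="utf-8") as fh:
            chat_history = json.load(fh)
        frames.append(pd.json_normalize(chat_history["messages"]))
    if not frames:
        # pd.concat([]) raises a less obvious ValueError, so fail early instead
        raise FileNotFoundError(f"no .json files found in {file_dir!r}")
    # Concatenate once, after the loop, so a DataFrame is always returned
    return pd.concat(frames, ignore_index=True)
```

You would then call df = load_messages("work/Desktop/fb data/jsonmssg"). If the directory is wrong or contains no JSON files, the call fails with a clear error instead of silently leaving df undefined.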
Hope that helps simplify your life 🙂