I want to load JSON data mined from the Twitter API into Python. Here is a sample of the JSON objects:
{"created_at":"Mon Apr 22 18:17:09 +0000 2019","id":1120391103813910529,"id_str":"1120391103813910529","text":"On peut dire que la base de cette 8e saison est en place \ud83d\ude4c #GOTS8E2","source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":243071138,"id_str":"243071138","name":"Mr B","screen_name":"skeyos","location":"Namur","url":null,"description":null,"translator_type":"none","protected":false,"verified":false,"followers_count":197,"friends_count":1811,"listed_count":6,"favourites_count":7826,"statuses_count":8044,"created_at":"Wed Jan 26 06:49:05 +0000 2011","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"fr","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/493833348167770112/aGLGemZ5_normal.jpeg","profile_image_url_https":"https://pbs.twimg.com/profile_images/493833348167770112/aGLGemZ5_normal.jpeg","profile_banner_url":"https://pbs.twimg.com/profile_banners/243071138/1406574068","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"GOTS8E2","indices":[59,67]}],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"fr","timestamp_ms":"1555957029666"}
{"created_at":"Mon Apr 22 18:17:14 +0000 2019","id":1120391124722565123,"id_str":"1120391124722565123","text":"...
I am trying the following code:
    import json
    with open('tweets.json') as tweet_data:
        json_data = json.load(tweet_data)
But I get the following error:
JSONDecodeError: Extra data: line 3 column 1 (char 2149)
Unfortunately, I can't edit the JSON file much, as it is really big. I need to figure out how to read it into Python. Any help would be greatly appreciated!
Edit: It works with the following code:
    import json

    dat = []
    with open('data_tweets_E2.json', 'r') as f:
        for line in f:
            if not line.strip():  # skip empty lines
                continue
            dat.append(json.loads(line))  # one JSON object per line
3 Answers
Every line contains a new object, so try parsing them line by line.
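Because the file is described as really big, one way to parse it line by line without holding everything in memory is a generator that yields one object at a time. This is only a sketch: the helper name iter_json_lines and the sample lines are made up for illustration, and it assumes every non-empty line is one complete JSON object.

```python
import json


def iter_json_lines(lines):
    """Yield one parsed object per non-empty line.

    `lines` can be any iterable of strings, e.g. an open file object.
    (Hypothetical helper, not part of the json module.)
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:  # skip blank lines between objects
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Say which line is broken instead of only reporting an offset.
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc


# Made-up sample standing in for the lines of the tweets file.
sample_lines = [
    '{"id": 1, "text": "first tweet"}\n',
    '\n',
    '{"id": 2, "text": "second tweet"}\n',
]

tweet_ids = [tweet["id"] for tweet in iter_json_lines(sample_lines)]
print(tweet_ids)
```

With a real file, the open file object can be passed straight in, since iterating over it yields one line at a time.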
Each line contains a separate JSON object; parse each one and store the results in a list.
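A minimal sketch of that idea, assuming a UTF-8 file with one object per line and possibly blank lines in between. The function name load_json_lines and the tiny temporary file are made up so the example can run on its own; in practice you would pass the path of your tweets file.

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Return a list with one parsed object per non-empty line of `path`."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write a tiny made-up file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"id": 1, "lang": "fr"}\n\n{"id": 2, "lang": "en"}\n')
    sample_path = tmp.name

data_list = load_json_lines(sample_path)
os.remove(sample_path)  # clean up the temporary file

print(len(data_list), data_list[0]["lang"])
```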
Here is the code. You need to install pandas first, of course. If the solution helped you, please mark this answer with the green check.
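The code for this answer is not shown here, so what follows is only a sketch of the approach it describes, reusing the names data_list and tweet_data_frame from the answer. The sample lines are made up, and it assumes pandas is installed.

```python
import json

import pandas as pd

# Made-up lines standing in for the contents of the tweets file.
raw_lines = [
    '{"id": 1, "lang": "fr", "retweet_count": 0}',
    '',
    '{"id": 2, "lang": "en", "retweet_count": 3}',
]

# One dict per non-empty line.
data_list = [json.loads(line) for line in raw_lines if line.strip()]

# Each dict becomes a row; the dict keys become the columns.
tweet_data_frame = pd.DataFrame(data_list)

print(data_list)
print(tweet_data_frame)
```

pandas can also read such a file directly with pd.read_json(path, lines=True), which treats each line as a separate JSON object.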
As you can see, print(data_list) prints a list and print(tweet_data_frame) prints a DataFrame. If you want to see the types of these variables, just use type(), for example print(type(data_list)).
Important: What I am trying to tell you is that your file is not valid as a single JSON document. If a file holds more than one JSON object, the objects need to be in an array, for example [{"example":"value"},{"example":"value"}]. As written, your file has several top-level objects one after another, which is why json.load fails on it.
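To illustrate the difference between an array and one object per line, here is a small sketch with made-up strings. It only uses the standard json module; the variable names are for illustration.

```python
import json

array_text = '[{"example":"value"},{"example":"value"}]'
lines_text = '{"example":"value"}\n{"example":"value"}'

# A single array is one JSON document, so it parses in one call.
parsed_array = json.loads(array_text)

# Two objects back to back are not one document: the parser reads the
# first object and then complains about what follows it.
try:
    json.loads(lines_text)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = exc.msg

# Parsing each line separately gives the same result as the array.
parsed_lines = [json.loads(line) for line in lines_text.splitlines() if line.strip()]

print(parsed_array == parsed_lines, error_message)
```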