
I have a JSON file with all the data I retrieved using the Twitter API. I only want to keep certain keys in that file and delete the rest, but I don’t know how to do it correctly. I wrote some code to do so, but I get the error "list indices must be integers or slices, not str":

import json

def read_json(json_file:str)->list:
    tweets_data=[]
    for tweet in open(json_file, 'r'):
        tweets_data.append(json.loads(tweet))
    return tweets_data
tags = ["created_at", "full_text"]
tweet_list=read_json('test.json')

for i in tweet_list["tweet"].keys():
    if i not in tags:
        del tweet_list["tweet"][i]
        
print (tweet_list[0])

2 Answers


  1. import json

    def read_json(json_file: str) -> list:
        # Parse one JSON object per line of the file
        tweets = []
        with open(json_file, 'r') as f:
            for line in f:
                tweets.append(json.loads(line))
        return tweets

    tags = ['created_at', 'full_text']
    tweets = read_json('tweets.json')
    for tweet in tweets:
        # Copy the keys into a list so deleting does not disturb the iteration
        for key in list(tweet.keys()):
            if key not in tags:
                del tweet[key]
    
  2. You’ve gotten really close to something working, but have a bug in the filtering section.

    Your read_json() function returns a Python list, which is assigned to tweet_list.

    The error message refers to tweet_list["tweet"]. A list can only be indexed with integers or slices (such as 0 or 1:3), not with a string like "tweet".

    If you change your code to use the number 0, and copy the keys into a list so the dictionary is not modified while you iterate over it:

    for i in list(tweet_list[0].keys()):
        if i not in tags:
            del tweet_list[0][i]
    

    It will filter out the unwanted keys from the first element in the list.

    To filter every element in the list, you need to iterate over all of them with another for loop.

    for i in range(len(tweet_list)):
        # list(...) takes a snapshot of the keys, so deleting inside the loop is safe
        for key in list(tweet_list[i].keys()):
            if key not in tags:
                del tweet_list[i][key]
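
A variation on the loops above builds a new dictionary per tweet instead of deleting keys in place, which sidesteps the problem of changing a dictionary while iterating over it. This is only a minimal sketch, assuming each line of the file holds one JSON object, as read_json() in the question expects. The names keep_keys and filter_tweets and the sample lines are hypothetical, made up for illustration.

```python
import json

# Keys to keep, taken from the question's `tags` list
KEEP = ("created_at", "full_text")

def keep_keys(record: dict, keys=KEEP) -> dict:
    """Return a new dict holding only the wanted keys that are present."""
    return {k: record[k] for k in keys if k in record}

def filter_tweets(lines, keys=KEEP) -> list:
    """Parse one JSON object per non-empty line and keep only `keys`."""
    return [keep_keys(json.loads(line), keys) for line in lines if line.strip()]

# Hypothetical sample data standing in for the lines of test.json
sample_lines = [
    '{"created_at": "Mon Jan 01", "full_text": "hello", "id": 1}',
    '{"created_at": "Tue Jan 02", "full_text": "world", "lang": "en"}',
]

filtered = filter_tweets(sample_lines)
print(filtered[0])  # {'created_at': 'Mon Jan 01', 'full_text': 'hello'}
```

To run it on a real file, pass the open file object in place of sample_lines, since iterating over a file yields its lines one at a time.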
    