
I’m trying to write a program that will stream tweets from Twitter using their Stream API and Tweepy. Here’s the relevant part of my code:

def on_data(self, data):
    if data.user.id == "25073877" or data.in_reply_to_user_id == "25073877":
        self.filename = trump.csv

    elif data.user.id == "30354991" or data.in_reply_to_user_id == "30354991":
        self.filename = harris.csv

    if not 'RT @' in data.text:
        csvFile = open(self.filename, 'a')
        csvWriter = csv.write(csvFile)

        print(data.text)
        try:
            csvWriter.writerow([data.text, data.created_at, data.user.id, data.user.screen_name,  data.in_reply_to_status_id])

        except:
            pass

def on_error(self, status_code):
    if status_code == 420:
        return False

What the code should be doing is streaming the tweets and writing the text of the tweet, the creation date, the user ID of the tweeter, their screen name, and the reply ID of the status they’re replying to if the tweet is a reply. However, I get the following error:

File "test.py", line 13, in on_data

 if data.user.id == "25073877" or data.in_reply_to_user_id == "25073877":

AttributeError: 'unicode' object has no attribute 'user'

Could someone help me out? Thanks!

EDIT: Sample of what is being read into “data”

{"created_at":"Fri Feb 15 20:50:46 +0000 2019","id":1096512164347760651,"id_str":"1096512164347760651","text":"@realDonaldTrump \nhttps://t.co/NPwSuJ6V2M","source":"\u003ca href=\"http://twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":25073877,"in_reply_to_user_id_str":"25073877","in_reply_to_screen_name":"realDonaldTrump","user":{"id":1050189031743598592,"id_str":"1050189031743598592","name":"Lauren","screen_name":"switcherooskido","location":"United States","url":null,"description":"Concerned citizen of the USA who would like to see Integrity restored in the US Government. Anti-marxist!\nSigma, INTP/J\nREJECT PC and Identity Politics #WWG1WGA","translator_type":"none","protected":false,"verified":false,"followers_count":1459,"friends_count":1906,"listed_count":0,"favourites_count":5311,"statuses_count":8946,"created_at":"Thu Oct 11 00:59:11 +0000 2018","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"FF691F","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http://pbs.twimg.com/profile_images/1068591478329495558/ng_tNAXx_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/1068591478329495558/ng_tNAXx_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/1050189031743598592/1541441602","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https://t.co/NPwSuJ6V2M","expanded_url":"https://www.conservativereview.com/news/5-insane-provisions-amnesty-omnibus-bill/","display_url":"conservativereview.com/news/5-insane-\u2026","indices":[18,41]}],"user_mentions":[{"screen_name":"realDonaldTrump","name":"Donald J. Trump","id":25073877,"id_str":"25073877","indices":[0,16]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"und","timestamp_ms":"1550263846848"}

So I suppose the revised question is: how do I tell the program to write only parts of this JSON output to the CSV file? I’ve been using the reference in Twitter’s streaming API documentation for the attributes of “data”.


Answers


  1. As stated in your comment, the tweet data is in “JSON format”. I believe what you mean by this is that it is a string (unicode) containing JSON, not a parsed JSON object, which is why attribute access such as data.user fails. To access the fields the way your code does, you first need to parse the data string using the json module.

    e.g.

    import json
    
    json_data_object = json.loads(data)
    

    You can then access the fields as you would with a dictionary, e.g.

    json_data_object['some_key']['some_other_key']
    
  2. This is a very late answer, but I’m answering here because this is the first search hit for this error. I was also using Tweepy and found that I could not access the attributes I expected on the response object:

    'Response' object has no attribute 'text'
    

    Through lots of tinkering and research, I found that when you loop over the result of a Tweepy API call, you must put ‘.data’ in the for statement itself (iterate over tweets.data), not on each item inside the loop body.
    For example:

    tweets = client.search_recent_tweets(query="covid", tweet_fields=['text'])
    for tweet in tweets:
      print(tweet.text) # or print(tweet.data.text)
    

    This will not work because the Response object itself doesn’t expose the attributes of the tweets in the JSON response. Instead, you do something like:

    tweets = client.search_recent_tweets(query="covid", tweet_fields=['text'])
    for tweet in tweets.data:
      print(tweet.text)
    

    Basically, this was a long-winded way to fix a problem I was having for a long time. Cheers, hopefully, other noobs like me won’t have to struggle as long as I did!

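To make the first answer's suggestion concrete, here is a minimal sketch of how the handler from the question could parse the raw string and write only the wanted fields to a CSV file. It assumes Python 3 and that the stream hands over each message as a JSON string, as the error in the question suggests. The helper names (`TARGET_FILES`, `pick_filename`, `tweet_to_row`, `handle_raw_tweet`) are made up for illustration and are not part of Tweepy. Note that the IDs in the JSON are integers, so they are compared as integers here, and the file names are quoted strings.

```python
import csv
import json

# Hypothetical mapping from the account IDs in the question to output files.
# The IDs are ints because the JSON fields "id" and "in_reply_to_user_id" are numbers.
TARGET_FILES = {25073877: "trump.csv", 30354991: "harris.csv"}


def pick_filename(tweet):
    """Return the CSV file for a parsed tweet dict, or None if no target matches."""
    user_id = tweet.get("user", {}).get("id")
    reply_to = tweet.get("in_reply_to_user_id")
    for target_id, filename in TARGET_FILES.items():
        if user_id == target_id or reply_to == target_id:
            return filename
    return None


def tweet_to_row(tweet):
    """Pick out only the fields the question wants to store."""
    return [
        tweet["text"],
        tweet["created_at"],
        tweet["user"]["id"],
        tweet["user"]["screen_name"],
        tweet.get("in_reply_to_status_id"),
    ]


def handle_raw_tweet(raw):
    """Parse a raw JSON string and append one CSV row; return the file used, or None."""
    tweet = json.loads(raw)  # str -> dict, so fields are reachable by key
    filename = pick_filename(tweet)
    if filename is None or "RT @" in tweet.get("text", ""):
        return None  # not a tracked account, or a retweet
    with open(filename, "a", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerow(tweet_to_row(tweet))
    return filename
```

Inside the listener, `on_data` could then simply call `handle_raw_tweet(data)`. Opening the file in a `with` block also closes it after each row, which the original code never did.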
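For the second answer, the point about `.data` can be illustrated without calling the API. As far as I know, Tweepy's `Response` is a named tuple with the fields `data`, `includes`, `errors` and `meta`; the sketch below uses a stand-in with the same field names rather than the real class, and `Tweet` and `tweet_texts` are hypothetical names for illustration only. It also assumes that `data` can be `None` when nothing matched, so that case is guarded.

```python
from collections import namedtuple

# Stand-ins for illustration only, NOT the real Tweepy classes.
Response = namedtuple("Response", ("data", "includes", "errors", "meta"))
Tweet = namedtuple("Tweet", ("id", "text"))


def tweet_texts(response):
    """Collect the text of every tweet held in response.data."""
    # Assumption: data may be None when the query matched nothing.
    return [tweet.text for tweet in (response.data or [])]


# A fake response shaped like the result of a search call.
fake = Response(
    data=[Tweet(1, "first tweet"), Tweet(2, "second tweet")],
    includes={},
    errors=[],
    meta={"result_count": 2},
)

for text in tweet_texts(fake):
    print(text)
```

The wrapper itself has no `text` attribute, which matches the error message quoted in the answer. Only the items inside `data` carry the tweet fields.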