Analyzing streamed data from Twitter API using Tweepy

Dude
August 16, 2021
2 Answers

I’m quite new to Python and to coding in general, and I’m having difficulty understanding how to work with streamed data from the Twitter API using Tweepy.

Here’s my example code, which prints any new tweet that the specified user makes.

```
import tweepy

auth = tweepy.OAuthHandler("****", "****")
auth.set_access_token("****", "****")

api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)


class MyStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        # Called for each new tweet delivered by the stream
        print(status.text)


myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)

myStream.filter(follow=['user_id_goes_here'])
```

If I want to do something such as check whether a certain word appears in each tweet as it is posted, I do not know how, given that the data arrives as a constant stream.

How does one analyze each tweet as it is delivered and parse it?

Answers

Chosen as BEST ANSWER
- Dude
- August 16, 2021 at 12:53 pm
- 0 votes
I believe I have found what I am looking for: the analysis of the status needs to happen inside the on_status method. For example:
```
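keyword = "python"  # example word to look for (placeholder)

class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        # Called once per tweet, so per-tweet analysis belongs here
        if keyword in status.text:
            print("keyword found")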
```
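
A plain `in` test is case-sensitive and also matches inside longer words. As a minimal sketch (the helper name `contains_keyword` and the sample strings are hypothetical, not part of Tweepy), the check could be pulled into a small function that on_status calls with status.text:

```python
import re


def contains_keyword(text, keyword):
    """Return True if `keyword` appears in `text` as a whole word, ignoring case."""
    # re.escape keeps punctuation in the keyword from being read as regex syntax.
    # The lookarounds require that no word character touches the match on either
    # side, which also works for keywords such as "#python".
    pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Stand-in strings; inside on_status you would pass status.text instead.
print(contains_keyword("Learning Python today", "python"))     # True
print(contains_keyword("Monty Pythonesque humour", "python"))  # False
```

Keeping the test in its own function also makes it possible to try it on ordinary strings without opening a stream.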


- OranShuster
- August 16, 2021 at 11:32 am
- 0 votes
The Tweepy documentation on streaming is very limited, but it does say:

This page aims to help you get started using Twitter streams with Tweepy by offering a first walk through. Some features of Tweepy streaming are not covered here. See streaming.py in the Tweepy source code.

So, searching for that file in the Tweepy GitHub repository, we find:
https://github.com/tweepy/tweepy/blob/master/tweepy/streaming.py

There you can find the on_status method and see that status should be an instance of the Status class.

Looking at the Twitter API documentation reveals that:

Tweets are also known as “status updates.”

Unfortunately, neither the source code for Status nor the Tweepy documentation yields much information.

Looking at the Twitter API documentation for the tweet object:
https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet
we should expect a field called text that holds the tweet’s actual text.
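
One caveat, as far as I know: for tweets longer than 140 characters delivered over the stream, text can be truncated, with the complete text stored under extended_tweet. A minimal sketch assuming that layout; `get_full_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real Status objects:

```python
from types import SimpleNamespace


def get_full_text(status):
    """Return the complete tweet text, falling back to `status.text`."""
    # Assumption: long tweets carry an `extended_tweet` dict with a "full_text" key.
    extended = getattr(status, "extended_tweet", None)
    if extended and "full_text" in extended:
        return extended["full_text"]
    return status.text


# Stand-ins for the `status` argument of on_status.
short_status = SimpleNamespace(text="hello")
long_status = SimpleNamespace(
    text="a long twe...",
    extended_tweet={"full_text": "a long tweet in full"},
)
print(get_full_text(short_status))
print(get_full_text(long_status))
```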

Another thing we can try is setting a breakpoint and inspecting the status variable in the debugger to see what fields it has (this is common in Python because of its dynamic nature).
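
If a debugger is not at hand, the same inspection can be sketched in code. This assumes only that the object stores its fields as ordinary instance attributes; `public_fields` is a hypothetical helper, and the `SimpleNamespace` is a stand-in for a real status:

```python
from types import SimpleNamespace


def public_fields(obj):
    """Return the sorted names of an object's public instance attributes."""
    # vars() returns the instance's attribute dictionary; names starting
    # with an underscore are treated as private and skipped.
    return sorted(name for name in vars(obj) if not name.startswith("_"))


# Stand-in for the `status` argument of on_status.
fake_status = SimpleNamespace(id=1, text="hello world", lang="en", _json={})
print(public_fields(fake_status))  # ['id', 'lang', 'text']
```

Calling `public_fields(status)` once inside on_status and printing the result should give a quick list of names to explore further.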
