I’m quite new to Python and coding in general, and I’m having difficulty understanding how to interact with streamed data from the Twitter API using Tweepy.
Here’s my example code, which prints any new tweet that the specified user makes:
import tweepy

# Authenticate with the Twitter API (credentials redacted)
auth = tweepy.OAuthHandler("****", "****")
auth.set_access_token("****", "****")
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        # Called once for every new tweet delivered by the stream
        print(status.text)

myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)
# Follow a single user by their ID
myStream.filter(follow=['user_id_goes_here'])
If I want to do something such as check if a certain word exists inside each tweet as they are made, I do not know how, given it is a constant stream of data.
How does one analyze each tweet as it is delivered and parse it?
2 Answers
I believe I have found what I am looking for: the analysis of the status needs to happen within the on_status method, for example:
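A minimal sketch of that idea, assuming the goal is a case-insensitive whole-word check. The names KEYWORD, contains_word and KeywordListener are hypothetical, and the listener deliberately does not inherit from tweepy.StreamListener so that the snippet runs without Tweepy or a network connection; in the question's code the same on_status body would go inside MyStreamListener.

```python
import re
from types import SimpleNamespace

KEYWORD = "python"  # hypothetical word to look for in each tweet


def contains_word(text, word):
    """Return True if `word` occurs in `text` as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


class KeywordListener:
    # Assumption: in real use this would subclass tweepy.StreamListener
    # (and call super().__init__()); the base class is omitted here so the
    # sketch can be run on its own.
    def __init__(self):
        self.matches = []

    def on_status(self, status):
        # The stream calls this once per tweet, so each tweet can be
        # examined individually even though the stream never ends.
        if contains_word(status.text, KEYWORD):
            self.matches.append(status.text)
            print("match:", status.text)


# Stand-in objects with a .text attribute, simulating two delivered tweets.
listener = KeywordListener()
listener.on_status(SimpleNamespace(text="Learning Python with Tweepy"))
listener.on_status(SimpleNamespace(text="Nothing relevant here"))
```

Keeping the check in a separate function means it can be tried on plain strings before it is wired into a live stream.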
The tweepy documentation on streaming is very limited, but it does say
so searching for that file in the tweepy github repository we find
https://github.com/tweepy/tweepy/blob/master/tweepy/streaming.py
There you can find the method on_status and see that status should be an instance of the class Status.
Looking at the twitter API documentation reveals that
Unfortunately, looking at the source code for Status or the tweepy documentation does not yield much information.
Looking at the Twitter API documentation for the tweet object:
https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet
we should expect a field called text that holds the actual text of the tweet.
Another thing we can try is setting a breakpoint and then inspecting the variable status in the debugger to see what fields it has (this is done a lot in Python because of its dynamic nature).
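As a rough illustration of that kind of inspection, the sketch below lists the public data attributes of an object. The helper describe_fields and the fake_status stand-in are hypothetical; with a live stream one would instead call describe_fields(status), or the built-in breakpoint() (Python 3.7+), from inside on_status.

```python
from types import SimpleNamespace


def describe_fields(obj):
    """Return a sorted list of the public, non-callable attribute names of `obj`."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and not callable(getattr(obj, name))
    )


# Stand-in for a delivered status; a real one would carry many more fields.
fake_status = SimpleNamespace(text="hello world", id=1, lang="en")
print(describe_fields(fake_status))
```

This only shows which attribute names exist on the object; what each field means still has to come from the Twitter API documentation linked above.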