I have the following code that retrieves Twitter Streaming data and creates a JSON file. What I'd like to do is stop the data collection after, e.g., 1000 tweets. How can I set up the code to do that?
#Import the necessary methods from tweepy library
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
# Other libs
import json
#Variables that contain the user credentials to access the Twitter API
access_token = "XXX"
access_token_secret = "XXX"
consumer_key = "XXX"
consumer_secret = "XXX"
#This is a basic listener that just prints received tweets to stdout.
class StdOutListener(StreamListener):
    def on_data(self, data):
        try:
            tweet = json.loads(data)
            with open('your_data.json', 'a') as my_file:
                json.dump(tweet, my_file)
        except BaseException:
            print('Error')
            pass
    def on_error(self, status):
        print ("Error " + str(status))
        if status == 420:
            print("Rate Limited")
            return False
if __name__ == '__main__':
    #This handles Twitter authentication and the connection to Twitter Streaming API
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)
    stream.filter(track=['Euro2016', 'FRA', 'POR'], languages=['en'])
2 Answers
Here is a possible solution:
After defining the class variable tweet_number, I used the __init__() method to initialize a new StdOutListener object with the maximum number of tweets you want to collect. tweet_number is increased by 1 each time the on_data(data) method is called, causing the program to terminate when tweet_number >= max_tweets.
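The counting idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's original code: the class name `CountingListener` and the `path` parameter are hypothetical, the counter is kept as an instance attribute rather than a class variable, and the class is a plain stand-in so it runs without tweepy installed. In real use it would subclass tweepy's `StreamListener` (tweepy 3.x, as in the question). It also stops by returning `False` from `on_data`, which tweepy 3.x treats as a request to disconnect, rather than by calling `sys.exit()`.

```python
import json


class CountingListener(object):
    """Stand-in for a StreamListener subclass that stops after max_tweets."""

    def __init__(self, max_tweets, path):
        self.max_tweets = max_tweets  # how many tweets to collect
        self.path = path              # output file (hypothetical parameter)
        self.tweet_number = 0         # tweets written so far

    def on_data(self, data):
        tweet = json.loads(data)
        with open(self.path, 'a') as my_file:
            json.dump(tweet, my_file)
            my_file.write('\n')       # one JSON object per line
        self.tweet_number += 1
        # True keeps the stream running; False asks it to stop.
        return self.tweet_number < self.max_tweets
```

With tweepy 3.x, the listener would be created as something like `CountingListener(1000, 'your_data.json')` and passed to `Stream(auth, listener)` in place of `StdOutListener()`.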
P.S. You need to import sys for the code to work.

This is the Python 2.7 code I would use (sorry, I do not know 3.0 as well). I think you want what is on my second line, the .items(1000) part?
Stack Overflow messed up the indentation in my code. I am also using tweepy.
CODE: