I have a Python script that extracts Twitter data via the streaming API. I would like a separate file for each day, so my plan is to let the script run for 24 hours, then kill it and restart it, because the file name changes whenever the program restarts.
How can I ensure that the script stops at 00:00 and restarts right away?
The code is below. If you have any other ideas for creating a new text file each day, that would be even better.

import tweepy
import datetime

key_words = ["xx"]
twitter_data_title = "".join([xx, "_", date_today, ".txt"])


class TwitterStreamer():
    def __init__(self):
        pass

    def stream_tweets(self, twitter_data_title, key_words):
        listener = StreamListener(twitter_data_title)
        auth = tweepy.OAuthHandler(api_key, api_secret_key)
        auth.set_access_token(access_token, access_secret_token)
        stream = tweepy.Stream(auth, listener)
        stream.filter(track=key_words)


class StreamListener(tweepy.StreamListener):
    def __init__(self, twitter_data_title):
        self.fetched_tweets_filename = twitter_data_title

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

    def on_exception(self, exception):
        print('exception', exception)
        stream_tweets(twitter_data_title, key_words)

    def on_error(self, status):
        print(status)


def stream_tweets(twitter_data_title, key_words):
    listener = StreamListener(twitter_data_title)
    auth = tweepy.OAuthHandler(api_key, api_secret_key)
    auth.set_access_token(access_token, access_secret_token)
    stream = tweepy.Stream(auth, listener)
    stream.filter(track=key_words)


if __name__ == '__main__':
    twitter_streamer = TwitterStreamer()
    twitter_streamer.stream_tweets(twitter_data_title, key_words)

2 Answers
I would add this to your code:
It looks like the ‘blocking’ code in your example comes from another library, so you don’t have the opportunity to (easily) change the inner loop to check for a condition and exit.
Using a Background Process (Not Ideal)
You could change your entry point to start the code in a background process, and check to see if the file’s title should have changed:
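The answer's original snippet is not preserved here, so the following is only a minimal sketch of the idea, not the answerer's code. It assumes the module-level `stream_tweets(twitter_data_title, key_words)` function from the question is passed in as `target`; `build_title`, `run_daily`, `poll_seconds`, and the ISO date format in the file name are hypothetical choices made for illustration.

```python
import datetime
import multiprocessing
import time


def build_title(prefix, day):
    """Build a per-day file name, e.g. 'xx_2024-01-31.txt' (ISO date is an assumption)."""
    return "".join([prefix, "_", day.isoformat(), ".txt"])


def run_daily(target, prefix, key_words, poll_seconds=30):
    """Run target(title, key_words) in a child process; restart it when the date changes."""
    while True:
        started_on = datetime.date.today()
        title = build_title(prefix, started_on)
        worker = multiprocessing.Process(target=target, args=(title, key_words))
        worker.start()
        # Poll until the calendar day rolls over or the worker exits on its own.
        while worker.is_alive() and datetime.date.today() == started_on:
            time.sleep(poll_seconds)
        # Hard stop: a write that is in progress may be cut off.
        worker.terminate()
        worker.join()
```

In the question's script this would be called as `run_daily(stream_tweets, "xx", key_words)` under the `if __name__ == '__main__':` guard. Because `terminate()` stops the child abruptly, the last tweet of the day may be written only partially, which is one reason this approach is not ideal.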
Changing StreamListener
A better alternative would probably be to encode the knowledge of the date into StreamListener. Instead of passing a file name (twitter_data_title), pass a file prefix (xx from your example), and build the filename in a property.

Since StreamListener.on_data grabs the file name from self.fetched_tweets_filename, this should mean the tweets are written to the new file when the date changes.
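The property described above could be sketched roughly as follows. This is an illustration rather than the answerer's original snippet: `DailyFileListener` is a hypothetical stand-in that leaves out the `tweepy.StreamListener` base class so it runs on its own, and the injectable `today` argument and the ISO date format are assumptions added to make the rollover easy to check.

```python
import datetime


class DailyFileListener:
    """Stand-in for the question's StreamListener, without the tweepy base class."""

    def __init__(self, file_prefix, today=datetime.date.today):
        self.file_prefix = file_prefix
        self._today = today  # hypothetical hook; defaults to the real current date

    @property
    def fetched_tweets_filename(self):
        # Recomputed on every access, so the name follows the current date.
        return "".join([self.file_prefix, "_", self._today().isoformat(), ".txt"])

    def on_data(self, data):
        # Same write logic as in the question; only the file name is now dynamic.
        with open(self.fetched_tweets_filename, "a") as tf:
            tf.write(data)
        return True
```

With this shape the stream never has to be stopped: the first tweet that arrives after midnight opens a file with the new date in its name.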