I am working on a Twitter bot that streams tweets matching certain keywords and forwards them to Telegram. The keywords are stored in an Excel file and can be changed by the user. My current approach is:
- Instantiate a sub-classed tweepy.Stream object with modified on_status method.
- Start 3 threads in main:
  - Thread 1 checks the Excel file for changes in keywords and updates keywords_queue accordingly.
  - Thread 2 runs this function to stream tweets:

        def stream_tweets(keywords_queue, stream):
            while True:
                search_keywords = keywords_queue.get()
                print("Search keywords for filter: {}".format(search_keywords))
                if search_keywords:
                    stream.filter(track=search_keywords)

  - Thread 3 runs a routine to forward tweets to Telegram.

The problem is in the stream_tweets function. According to tweepy’s implementation, once stream.filter is called, the thread blocks there until the connection is closed for any reason. This does not fit my requirements, because I need to be able to change the argument passed to the track parameter of stream.filter (search_keywords). Since the thread is blocked, the search_keywords list is never updated with the data supplied by Thread 1.
One possible workaround is to disconnect the stream every time Thread 1 notices a change in the keywords file, and then reconnect. But frequent disconnections result in errors. Another solution I thought of was using the on_status method to filter tweets once again before passing them to Thread 3 (the Telegram thread), but that kind of defeats the purpose of stream.filter().
Is there any recommended way to do this? This is my 2nd time using threading so please be kind.
Cheers 🙂
2 Answers
There’s no way to change the parameters of a stream after connecting to the POST statuses/filter endpoint.
This is a Twitter API limitation.
Disconnecting and reconnecting with the new parameters is the only way to change the stream.
Yes, reconnecting too frequently can get you rate-limited. One way to handle that is to wait a minute or a few minutes before disconnecting and reconnecting with the new parameters.
See the Twitter API documentation on connecting.
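To make the delay idea concrete, here is a minimal sketch that only uses the standard library. It assumes, as in the question's code, that each item on keywords_queue is a complete keyword list. The names next_keywords, last_connect, min_interval, clock and sleep are made up for illustration and are not part of Tweepy, and the 60 second default is an arbitrary placeholder rather than a documented limit.

```python
import queue
import time


def next_keywords(keywords_queue, last_connect, min_interval=60.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Return the newest keyword list, but not sooner than
    min_interval seconds after the previous connection attempt."""
    # Block until Thread 1 reports at least one change.
    keywords = keywords_queue.get()

    # Wait out whatever is left of the minimum interval.
    remaining = min_interval - (clock() - last_connect)
    if remaining > 0:
        sleep(remaining)

    # Several edits may have arrived while waiting; keep only the last one
    # so that a burst of changes causes a single reconnect.
    while True:
        try:
            keywords = keywords_queue.get_nowait()
        except queue.Empty:
            return keywords
```

The streaming thread would call this instead of keywords_queue.get() and record time.monotonic() each time it reconnects, so a user saving the file five times in a row triggers one reconnect rather than five.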
I think the API v2 filtered stream allows adding and removing rules while connected to the stream, but Tweepy doesn’t support that yet. For that, see https://github.com/tweepy/tweepy/issues/1472.
You need an infinite loop that checks whether the keywords in your Excel file have changed. When they change, check whether the listener is running and, if so, disconnect it. After that, you can start listening again with your new keywords.
To check whether the stream is running:
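A minimal sketch of that check, assuming the stream object exposes a running attribute and a disconnect() method the way tweepy.Stream does. The helper name stop_if_running is made up for illustration.

```python
def stop_if_running(stream):
    """Disconnect the stream if it is currently connected.

    Returns True if a disconnect was requested, False otherwise.
    """
    if stream.running:
        # Assumption: disconnect() only asks the stream to stop, so the
        # blocked filter() call may take a moment to actually return.
        stream.disconnect()
        return True
    return False
```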
Your code would be:
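Something along these lines, as a rough sketch rather than a drop-in implementation. It assumes keywords_queue carries complete keyword lists (as in the question), and that the stream has filter(track=...), running and disconnect() like tweepy.Stream. run_stream_manager, make_stream, stop_event, poll_seconds and join_timeout are hypothetical names; make_stream stands for any function that builds a fresh instance of your Stream subclass.

```python
import queue
import threading


def run_stream_manager(keywords_queue, make_stream, stop_event,
                       poll_seconds=1.0, join_timeout=90.0):
    """Restart the stream every time a new keyword list arrives."""
    stream = None
    worker = None

    def stop_current():
        # Disconnect only if the listener is actually running, then wait
        # for the blocked filter() call to return (bounded by a timeout).
        if stream is not None and stream.running:
            stream.disconnect()
        if worker is not None:
            worker.join(join_timeout)

    while not stop_event.is_set():
        try:
            keywords = keywords_queue.get(timeout=poll_seconds)
        except queue.Empty:
            continue  # no change yet; re-check stop_event

        stop_current()
        stream, worker = None, None
        if not keywords:
            continue  # nothing to track until keywords come back

        # filter() blocks, so it gets its own thread and this loop
        # stays free to react to the next change.
        stream = make_stream()
        worker = threading.Thread(target=stream.filter,
                                  kwargs={"track": keywords}, daemon=True)
        worker.start()

    stop_current()
```

This loop replaces stream_tweets in Thread 2. It does not add any delay between reconnects, so it still needs some throttling on top if the file changes often.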