I am currently trying to mine tweets from Twitter account(s), but I want to exclude retweets so that I get 200 original tweets for my project. I have working code that mines the feed, but it still includes retweets. I have found that to exclude retweets you need to put
-RT
in the code, but I do not know where, since I am pretty new to programming.
(I am using the Twitter API for Python (Tweepy) with Python 3.6 in Spyder.)
import tweepy
from tweepy import OAuthHandler
import pandas as pd
consumer_key = 'consumer_key'
consumer_secret = 'consumer_secret'
access_token = 'access_token'
access_secret = 'access_secret'
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
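api = tweepy.API(auth)
screen_name = 'screen_name'
# Fetch up to 200 of the most recent timeline entries (retweets are still included here)
tweets = api.user_timeline(screen_name=screen_name, count=200)
# Collect and print the text of each tweet
save = [''] * len(tweets)
for i in range(len(tweets)):
    save[i] = tweets[i].text
    print(tweets[i].text)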
data = pd.DataFrame(save)
data.to_csv("results.csv")
Can anyone help me, preferably with the complete section of code needed to remove the retweets? Thank you very much.
2 Answers
I faced the same issue when I was using Tweepy to retrieve tweets from Twitter. What worked for me was calling Twitter's API directly with plain HTTP requests.
To exclude retweets, you can pass the -RT operator in the query parameter.
Documentation for this API.
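As a rough illustration of passing the operator in the query parameter, here is a minimal sketch that only builds the request URL. It assumes the standard v1.1 search endpoint and a `from:<user> -RT` query; the function name `build_search_url` and its parameters are hypothetical, and the request itself would still need to be signed with your OAuth credentials.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard v1.1 search API (the answer does not name one).
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(screen_name, exclude="-RT", count=100):
    """Build a search URL for one account's tweets with an exclusion operator.

    `exclude` is appended to the query as-is; "-RT" follows the answer above.
    """
    query = "from:{} {}".format(screen_name, exclude)
    # urlencode takes care of escaping the colon and the space in the query.
    return SEARCH_URL + "?" + urlencode({"q": query, "count": count})
```

Twitter's standard search operators also list `-filter:retweets`, which can be passed as `exclude` instead if `-RT` lets some retweets through.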
Change this line in your code:
to the following:
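A minimal sketch of such a change, assuming the line in question is the `api.user_timeline(...)` call and that the fix is Tweepy's `include_rts=False` argument (which maps to the `include_rts` parameter of the statuses/user_timeline endpoint). The helper name `fetch_original_tweets` is hypothetical and only wraps the call so it can be reused:

```python
def fetch_original_tweets(api, screen_name, count=200):
    """Return the text of a user's recent tweets with retweets excluded.

    `api` is expected to be an authenticated tweepy.API instance,
    such as the one built in the question.
    """
    # include_rts=False asks Twitter to strip native retweets from the timeline.
    tweets = api.user_timeline(
        screen_name=screen_name, count=count, include_rts=False
    )
    return [tweet.text for tweet in tweets]


# Usage with the `api` object from the question (not run here):
# texts = fetch_original_tweets(api, "screen_name")
# pd.DataFrame(texts).to_csv("results.csv")
```

Note that Twitter applies `count` before it removes retweets, so this can return fewer than 200 tweets.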
This Twitter doc may be helpful: https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html
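If you need a full 200 original tweets, one option is to page backwards through the timeline with `max_id` and drop retweets yourself. The sketch below assumes retweets can be recognised by the `retweeted_status` attribute that Tweepy sets on retweeted statuses; the function name, `wanted`, `page_size`, and the `max_pages` safety cap are all hypothetical choices, not part of the question's code.

```python
def collect_original_tweets(api, screen_name, wanted=200, page_size=200, max_pages=16):
    """Page through a timeline until `wanted` non-retweet texts are collected.

    `api` is expected to be an authenticated tweepy.API instance.
    """
    originals = []
    max_id = None  # no upper bound on the first request
    for _ in range(max_pages):
        kwargs = {"screen_name": screen_name, "count": page_size, "include_rts": True}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(**kwargs)
        if not page:
            break  # no older tweets are available
        for status in page:
            # Retweets carry a `retweeted_status` attribute; originals do not.
            if not hasattr(status, "retweeted_status"):
                originals.append(status.text)
        # Ask for tweets strictly older than the last one seen on this page.
        max_id = page[-1].id - 1
        if len(originals) >= wanted:
            break
    return originals[:wanted]
```

Retweets are requested and filtered locally here so that a page consisting only of retweets does not look like the end of the timeline.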