
I'm currently trying to mine tweets from Twitter account(s), but I want to exclude retweets so that I get 200 original tweets only for my project. I have working code that mines the feed, but it still includes retweets. I found that to exclude retweets you need to put -RT in the code, but I don't know where, since I'm pretty new to programming.

(Currently using Twitter API for Python (Tweepy) with Python 3.6 using Spyder.)

import tweepy
from tweepy import OAuthHandler
import pandas as pd

consumer_key = 'consumer_key'
consumer_secret = 'consumer_secret'
access_token = 'access_token'
access_secret = 'access_secret'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)

screen_name = 'screen_name'
tweets = api.user_timeline(screen_name, count=200)
save = [''] * len(tweets)

for i in range(len(tweets)):
    save[i] = tweets[i].text
    print(tweets[i].text)

data = pd.DataFrame(save)
data.to_csv("results.csv")

Can anyone help me, preferably with the complete section of code that removes the retweets? Thank you very much.

2 Answers


  1. I faced the same issue when I was using Tweepy to retrieve tweets from Twitter. What worked for me was calling Twitter's API directly with plain HTTP requests.
    To exclude retweets, you can pass the -RT operator in the query parameter.

    Documentation for this API.

  2. Change this line in your code:

    tweets = api.user_timeline(screen_name, count=200)
    

    to the following:

    tweets = api.user_timeline(screen_name, count=200, include_rts=False)
    

    This Twitter doc may be helpful: https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html
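
One caveat with include_rts=False: Twitter applies count before it filters out retweets, so a single call with count=200 can return fewer than 200 tweets. Below is a minimal sketch of one way to collect up to 200 originals, assuming Tweepy status objects carry a retweeted_status attribute only when they are retweets (as the v1.1 API does). The helper names is_retweet and first_original_tweets are hypothetical, and the sample objects are stand-ins so the code runs without credentials.

```python
from itertools import islice
from types import SimpleNamespace


def is_retweet(status):
    """Return True if a status object looks like a retweet."""
    # Assumption: retweets carry a `retweeted_status` attribute, originals do not.
    return hasattr(status, "retweeted_status")


def first_original_tweets(statuses, limit=200):
    """Collect the text of up to `limit` non-retweets from any iterable of statuses."""
    originals = (s.text for s in statuses if not is_retweet(s))
    # islice stops consuming the iterable as soon as `limit` items are collected.
    return list(islice(originals, limit))


# Stand-in objects (not real API results) so the sketch runs offline.
sample = [
    SimpleNamespace(text="my own tweet"),
    SimpleNamespace(text="RT @someone: hello", retweeted_status=SimpleNamespace()),
    SimpleNamespace(text="another original"),
]
print(first_original_tweets(sample, limit=2))
```

With a live connection, the iterable could be tweepy.Cursor(api.user_timeline, screen_name=screen_name, count=200, include_rts=False).items(), which pages through the timeline until 200 originals are collected or the timeline runs out.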
