I want all tweets from May 2013 to May 2014 containing a given word.

I looked at the API documentation for GET search/tweets, but it seems it doesn’t let you specify a time window, only a single date, and it will retrieve tweets from up to 7 days before that date.

How can I retrieve those tweets in Python? (Basically, I want to write a script that does what the Twitter advanced search does.)

2

Answers


  1. According to the Twitter Search API documentation, the query you want is not possible: https://dev.twitter.com/rest/public/search

    The Twitter Search API searches against a sampling of recent Tweets
    published in the past 7 days.

    For anything older than the last 7 days, the only option is to search manually on Twitter itself.

    You could try twarc with the advanced search operators you referenced, but given the Twitter Search API documentation I am not sure it will query an entire year.

    Although not Python based, one alternative would be to use https://webrecorder.io/

    Scroll to the time you want to record or attempt to capture the entire feed. Note the auto scrolling option as well.

  2. You are going to have to dump your Twitter feed to JSON and parse it for the tweets you want. I just put this together for you in Python using the tweepy and json modules.

    #!/usr/bin/env python3
    
    import json
    from datetime import datetime, timezone
    
    import tweepy
    from tweepy import OAuthHandler
    
    
    def process_or_store(tweet):
        # Round-trip through JSON to get a plain dict; swap this for code
        # that writes to a file if you want to keep the dump.
        converted = json.dumps(tweet)
        parsed = json.loads(converted)
        return parsed
    
    
    # Fill in the credentials of your own Twitter app.
    access_token = ''
    access_secret = ''
    consumer_key = ''
    consumer_secret = ''
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)
    
    # Window: 1 May 2013 up to, but not including, 1 June 2014 (UTC).
    start = datetime(2013, 5, 1, tzinfo=timezone.utc)
    end = datetime(2014, 6, 1, tzinfo=timezone.utc)
    
    for tweet in tweepy.Cursor(api.user_timeline).items():
        j = process_or_store(tweet._json)
        # created_at looks like 'Wed Aug 27 13:08:45 +0000 2008'.
        created = datetime.strptime(j['created_at'], '%a %b %d %H:%M:%S %z %Y')
    
        # Comparing parsed dates avoids matching month names as substrings.
        if start <= created < end:
            print("%s -- %s" % (j['created_at'], j['text']))
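
Since the question asks for tweets containing a given word, the date check can be combined with a keyword test. The following is only a sketch that works on plain dicts shaped like the tweet JSON (with `created_at` and `text` keys); the names `parse_created_at`, `matches`, `START`, `END` and the sample tweets are made up for illustration and are not part of tweepy or the Twitter API.

```python
from datetime import datetime, timezone

# Layout of the 'created_at' field, e.g. 'Wed Aug 27 13:08:45 +0000 2008'.
CREATED_AT_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def parse_created_at(created_at):
    """Turn a created_at string into a timezone-aware datetime."""
    return datetime.strptime(created_at, CREATED_AT_FORMAT)


def matches(tweet, word, start, end):
    """Return True if the tweet falls in [start, end) and its text contains word."""
    created = parse_created_at(tweet['created_at'])
    in_window = start <= created < end
    has_word = word.lower() in tweet['text'].lower()
    return in_window and has_word


# May 2013 through May 2014, expressed as a half-open UTC interval.
START = datetime(2013, 5, 1, tzinfo=timezone.utc)
END = datetime(2014, 6, 1, tzinfo=timezone.utc)

# Hypothetical sample data standing in for tweet._json dicts.
sample = [
    {'created_at': 'Fri May 03 10:15:00 +0000 2013', 'text': 'Learning Python today'},
    {'created_at': 'Sun Jun 01 00:00:00 +0000 2014', 'text': 'python again'},
    {'created_at': 'Tue Dec 31 23:59:59 +0000 2013', 'text': 'Happy new year'},
]

hits = [t for t in sample if matches(t, 'python', START, END)]
for t in hits:
    print("%s -- %s" % (t['created_at'], t['text']))
```

In the script above, `matches(j, 'yourword', start, end)` could replace the bare date comparison. The keyword test here is a simple case-insensitive substring check, so it will also match the word inside longer words.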
    