
My aim is to extract old tweets for the entire month of January 2017 for New York City ('locations': '-74,40,-73,41') using Python. I am able to get live streaming tweets with the following code:

    import json
    import pandas as pd
    import numpy as np
    from TwitterAPI import TwitterAPI

    #Set up the variables for the 'application'
    consumerkey = 'cfKguErYawo2WB7cfNtAT2lKl'
    consumersecret = 'my_consumer_secret'
    access_token_key = '2195434704-Wov69oF2iIBRgUjWJhD0KThqcLApYCJXqtbYI4K'
    access_token_secret = 'my_access_token_secret'

    #Setup the API key
    api = TwitterAPI(consumerkey,consumersecret,access_token_key,access_token_secret)

    # Breaking after extracting 10 live tweets from New York City

    r = api.request('statuses/filter', {'locations':'-74,40,-73,41'})
    for row,item in enumerate(r):
        print(row, item['text'])
        if row >= 10:
            break

But this is not what I am looking for. Can someone suggest how to extract old tweets for this location filter, using the Twitter streaming API or any other Python package?
Thanks!

2 Answers


  1. You can accomplish part of what you are asking using Twitter’s REST API. Below is an example that uses the same TwitterAPI package you used for streaming. However, searching for old tweets comes with some restrictions. You can only get about a week’s worth of old tweets. You must also supply a search string (the q parameter) whether or not you supply a location, and you will only see results that match both the string and the location. Streaming behaves differently: you can supply a filter string, a location, or both, and the results match either the string or the location, not necessarily both.

    This code will download tweets until you reach the roughly one week limit. It does this by making successive requests which are timed so as not to exceed Twitter’s rate limit. You might also find the TwitterGeoPics package useful.

    from TwitterAPI import TwitterAPI, TwitterRestPager
    
    SEARCH_TERM = 'pizza'
    GEOCODE = '40,-74,10km'  # latitude,longitude,radius; New York's longitude is negative (west)
    
    CONSUMER_KEY = ''
    CONSUMER_SECRET = ''
    ACCESS_TOKEN_KEY = ''
    ACCESS_TOKEN_SECRET = ''
    
    api = TwitterAPI(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET)
    
    pager = TwitterRestPager(api, 'search/tweets', {'q': SEARCH_TERM, 'geocode':GEOCODE})
    
    for item in pager.get_iterator():
        print(item['text'] if 'text' in item else item)
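
    The streaming request in the question describes the area as a bounding box (west,south,east,north), while the search request above expects a circle given as latitude,longitude,radius. As a rough sketch of how to go from one to the other, the helper below (bbox_to_geocode is a hypothetical name, not part of the TwitterAPI package) takes the centre of the box and a radius large enough to reach its farthest corner. The resulting circle covers more ground than the box, so you may still want to filter results by coordinates afterwards.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a rough estimate


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bbox_to_geocode(locations):
    """Convert a 'west,south,east,north' string into 'lat,lon,<radius>km'.

    Hypothetical helper: the circle is centred on the box and reaches
    its farthest corner, so it is larger than the box itself.
    """
    west, south, east, north = (float(v) for v in locations.split(','))
    lat = (south + north) / 2
    lon = (west + east) / 2
    corners = [(south, west), (south, east), (north, west), (north, east)]
    radius = max(haversine_km(lat, lon, c_lat, c_lon) for c_lat, c_lon in corners)
    # Round the radius up so the circle never falls short of a corner.
    return f"{lat:g},{lon:g},{math.ceil(radius)}km"


if __name__ == '__main__':
    # Bounding box from the question's streaming filter
    print(bbox_to_geocode('-74,40,-73,41'))
```

    The returned string can be used as the GEOCODE value in the search example above.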
    
  2. You cannot extract 30-day-old tweets using Twitter’s streaming API, and Twitter now charges for access to them.

    You can extract tweets from the past 30 days using the search-30day subscription plan of the Twitter premium API.

    Note that you can buy a Twitter premium subscription only if you have an approved Twitter developer account.

    To apply for approval, see https://developer.twitter.com/en/apply-for-access.html

    If this is a one-time requirement, I suggest using a third-party service such as TrackMyhashtag.com or Tweetreach.com.
