I am trying to filter the tweet stream by both an input keyword and a location (both are variables).
To filter tweets by keyword and a certain location, the locations parameter in the line below
myStream.filter(track=keywords_to_track, locations=boundary_box)
should be a bounding box with four coordinates for the input location (max longitude, min longitude, max latitude, min latitude).
How do I get boundary_box for a given location (a variable)? Or is there another way to solve this?
I have also tried https://www.mapdevelopers.com/geocode_bounding_box.php, but it is not working.
I am new to the tweepy API.
import tweepy
from json import dumps
from kafka import KafkaProducer
from tweepy import OAuthHandler

# arguments
topic_name = 'kafkatwitter_1'

# input variables
keywords_to_track = ['modi']
location_filter = 'New Delhi'

# twitter authorization (the four credential constants are defined elsewhere)
auth = OAuthHandler(API_KEY, API_KEY_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

# init tweepy
api = tweepy.API(auth)

producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                         value_serializer=lambda x: dumps(x).encode('utf-8'),
                         api_version=(0, 10, 1))


# Listener for the tweepy 3.x streaming API
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, tweet):
        length = len(tweet.text.split(' '))
        # skip non-English tweets and tweets of 10 words or fewer
        if (tweet.lang != 'en') or (length <= 10):
            print("==filtered==")
        else:
            message = {
                "text": tweet.text,
                # process_time is a helper that is not shown in the question
                "created_at": process_time(tweet.created_at),
            }
            producer.send(topic_name, value=message)


# Step 2: Creating a Stream
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)

# Step 3: Starting a Stream
# boundary_box is the value this question asks how to obtain
myStream.filter(track=keywords_to_track, locations=boundary_box)
2 Answers
You could use the Nominatim API.
Search parameters include:
Example:
Response includes boundingbox:
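As a rough sketch of how the lookup could be wired in, the snippet below queries the public Nominatim search endpoint and reorders the result for the locations parameter. It assumes the JSON response carries boundingbox as four strings in the order south latitude, north latitude, west longitude, east longitude, and that the stream expects west, south, east, north. The function names and the User-Agent string are hypothetical, chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def nominatim_to_twitter_box(boundingbox):
    """Reorder [south, north, west, east] strings into [west, south, east, north] floats."""
    south, north, west, east = (float(v) for v in boundingbox)
    return [west, south, east, north]


def lookup_boundary_box(place, user_agent="my-tweet-filter-example"):
    """Ask Nominatim for `place` and return the first hit's box (needs network access)."""
    query = urllib.parse.urlencode({"q": place, "format": "json", "limit": 1})
    request = urllib.request.Request(
        f"{NOMINATIM_SEARCH_URL}?{query}",
        headers={"User-Agent": user_agent},  # hypothetical identifier; use your own
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        results = json.load(response)
    if not results:
        raise ValueError(f"No Nominatim result for {place!r}")
    return nominatim_to_twitter_box(results[0]["boundingbox"])
```

With this in place, boundary_box = lookup_boundary_box(location_filter) could be computed once before calling myStream.filter.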
https://boundingbox.klokantech.com/
[-122.75,36.8,-121.75,37.8] is a location box for San Francisco.
[-74,40,-73,41] is a location box for New York City.
[-122.75,36.8,-121.75,37.8,-74,40,-73,41] are location boxes for San Francisco and New York City.
Assign that to the variable boundary_box.
The locations parameter expects an array of floats. The number of floats must be divisible by 4.
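A small helper can make the "multiple of 4" rule harder to get wrong. The sketch below assumes each box is given as west longitude, south latitude, east longitude, north latitude (southwest corner first), which matches the examples above. The name flatten_boxes is hypothetical and not part of tweepy.

```python
def flatten_boxes(*boxes):
    """Concatenate [west, south, east, north] boxes into one flat list of floats."""
    flat = []
    for box in boxes:
        if len(box) != 4:
            raise ValueError(f"Expected 4 coordinates per box, got {len(box)}")
        west, south, east, north = (float(v) for v in box)
        # The southwest corner must come before the northeast corner.
        if west > east or south > north:
            raise ValueError(f"Box corners are out of order: {box}")
        flat.extend([west, south, east, north])
    return flat


san_francisco = [-122.75, 36.8, -121.75, 37.8]
new_york_city = [-74, 40, -73, 41]
boundary_box = flatten_boxes(san_francisco, new_york_city)
```

The resulting boundary_box can then be passed as locations=boundary_box.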