I collected some Twitter data like this:
#connect to twitter API
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
#set radius and amount of requests
N=200 # tweets to request from each query
S=200 # radius in miles
lats=c(38.9,40.7)
lons=c(-77,-74)
roger=do.call(rbind,lapply(1:length(lats), function(i) searchTwitter('Roger+Federer',
lang="en",n=N,resultType="recent",
geocode=paste (lats[i],lons[i],paste0(S,"mi"),sep=","))))
After this, I extracted the coordinates:
rogerlat=sapply(roger, function(x) as.numeric(x$getLatitude()))
rogerlat=sapply(rogerlat, function(z) ifelse(length(z)==0,NA,z))
rogerlon=sapply(roger, function(x) as.numeric(x$getLongitude()))
rogerlon=sapply(rogerlon, function(z) ifelse(length(z)==0,NA,z))
data=as.data.frame(cbind(lat=rogerlat,lon=rogerlon))
Now I would like to keep only the tweets that have longitude and latitude values:
data=filter(data, !is.na(lat),!is.na(lon))
lonlat=select(data,lon,lat)
But now I only get NA values. Any thoughts on what goes wrong here?
3 Answers
Assuming that some tweets were downloaded, there are some geo-referenced tweets and some tweets without geographical coordinates.
Let’s simulate data between your longitude/latitude points for simplicity. Rows with longitude/latitude data can be selected by removing the 10 rows with missing data.
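The code from this answer is not preserved here, so the following is only a minimal sketch of what it describes. It assumes 50 simulated tweets (an arbitrary number), draws coordinates between the two search centres from the question, and uses base R's `complete.cases` in place of the `dplyr` calls. Everything except the variable names `data` and `lonlat` is an assumption.

```r
set.seed(1)  # make the simulation reproducible
n <- 50      # assumed number of downloaded tweets

# Simulated coordinates between the two search centres from the question
data <- data.frame(
  lat = runif(n, min = 38.9, max = 40.7),
  lon = runif(n, min = -77, max = -74)
)

# Blank out 10 rows to mimic tweets without coordinates
missing <- sample(n, 10)
data[missing, ] <- NA

# Keep only rows where both coordinates are present, in lon/lat order
lonlat <- data[complete.cases(data), c("lon", "lat")]
```

The final line does the work of both the `filter` and the `select` call in the question.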
The last line replaces the last two lines of your code. However, note that this only works if the missing geographical coordinates are stored as NA.
Not necessarily an answer, but more an observation too long for a comment:
First, you should look at the documentation on how to input geocode data. Using twitteR, geodata should be structured as a single string of the form (lat, lon, radius), which is then passed to searchTwitter through its geocode argument.
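As a rough illustration of that structure, the sketch below builds the geocode string from the first search centre in the question. The variable names and the `search_near` wrapper are hypothetical. The wrapper is only defined, not called, because it needs the twitteR package and an authenticated session.

```r
# Search centre and radius taken from the question; names are illustrative
lat <- 38.9
lon <- -77
radius_miles <- 200

# One comma-separated string: latitude, longitude, radius with a unit suffix
geocode <- paste(lat, lon, paste0(radius_miles, "mi"), sep = ",")
print(geocode)  # "38.9,-77,200mi"

# Hypothetical wrapper around twitteR::searchTwitter (defined, not run here)
search_near <- function(query, geocode, n = 200) {
  twitteR::searchTwitter(query, n = n, lang = "en",
                         resultType = "recent", geocode = geocode)
}
```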
Then, I would instead use twListToDF to convert the results, which gives you a data.frame with 16 columns and 200 observations (set above).
You could then filter that data.frame on its coordinate columns.
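The filtering step might look roughly like the sketch below. The small data frame is a made-up stand-in for the output of `twListToDF`, reduced to three columns. The assumptions are that the coordinate columns are named `longitude` and `latitude` and that they may arrive as character values.

```r
# Stand-in for twListToDF(tweets): a few columns with made-up values
tweets_df <- data.frame(
  screenName = c("user_a", "user_b", "user_c"),
  longitude  = c("-76.9", NA, "-74.1"),
  latitude   = c("39.0", NA, "40.6"),
  stringsAsFactors = FALSE
)

# Keep rows where both coordinates are present
has_geo <- !is.na(tweets_df$longitude) & !is.na(tweets_df$latitude)
geo_df <- tweets_df[has_geo, ]

# Convert coordinates to numeric in case they arrive as character
geo_df$longitude <- as.numeric(geo_df$longitude)
geo_df$latitude  <- as.numeric(geo_df$latitude)
```

If every row has NA in both columns, as reported in this answer, `geo_df` will simply be empty.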
That said (and this is why it is an observation rather than an answer), it looks as though twitteR does not return lat and lon (it is all NA in the data I returned). I think this is to protect individual users’ locations.
That said, adjusting the radius does affect the number of results, so the code does have access to the geo data somehow.
As Chris mentioned, searchTwitter does not return the lat-long of a tweet. You can see this by going to the twitteR documentation, which tells us that it returns a status object.
Status Objects
Scrolling down to the status object, you can see that 11 pieces of information are included, but lat-long is not one of them. However, we are not completely lost, because the user’s screen name is returned.
If we look at the user object, we see that it at least includes a location.
So I can think of at least two possible solutions, depending on what your use case is.
Solution 1: Extracting a User’s Location
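The original code for this solution is not preserved here. One way to sketch the idea is to look up each tweet's author and attach the author's self-reported location to the tweet. The two data frames below are made-up stand-ins for `twListToDF(searchTwitter(...))` and `twListToDF(lookupUsers(...))`, and the column names `screenName` and `location` are assumed to match what those calls return.

```r
# Stand-in for the tweets data frame: one row per tweet (made-up values)
tweets_df <- data.frame(
  screenName = c("user_a", "user_b", "user_a"),
  text       = c("tweet 1", "tweet 2", "tweet 3"),
  stringsAsFactors = FALSE
)

# Stand-in for the user lookup on unique(tweets_df$screenName)
users_df <- data.frame(
  screenName = c("user_a", "user_b"),
  location   = c("Washington, DC", ""),
  stringsAsFactors = FALSE
)

# Attach each author's self-reported location to their tweets
tweets_loc <- merge(tweets_df, users_df, by = "screenName", all.x = TRUE)

# The location is free text and can be empty, so treat blanks as missing
tweets_loc$location[tweets_loc$location == ""] <- NA
```

Note that this gives the place a user entered in their profile, not where a particular tweet was sent from.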
Solution 2: Multiple searches with limited radius
The other solution would be to conduct a series of repeated searches, incrementing your latitude and longitude and using a small radius each time. That way you can be relatively sure that the user is close to your specified location.
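A minimal sketch of how such a series of search centres could be generated is shown below. The grid spacing and the 25-mile radius are arbitrary choices for illustration, and the bounding box is simply the one spanned by the two points in the question.

```r
# Four evenly spaced steps along each axis of the question's bounding box
lat_seq <- seq(38.9, 40.7, length.out = 4)
lon_seq <- seq(-77, -74, length.out = 4)
radius  <- "25mi"  # small radius, chosen arbitrarily

# Every combination of latitude and longitude becomes one search centre
grid <- expand.grid(lat = lat_seq, lon = lon_seq)

# One geocode string per centre, in the "lat,lon,radius" form
grid$geocode <- paste(grid$lat, grid$lon, radius, sep = ",")
```

Each string in `grid$geocode` could then be passed to a separate search, with the centre recorded as the approximate location of the tweets it returns. Neighbouring circles can overlap, so the combined results may need de-duplicating by tweet id.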