I have code that fetches the tweets from my Twitter timeline and saves them to a CSV. How can I make it save only the tweets that contain a specific keyword X?
The code is below:
access_token = config['twitter']['access_token']
access_token_secret = config['twitter']['access_token_secret']
auth = tweepy.OAuthHandler(api_key, api_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
public_tweets = api.home_timeline()
data = []
for tweet in public_tweets:
    data.append([tweet.created_at, tweet.user.screen_name, tweet.text])
2 Answers
Python provides the `in` operator for testing whether one string occurs inside another, so you don't need regex or anything more involved than a simple `if`: inside the loop, append the tweet only when `X in tweet.text` is true.

The simplest approach would be to check `if keyword in tweet.text`, but you'll get false positives (e.g. `baseball` will match if `keyword = 'ball'`). A better approach is a regex that matches the keywords only as whole words.

In such a pattern, `\b` refers to a word boundary and `|` separates the alternatives in a group, so the search matches any of the keywords only when it is not part of a larger word. `re.compile` is used only to speed things up, so that the pattern is not recompiled on every iteration. A list comprehension is, IMO, more readable than `.append()` in a loop (and also faster).
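
As a rough sketch of the two approaches described above, something like the following should work. It is not the original answers' code: the `make_tweet` helper, the sample tweets, and the `keywords` list are made up for illustration so the snippet runs without Twitter credentials. With real data you would iterate over the result of `api.home_timeline()` instead.

```python
import re
from types import SimpleNamespace


def make_tweet(created_at, screen_name, text):
    """Hypothetical stand-in exposing the three attributes the question uses."""
    return SimpleNamespace(
        created_at=created_at,
        user=SimpleNamespace(screen_name=screen_name),
        text=text,
    )


# Made-up sample data; replace with api.home_timeline() in real code.
public_tweets = [
    make_tweet("2022-01-01", "alice", "Great baseball game tonight"),
    make_tweet("2022-01-02", "bob", "Having a ball at the party"),
    make_tweet("2022-01-03", "carol", "Nothing relevant here"),
]

keywords = ["ball"]  # assumed keyword list


def to_row(tweet):
    return [tweet.created_at, tweet.user.screen_name, tweet.text]


# Approach 1: plain substring test. Simple, but "baseball" also matches "ball".
substring_rows = [
    to_row(t) for t in public_tweets if any(k in t.text for k in keywords)
]

# Approach 2: whole-word test. \b marks word boundaries, | separates keywords,
# and re.escape protects keywords that contain regex metacharacters.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keywords)) + r")\b")
regex_rows = [to_row(t) for t in public_tweets if pattern.search(t.text)]
```

Both tests are case-sensitive as written. If that matters, passing `re.IGNORECASE` to `re.compile` (or lowercasing both sides of the `in` check) is one option.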