Tweepy for Twitter API v2 - Extracting Additional Fields for Tweet Search

GKroch
May 10, 2022

I started playing around with Twitter API v2 in Tweepy. I’ve had some experience with v1 but it looks like it’s changed a bit.

I’m trying to search tweets based on my query and later extract some meaningful information.
The code is as follows:

response = client.search_recent_tweets(
    "innovation -is:retweet lang:pl", 
    max_results = 100, 
    tweet_fields = ['author_id','created_at','text','source','lang','geo'],
    user_fields = ['name','username','location','verified'],
    expansions = ['geo.place_id', 'author_id'],
    place_fields = ['country','country_code']
)

The issue is that I’m not really sure how to read the output. I can easily access basic info through the tweet object in the following way:

for tweet in response.data:
    print(tweet.text)
    print(tweet.lang)
    # etc.

But how do I access other information, such as the user details for a tweet? That information sits in a separate part of the response: response.includes['users'].

There are no unique ids (at least I don’t see them) to match this info with the info from response.data.

Below I’m adding an example output of my code. The response consists of iterables for data, includes, errors and meta. The thing is, the iterables are not always equal in size, meaning that I can’t just pair data[0] with includes['users'][0] and so on.

Answers

The response of the Twitter API looks like this:

{
    "data": [
        {
            "id": "...",
            "author_id": "2244994945",
            "geo": {
                "place_id": "01a9a39529b27f36"
            }
        }
    ],
    "includes": {
        "users": [
            {
                "id": "2244994945",
                "created_at": "..."
            }
        ],
        "places": [
            {
                "id": "01a9a39529b27f36",
                "country": "..."
            }
        ]
    }
}

So each tweet should have:

The author_id field, which is the id of the matching User object in includes['users'];
The geo['place_id'] field (present only for geotagged tweets), which is the id of the matching Place object in includes['places'].
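
One way to make that matching concrete is to index the included objects by id and look up each tweet's references in those indexes. The sketch below is only an illustration under stated assumptions: it works on a plain dictionary shaped like the payload above, and the sample values (tweet ids, text, username, country_code) and the helper name join_includes are made up for the example. It also shows why the lists can differ in length: a user appears once in includes no matter how many tweets they wrote, and only geotagged tweets reference a place.

```python
# Hypothetical sample payload, shaped like the API response above.
payload = {
    "data": [
        {"id": "1", "text": "first", "author_id": "2244994945",
         "geo": {"place_id": "01a9a39529b27f36"}},
        {"id": "2", "text": "second", "author_id": "2244994945"},  # no geo
    ],
    "includes": {
        "users": [{"id": "2244994945", "username": "example_user"}],
        "places": [{"id": "01a9a39529b27f36", "country_code": "PL"}],
    },
}


def join_includes(payload):
    """Attach the matching user and place (if any) to every tweet."""
    includes = payload.get("includes", {})
    # Index the included objects by their id for direct lookup.
    users_by_id = {u["id"]: u for u in includes.get("users", [])}
    places_by_id = {p["id"]: p for p in includes.get("places", [])}

    rows = []
    for tweet in payload.get("data", []):
        place_id = tweet.get("geo", {}).get("place_id")
        rows.append({
            "text": tweet["text"],
            "user": users_by_id.get(tweet["author_id"]),
            "place": places_by_id.get(place_id),  # None when not geotagged
        })
    return rows


rows = join_includes(payload)
for row in rows:
    print(row["text"], row["user"]["username"], row["place"])
```

Here two tweets share a single entry in includes['users'], and the second tweet has no place at all, so pairing the lists by position would not work.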

- Zyy
- May 13, 2022 at 12:49 pm
- 0 votes
Tweepy is a great tool for working with the Twitter API; I use it myself as well.
Under the hood, the method you are using calls the search recent tweets API endpoint.
As you can see in the Examples section, the API itself definitely provides an author id in the response data, which means that Tweepy has it saved as well.

What you’re actually seeing in the screenshot you’ve provided is the string representation of the Tweepy objects. This does not mean that the data is not there, however.

Here’s a slightly modified version of your code:
```
import tweepy

client = tweepy.Client("YOUR BEARER TOKEN HERE")

response = client.search_recent_tweets(
    "innovation -is:retweet lang:pl",
    max_results = 100,
    tweet_fields = ['author_id','created_at','text','source','lang','geo'],
    user_fields = ['name','username','location','verified'],
    expansions = ['geo.place_id', 'author_id'],
    place_fields = ['country','country_code']
)

for tweet in response.data:
    print(tweet.author_id)      # print the author id of the tweet
    print(tweet.text)           # print the text
    print(tweet.data['lang'])   # print the language ('pl', since we're filtering by it)
    print(tweet.data['source']) # what did the user use to publish the tweet?
```
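To get from tweet.author_id to the user’s name or username, the included users can be matched by id. This is a minimal sketch, assuming that response.includes is a dictionary whose "users" entry is a list of objects exposing .id and .username (that is how Tweepy’s Response behaves as far as I know, but check it against your version). To keep it runnable without credentials, SimpleNamespace stand-ins replace the real response; the helper name authors_for and all sample values are hypothetical.

```python
from types import SimpleNamespace


def authors_for(tweets, users):
    """Pair every tweet with the included user whose id equals its author_id."""
    users_by_id = {user.id: user for user in users}
    return [(tweet, users_by_id.get(tweet.author_id)) for tweet in tweets]


# Stand-ins for response.data and response.includes["users"] (made-up values).
tweets = [
    SimpleNamespace(id=1, author_id=10, text="first tweet"),
    SimpleNamespace(id=2, author_id=11, text="second tweet"),
    SimpleNamespace(id=3, author_id=10, text="third tweet"),
]
users = [
    SimpleNamespace(id=10, username="alice"),
    SimpleNamespace(id=11, username="bob"),
]

pairs = authors_for(tweets, users)
for tweet, user in pairs:
    print(tweet.text, "by", user.username if user else "unknown")
```

With a real response, the call would presumably be authors_for(response.data or [], response.includes.get("users", [])), since data can be empty when nothing matches the query.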
Hope that helps 🙂

- Kat
- September 2, 2022 at 5:37 pm
- 0 votes
When initializing your Tweepy Client, set the return type to dict:
```
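import tweepy

# The credential variables below are assumed to be defined elsewhere.
client = tweepy.Client(
    bearer_token=bearer_token,
    consumer_key=api_key,
    consumer_secret=api_secret,
    access_token=access_token,
    access_token_secret=access_secret,
    return_type=dict  # return the parsed JSON as a plain dict
)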
```
Then search_recent_tweets() will return a plain dictionary, so you can read response['data'] and response['includes'] directly.