
I am using the get_friends function of the rtweet package to get the user_ids of the friends of a set of focal users, who are sampled from participants in a Twitter discourse. The function returns a list of tibbles.

Each tibble has two columns: one with the focal user's user_id and one with the user_ids of that user's friends. Since every user has a different number of friends, the number of rows differs from tibble to tibble.

My problem: The accounts of some of the focal users are now non-existent due to reasons unknown. Because of this the list has empty tibbles which look like this:

> userFriends[[88]]
# A tibble: 0 x 0

A non-empty tibble looks like this:

> userFriends[2]
[[1]]
# A tibble: 32 x 2
                 user            user_id
                <chr>              <chr>
 1 777937999917096960           49510236
 2 777937999917096960           60489018
 3 777937999917096960         3190203961
 4 777937999917096960          118756393
 5 777937999917096960         2338104343
 6 777937999917096960          122453931
 7 777937999917096960          452830010
 8 777937999917096960           60937837
 9 777937999917096960 923106269761851392
10 777937999917096960          416882361
# ... with 22 more rows

I want my code to identify these empty tibbles and subset the list without these tibbles.

I used the nrow function on these tibbles to find the number of friends each focal user had.

nFriends <- vapply(userFriends, nrow, integer(1))

I treated the indices where this value is zero as the empty tibbles and removed them by subsetting as follows:

nullIndex <- nFriends!=0
userFriendsFinal <- userFriends[nullIndex]

This seems to work for now. But this way I am also removing users with zero friends (although that is very unlikely) along with users who no longer exist or are not accessible through the API. I want to make sure that I am removing only those who are not accessible or do not exist.
Please help.

2 Answers


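  1. You can use the discard function from the purrr package.

    Here is a small example:

    library(purrr)
    library(tibble)  # tibble() comes from the tibble package, not purrr
    mylist <- list(a = tibble(n = numeric()),
                   b = tibble(n = 1:4))
    # Drop every element that has zero rows
    discard(mylist, function(z) nrow(z) == 0)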
    $b
    # A tibble: 4 x 1
          n
      <int>
    1     1
    2     2
    3     3
    4     4
    
  2. We can use Filter with nrow, which drops every entry with zero rows (a row count of 0 is coerced to FALSE), i.e.

    Filter(nrow, userFriends)
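
Note that any check based on nrow alone would also drop a zero-row tibble that still has its columns, which is the case the question wants to keep. Since the empty tibbles shown in the question are 0 x 0, one possible way to tell the two apart is to test ncol instead. The following is only a sketch: it assumes that a user who exists but follows nobody comes back as a 0 x 2 tibble, which the question does not confirm and which is worth checking against a known zero-friend account. The toy list and the name isMissing are made up for illustration.

```r
library(tibble)

# Toy stand-in for userFriends; the shapes below are assumptions for illustration.
userFriends <- list(
  tibble(),                                            # 0 x 0: account could not be retrieved
  tibble(user = character(), user_id = character()),   # 0 x 2: hypothetical zero-friend user
  tibble(user = c("1", "1"), user_id = c("10", "11"))  # 2 x 2: ordinary result
)

# Flag only the tibbles that have no columns at all.
isMissing <- vapply(userFriends, function(x) ncol(x) == 0, logical(1))

# Keep everything else, including zero-row tibbles that still have columns.
userFriendsFinal <- userFriends[!isMissing]
```

If zero-friend users turn out to be returned as 0 x 0 as well, the shape alone cannot separate them, and the distinction would have to come from elsewhere (for example, looking the accounts up separately).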
    