skip to Main Content

I would like to convert a list like this into a tibble.

lst <- list(
  "A"=list(
    "Category"="A",
    "Team"=c("x"),
    "City"="N"
  ),
  "C"=list(
    "Category"="C",
    "Team"=c(
      "x","v","z"
    ),
    "City"="V"
  ),
  "F"=list(
    "Category"="F",
    "Team"=c("o")
  ),
  "K"=list(
    "Category"="K",
    "Team"=c(
      "f","h","y"
    ),
    "City"="T"
  ),
  "B"=list(
    "Category"="B",
    "City"=c("Q")
  )
)

There can be multiple items in the list. I have only three in this example: Category, Team and City. Category will always have only one unique value without duplication while the other fields can be one, many or missing.

I need one unique Category per row. Any other column that has multiple values for a category must be come a nested list.

This is the output I am looking for:

  Category Team      City 
  <chr>    <list>    <chr>
1 A        <chr [1]> N    
2 C        <chr [3]> V    
3 F        <chr [1]> NA   
4 K        <chr [3]> T    
5 B        <chr [1]> Q  

This was manually created by

library(purrr)
library(dplyr)

d <- map_dfr(lst,~as_tibble(.x))
reframe(d,across(Team,list),.by=Category) |>
  left_join(distinct(select(d,-Team),Category,.keep_all=TRUE), by="Category")

I am looking for a solution that would work for any number of fields/columns and any number of items in any column (except Category which will have only one item).

One of the many failed attempts:

map_dfr(lst,~as_tibble(.x)) |>
  group_by(Category) |> 
  summarise(
    across(everything(), ~ {
      if (n_distinct(.) == 1) {
        first(.)
      } else {
        list(.)
      }
    })
  )

Error in `summarise()`:
ℹ In argument: `across(...)`.
Caused by error:
! `Team` must return compatible vectors across groups.
ℹ Result of type <character> for group A: `Category = "A"`.
ℹ Result of type <list> for group C: `Category = "C"`.
Run `rlang::last_trace()` to see where the error occurred.

Session

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Ubuntu 22.04.4 LTS
dplyr_1.1.4
tidyr_1.3.1
purrr_1.0.2

5

Answers


  1. Maybe this solve your problem

    lst %>% 
      dplyr::bind_rows() %>% 
      tidyr::nest(Team = Team)
    
    # A tibble: 5 × 3
      Category City  Team            
      <chr>    <chr> <list>          
    1 A        N     <tibble [1 × 1]>
    2 C        V     <tibble [3 × 1]>
    3 F        NA    <tibble [1 × 1]>
    4 K        T     <tibble [3 × 1]>
    5 B        Q     <tibble [1 × 1]>
    
    Login or Signup to reply.
  2. Perhaps something like:

    list_transpose(lst, default = NA, template = unique(unlist(map(lst, names)))) |> 
      as_tibble() |> 
      mutate(across(where(is.list), (l) if (all(lengths(l) == 1)) unlist(l) else l))
    
    # A tibble: 5 × 3
      Category Team         City 
      <chr>    <named list> <chr>
    1 A        <chr [1]>    N    
    2 C        <chr [3]>    V    
    3 F        <chr [1]>    NA   
    4 K        <chr [3]>    T    
    5 B        <lgl [1]>    Q
    
    Login or Signup to reply.
  3. I hope this might work out well. tested on a new element with a new column.

    lst <- list(
      .... PREVIOUS lst LIST ....
    
      my_addition = list(
        Category = "Z",
        Team = state.name[1:2],
        City = letters[1:3],
        Zip = state.abb[1:4]
        
      )
    )
    
    lst_to_df <- function(lst_item) {
      tibble(
        Category = lst_item$Category,
        !!!map(lst_item[-1], list)
      )
    }
    
    map_dfr(lst, lst_to_df, .id = "List_Name") # remove `.id` to change from `my_addition` to `Z` as you probably really want
    
    Category Team      City      Zip      
      <chr>    <list>    <list>    <list>   
    1 A        <chr [1]> <chr [1]> <NULL>   
    2 C        <chr [3]> <chr [1]> <NULL>   
    3 F        <chr [1]> <NULL>    <NULL>   
    4 K        <chr [3]> <chr [1]> <NULL>   
    5 B        <NULL>    <chr [1]> <NULL>   
    6 Z        <chr [2]> <chr [3]> <chr [4]>
    > 
    
    Login or Signup to reply.
  4. Here is another solution that works for any number of fields/columns and any number of items in any column. Since you wish to mantain a single row by Category level, each column correspond to a list with the corresponding distinct values.

    library(rlang)
    
    # Bind lists as tibble
    df <- lst |>
      bind_rows() 
    
    # Get the column name used as key
    key <- colnames(df)[1]
    
    # Map over all columns and nest distinct values
    map(colnames(df)[-1], function(x){
      df |>
        select(all_of(c(key, x))) |>
        group_by(!!sym(key)) |>
        distinct(!!sym(x)) |>
        nest(!!sym(x):= x)
    }) |>
      # Reduce the lists using left_join
      reduce(left_join, by = key) 
    
    # A tibble: 5 × 3
    # Groups:   Category [5]
    #  Category Team             City            
    #  <chr>    <list>           <list>          
    #1 A        <tibble [1 × 1]> <tibble [1 × 1]>
    #2 C        <tibble [3 × 1]> <tibble [1 × 1]>
    #3 F        <tibble [1 × 1]> <tibble [1 × 1]>
    #4 K        <tibble [3 × 1]> <tibble [1 × 1]>
    #5 B        <tibble [1 × 1]> <tibble [1 × 1]>
    
    Login or Signup to reply.
  5. An approach using unnest_wider

    library(dplyr)
    library(tidyr)
    
    tibble(lst) %>% unnest_wider(lst)
    # A tibble: 5 × 3
      Category Team      City 
      <chr>    <list>    <chr>
    1 A        <chr [1]> N    
    2 C        <chr [3]> V    
    3 F        <chr [1]> NA   
    4 K        <chr [3]> T    
    5 B        <NULL>    Q
    

    Note that the non-existing Team is kept as NULL.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search