
I extracted the JSON from the following page:

library(jsonlite)
results <-  fromJSON("https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/.json")
final = results$data

When I inspect the output, I can see that even though it is in a "list" format, there appears to be a tabular, data-frame-like structure within it:

t3, NA, gardening, , FALSE, NA, 0, FALSE, Tree surgeon butchered my tree - will it be ok?, r/gardening, FALSE, 6, NA, 0, 140, NA, all_ads, FALSE, t3_1196op

My Question: Based on the above – is it possible to somehow convert this output into a data frame?

I tried the following code:

dataframe_list = as.data.frame(final)

The code ran, but the result is still not in a tabular/data frame format.

In the end, I would like to have the result in the following format:

  comment_id                      comment_text
1          1                 I like gardening!
2          2            I dont like to garden!
3          3             its too cold outside?
4          4 try planting something different?
5          5                    garden is fun!

Can someone please show me how to do this?

Thanks!

Note: If you look at the actual page https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/.json, the desired text appears to sit between the "body" and "edited" keys of each comment.

Maybe I am approaching this problem the wrong way and there is a better way of doing this?
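
On why as.data.frame() did not help: a likely explanation is that fromJSON() has already turned the response into a data frame, but one whose columns contain further data frames and lists, so there is nothing flat to convert. The sketch below illustrates this on a small made-up JSON string that only imitates the shape of the Reddit response (the ids, authors and bodies are invented for illustration, and the real response has many more fields). str() with max.level shows the nesting, and the last line reaches the comment fields directly.

```r
library(jsonlite)

# Hypothetical, heavily trimmed stand-in for the Reddit response:
# an array of two "Listing" objects, one for the post and one for the comments.
sample_json <- '[
  {"kind": "Listing", "data": {"children": [
    {"kind": "t3", "data": {"id": "p1", "title": "A post"}}
  ]}},
  {"kind": "Listing", "data": {"children": [
    {"kind": "t1", "data": {"id": "c1", "author": "a", "body": "first comment"}},
    {"kind": "t1", "data": {"id": "c2", "author": "b", "body": "second comment"}}
  ]}}
]'

results <- fromJSON(sample_json)

# Show only the first two levels of nesting
str(results, max.level = 2)

# results$data$children is a list with one data frame per listing;
# the second one holds the comments, and its `data` column is itself a data frame.
comments <- results$data$children[[2]]$data
comments
```

With the real URL the same path, results$data$children[[2]]$data, should lead to the top-level comments, assuming the response keeps this two-listing layout.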

2 Answers


  1. Here is one approach using pluck(), bind_rows() and unnest():

    library(jsonlite)
    library(purrr)
    library(dplyr)
    library(tidyr)
    
    URL <- "https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/.json"
    
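    fromJSON(URL) |>
      pluck("data", "children") |>   # same as .$data$children: one data frame per listing
      bind_rows() |>                 # stack the post row and the comment rows
      filter(row_number() > 1) |>    # drop row 1, which is the original post
      unnest(data) |>                # expand the nested `data` column into regular columns
      select(id, author, body) |>
      mutate(comment_id = row_number(), .before = "id")   # add a sequential comment number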
    

    Output:

    # A tibble: 75 × 4
       comment_id id      author         body                                                                                             
            <int> <chr>   <chr>          <chr>                                                                                            
     1          1 j9ktvi3 mikpgod        "It'll grow back, probably won't be able to tell by summer. Except it'll be smaller"             
     2          2 j9l0egd hrudnick       "Saw a tree surgeons advert today. Said, \"Don't worry, I hug them first.\""
     3          3 j9kyb1v anonnewengland "It will be covered in new growth in a few months."                                              
     4          4 j9kqqqk Beatnikdan     "He must've been a civil war surgeon. \n\nThey should survive but get a different tree guy to cl…
     5          5 j9n0kp8 Live-Steaky    "Very few people in there comment section actually know what’s up. It’s a fine pruning job, extr…
     6          6 j9l2gxf Luke_low       "Speaking of Tree Butchery, My parents have hired an \"amateur landscaper guy\" a bunch of times…
     7          7 j9npnl1 tomt6371       "In all honesty it looks good,and definitely could have been pollarded further, it's the right s…
     8          8 j9kpkux Amezrou        "Had a tree surgeon round today to take the height off my Hazel and Plum trees and he’s absolute…
     9          9 j9kxjyz testhec10ck    "Those cut angles all look good. This seems pretty standard for an early spring pruning"         
    10         10 j9laq63 MarieTC        "Lots of new growth will come and the tree will be fuller"                                       
    # … with 65 more rows
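
The pipeline above returns 75 rows, which suggests it covers only the top-level comments; replies appear to be nested inside each comment's `replies` field. As a rough sketch of how the nested replies could be collected as well, the code below walks a small made-up listing recursively. The JSON string is invented for illustration and only imitates the Reddit layout, and collect_comments() is a hypothetical helper, not part of any package. It assumes `replies` is either an empty string or another listing, and that comments have kind "t1".

```r
library(jsonlite)

# Hypothetical stand-in for one comment listing with a nested reply
sample_json <- '{"kind": "Listing", "data": {"children": [
  {"kind": "t1", "data": {"id": "c1", "body": "top-level comment",
    "replies": {"kind": "Listing", "data": {"children": [
      {"kind": "t1", "data": {"id": "c2", "body": "a reply", "replies": ""}}
    ]}}}},
  {"kind": "t1", "data": {"id": "c3", "body": "another top-level comment", "replies": ""}}
]}}'

# Keep the raw nested lists instead of letting jsonlite build data frames
listing <- fromJSON(sample_json, simplifyVector = FALSE)

# Hypothetical helper: returns one row per comment, depth-first
collect_comments <- function(listing, depth = 0) {
  rows <- list()
  for (child in listing$data$children) {
    if (!identical(child$kind, "t1")) next   # skip anything that is not a comment
    d <- child$data
    rows[[length(rows) + 1]] <- data.frame(
      id = d$id, depth = depth, body = d$body,
      stringsAsFactors = FALSE
    )
    # A non-empty `replies` field is itself a listing, so recurse into it
    if (is.list(d$replies)) {
      rows[[length(rows) + 1]] <- collect_comments(d$replies, depth + 1)
    }
  }
  do.call(rbind, rows)
}

comments <- collect_comments(listing)
comments$comment_id <- seq_len(nrow(comments))
comments
```

To try this on the real thread, the second element of fromJSON(URL, simplifyVector = FALSE) would be the listing to pass in, assuming the layout matches this sketch.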
    
  2. For parsing JSON from Reddit, you may want to try the RedditExtractoR package. Its get_thread_content() returns a list of 2 data.frames, one for the thread and another for the comments:

    library(dplyr)
    thread <- RedditExtractoR::get_thread_content("https://www.reddit.com/r/gardening/comments/1196opl/tree_surgeon_butchered_my_tree_will_it_be_ok/")
    
    thread$threads %>% 
      select(author, title, text) %>% 
      as_tibble()
    #> # A tibble: 1 × 3
    #>   author  title                                           text 
    #>   <chr>   <chr>                                           <chr>
    #> 1 Amezrou Tree surgeon butchered my tree - will it be ok? ""
    
    thread$comments %>% 
      select(comment_id, author, comment) %>% 
      as_tibble()
    #> # A tibble: 176 × 3
    #>    comment_id  author           comment                                         
    #>    <chr>       <chr>            <chr>                                           
    #>  1 1           mikpgod          "It'll grow back, probably won't be able to tel…
    #>  2 1_1         Amezrou          "I really hope so&"                             
    #>  3 1_1_1       mikpgod          "Hazel's difficult to kill."                    
    #>  4 1_1_1_1     symetry_myass    "&gt; Hazel's difficult to kill.\n\nI see an Um…
    #>  5 1_1_1_2     Amezrou          "Yeah but what\u0019s it going to look like whe…
    #>  6 1_1_1_2_1   EpidonoTheFool   "Not very good in my opinion a lot of weak grow…
    #>  7 1_1_1_2_2   Cold-Pack-7653   "It will eventually look normal but its going t…
    #>  8 1_1_1_2_3   lethal_moustache "Look for images of coppiced trees. Yours will …
    #>  9 1_1_1_2_3_1 LeGrandePoobah   "This is an interesting article. I\u0019m not s…
    #> 10 1_1_1_2_3_2 treecarefanatic  "this is pollarding not coppicing"              
    #> # … with 166 more rows
    

    Created on 2023-02-23 with reprex v2.0.2
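
The comment_id values in this output (1, 1_1, 1_1_1, ...) look like they encode the reply hierarchy, with one underscore-separated part per nesting level. Assuming that reading is right, a short base R sketch can derive the depth and parent of each comment. The ids below are copied from the output above, and the `depth` and `parent_id` columns are names chosen here for illustration.

```r
# Ids copied from the first rows of thread$comments shown above
comment_id <- c("1", "1_1", "1_1_1", "1_1_1_1", "1_1_1_2", "1_1_1_2_1")

# Depth = number of underscore-separated parts (1 means top-level)
depth <- lengths(strsplit(comment_id, "_", fixed = TRUE))

# Parent id = the id with its last part removed; top-level comments have none
parent_id <- ifelse(depth > 1, sub("_[^_]*$", "", comment_id), NA_character_)

hierarchy <- data.frame(comment_id, depth, parent_id, stringsAsFactors = FALSE)
hierarchy

# Keep only top-level comments
top_level <- hierarchy[hierarchy$depth == 1, ]
```

Applied to thread$comments$comment_id, the same filter on depth would give one row per top-level comment.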
