I would like to export content from cells in a dataframe into separate html files. I would also like to name the files according to the corresponding cell in another column, but cannot make it work.

I am using a for loop to export each cell into an html file, and it works, but I cannot make it grab the appropriate filename from another cell.

Example dataframe:

names   <- c("alpha", "beta", "gamma", "delta", "epsilon")
content <- c(paste("<div>Example", 1:5, "</div>"))
df <- data.frame(names, content)

This is my for loop, but I need to grab the corresponding data from df$names:

for (x in df$content) {
 filename <-  # the corresponding row in df$names + ".html"
 writeLines(x, filename)
} 

The expected outcome is five html files in the working folder, each holding the corresponding content, named:

alpha.html,
beta.html,
etc.

2 Answers


  1. Chosen as BEST ANSWER

    I found out that which() can find the corresponding index number, so I am posting a self-answer.

    names   <- c("alpha", "beta", "gamma", "delta", "epsilon")
    content <- c(paste("<div>Example", 1:5, "</div>"))
    df <- data.frame(names, content)
    
    for (x in df$content) {
      # which() finds the row whose content equals x (assumes content values are unique)
      # paste0() joins without a separator; paste() would give "alpha .html"
      filename <- paste0(df$names[which(df$content == x)], ".html")
      writeLines(x, filename)
    }
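
    One caveat with looking up the row by its content: the lookup only identifies a single row when every content value is unique. The sketch below uses made-up data (the duplicated strings are an assumption for illustration, not part of the original example) to show what which() returns when two cells hold the same text:

    ```r
    # Hypothetical data: the first two rows deliberately share the same content
    names   <- c("alpha", "beta", "gamma")
    content <- c("<div>Same</div>", "<div>Same</div>", "<div>Other</div>")
    df <- data.frame(names, content)

    # Matching on content returns every row that holds that text
    hits <- which(df$content == df$content[1])
    hits

    # So the lookup yields more than one file name for a single cell
    candidate_files <- paste0(df$names[hits], ".html")
    candidate_files
    ```

    Here hits has two elements, so the loop would no longer get exactly one file name per cell. Looping over row indices avoids the lookup altogether.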
    

  2. With a for-loop you are better off looping over indices and using those for subsetting:

    names   <- c("alpha", "beta", "gamma", "delta", "epsilon")
    content <- c(paste("<div>Example", 1:5, "</div>"))
    df <- data.frame(names, content)
    
    for (idx in seq_along(df$content)) {
      filename <-  paste0(df$names[idx],"_for.html")
      writeLines(df$content[idx], filename)
    } 
    

    There are also the *apply, purrr::*map() and purrr::*walk() families. This particular case is well suited to pwalk(), which iterates over multiple lists in parallel (here, the columns of the data.frame) for a side effect (writing files):

    purrr::pwalk(df, \(names, content) writeLines(content, paste0(names, "_pwalk.html")))
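
    For the *apply family mentioned above, a base R version might look like the following. This is only a sketch: the "_mapply.html" suffix and the out_dir variable are assumptions added for illustration, and it assumes R >= 4.0, where data.frame() keeps strings as character rather than factor.

    ```r
    names   <- c("alpha", "beta", "gamma", "delta", "epsilon")
    content <- paste("<div>Example", 1:5, "</div>")
    df <- data.frame(names, content)

    # out_dir is a hypothetical output folder; tempdir() keeps the sketch self-contained
    out_dir <- tempdir()
    paths <- file.path(out_dir, paste0(df$names, "_mapply.html"))

    # mapply() steps through both vectors in parallel, one file per row;
    # invisible() hides the list of NULLs that writeLines() returns
    invisible(mapply(function(txt, path) writeLines(txt, path), df$content, paths))
    ```

    Replacing out_dir with "." would write into the working folder, as in the other examples.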
    

    Resulting files:

    fs::dir_info(glob = "*for.html")[,1:3]
    #> # A tibble: 5 × 3
    #>   path             type         size
    #>   <fs::path>       <fct> <fs::bytes>
    #> 1 alpha_for.html   file           23
    #> 2 beta_for.html    file           23
    #> 3 delta_for.html   file           23
    #> 4 epsilon_for.html file           23
    #> 5 gamma_for.html   file           23
    
    fs::dir_info(glob = "*pwalk.html")[,1:3]
    #> # A tibble: 5 × 3
    #>   path               type         size
    #>   <fs::path>         <fct> <fs::bytes>
    #> 1 alpha_pwalk.html   file           23
    #> 2 beta_pwalk.html    file           23
    #> 3 delta_pwalk.html   file           23
    #> 4 epsilon_pwalk.html file           23
    #> 5 gamma_pwalk.html   file           23
    

    Created on 2024-06-28 with reprex v2.1.0
