
I am trying to reduce the size of my RMarkdown html reports and, more importantly, make them faster to open. Each report contains a large number of R Plotly plots, and each plot contains a large number of data points (1000+). Since R Plotly stores the raw data for every plot inside the html file, I thought rounding the data to fewer decimal places would be a good way to reduce the file size. However, even when the input data is rounded, R Plotly still writes a large number of decimal places to the html file, so rounding does not reduce the file size.

See below for two cases: the base case with raw data, and the rounding case with rounded data. The file size is the same for both cases.

Base Case HTML

library(plotly)
library(htmlwidgets)

RawData <- data.frame(Date = seq(as.Date("2024/1/1"), by = "month", length.out = 12),
                      PreciseValue = c(0.1516270, 0.3542629, 0.8339342, 0.5796813, 0.3933472, 0.2937137, 0.1779205, 0.4285533, 0.6841885, 0.3399411,0.99476560, 0.42941527))
RawData$RoundValue <- round(RawData$PreciseValue, 2)
fig <- plot_ly(RawData, type = 'scatter', mode = 'lines') %>%
  add_trace(x = ~Date, y = ~PreciseValue, name = 'PreciseValue')
saveWidget(fig, "plotly_base.html", selfcontained = TRUE)

The html file size is 3780kb.
If I open the html file and look at the underlying R Plotly data, the stored y data is:

"y":[0.15162704353000001,0.35426295622999998,0.83393426323999997,0.57968136341999998,0.39334726234,0.29371352347000002,0.17792423404999999,0.44352285533000002,0.68418423485000002,0.36623994110000002,0.99476432455999997,0.42941523452699998]

Notice that there are more decimal places than in the original data.

Rounding Values Case

RawData$RoundValue <- round(RawData$PreciseValue,2)
fig <- plot_ly(RawData, type = 'scatter', mode = 'lines')%>%
  add_trace(x = ~Date, y = ~RoundValue, name = 'RoundValue')

saveWidget(fig, "plotly_round.html", selfcontained = TRUE)

The html file size for the round case is also 3780kb.
The underlying data for this case is:

"y":[0.14999999999999999,0.34999999999999998,0.82999999999999996,0.57999999999999996,0.39000000000000001,0.28999999999999998,0.17999999999999999,0.44,0.68000000000000005,0.37,0.98999999999999999,0.42999999999999999]

The stored y data should be something like

"y":[0.15, 0.35, 0.83, 0.58, 0.39, 0.29, 0.18, 0.44, 0.68, 0.37, 0.99, 0.43]
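
The long tails in the stored values are consistent with doubles being printed at about 17 significant digits: 0.15 has no exact binary representation, so the nearest double shows up as 0.14999999999999999. Below is a minimal base-R sketch of that effect. The `%.17g` format is an assumption, chosen because it reproduces values like those above; it is not a statement about how plotly or htmlwidgets serialize data internally.

```r
# Rounded inputs, as produced by round(x, 2).
rounded <- c(0.15, 0.44, 0.37)

# Printing with 17 significant digits exposes the nearest binary double.
# (Assumption: "%.17g" stands in for a high-precision JSON writer.)
long_form <- sprintf("%.17g", rounded)

# Printing with 15 significant digits gives back the short decimal form.
short_form <- sprintf("%.15g", rounded)

print(long_form)
print(short_form)
```

If the writer keeps this many digits, rounding the input alone cannot shorten the output, because the rounded value is still stored as the nearest double.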

Does anyone know how to configure R Plotly to only store the configured number of decimal places in html output?

2 Answers


  1. You can post-process the saved html file with a regular expression that keeps only two decimal places:

    library(plotly)
    library(htmlwidgets)
    
    RawData <- data.frame(
      Date = seq(as.Date("2024/1/1"), by = "month", length.out = 12),
      PreciseValue = c(0.1516270, 0.3542629, 0.8339342, 0.5796813, 0.3933472, 0.2937137, 0.1779205, 0.4285533, 0.6841885, 0.3399411, 0.99476560, 0.42941527)
    )
    
    fig <- plot_ly(RawData, type = 'scatter', mode = 'lines') %>%
      add_trace(x = ~Date, y = ~PreciseValue, name = 'PreciseValue')
    
    saveWidget(fig, "plotly_base.html", selfcontained = TRUE)
    
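    # Read the saved widget back in as text
    html_content <- readLines("plotly_base.html")
    # Keep only the first two decimal places of every decimal number.
    # Backslashes must be doubled inside R string literals.
    html_content <- gsub(
      pattern = "([0-9]+\\.[0-9]{2})[0-9]*", 
      replacement = "\\1", 
      x = html_content
    )
    
    writeLines(html_content, "plotly_rounded.html")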
    

    which gives the result shown at https://i.sstatic.net/IYQ9sxpW.png

    and in the html you’ll have:

    <script type="application/json" data-for="htmlwidget-f7b8f28dbb6776fbb0e5">{"x":{"visdat":{"60443a954e78":["function () ","plotlyVisDat"]},"cur_data":"60443a954e78","attrs":{"60443a954e78":{"mode":"lines","alpha_stroke":1,"sizes":[10,100],"spans":[1,20],"type":"scatter"},"60443a954e78.1":{"mode":"lines","alpha_stroke":1,"sizes":[10,100],"spans":[1,20],"type":"scatter","x":{},"y":["0.15","0.35","0.83","0.58","0.39","0.29","0.18","0.43","0.68","0.34","0.99","0.43"],"name":"RoundValue","inherit":true}},"layout":{"margin":{"b":40,"l":60,"t":25,"r":10},"xaxis":{"domain":[0,1],"automargin":true,"title":"Date"},"yaxis":{"domain":[0,1],"automargin":true,"title":[]},"hovermode":"closest","showlegend":true},"source":"A","config":{"modeBarButtonsToAdd":["hoverclosest","hovercompare"],"showSendToCloud":false},"data":[{"mode":"lines","type":"scatter","marker":{"color":"rgba(31,119,180,1)","line":{"color":"rgba(31,119,180,1)"}},"error_y":{"color":"rgba(31,119,180,1)"},"error_x":{"color":"rgba(31,119,180,1)"},"line":{"color":"rgba(31,119,180,1)"},"xaxis":"x","yaxis":"y","frame":null},{"mode":"lines","type":"scatter","x":["2024-01-01","2024-02-01","2024-03-01","2024-04-01","2024-05-01","2024-06-01","2024-07-01","2024-08-01","2024-09-01","2024-10-01","2024-11-01","2024-12-01"],"y":["0.15","0.35","0.83","0.58","0.39","0.29","0.18","0.43","0.68","0.34","0.99","0.43"],"name":"RoundValue","marker":{"color":"rgba(255,127,14,1)","line":{"color":"rgba(255,127,14,1)"}},"error_y":{"color":"rgba(255,127,14,1)"},"error_x":{"color":"rgba(255,127,14,1)"},"line":{"color":"rgba(255,127,14,1)"},"xaxis":"x","yaxis":"y","frame":null}],"highlight":{"on":"plotly_click","persistent":false,"dynamic":false,"selectize":false,"opacityDim":0.20000000000000001,"selected":{"opacity":1},"debounce":0},"shinyEvents":["plotly_hover","plotly_click","plotly_selected","plotly_relayout","plotly_brushed","plotly_brushing","plotly_clickannotation","plotly_doubleclick","plotly_deselect","plotly_afterplot","plotly_sunburstclick"],"base_url":"https://plot.ly"},"evals":[],"jsHooks":[]}</script>
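
Note that a pattern of this kind truncates rather than rounds, and it applies to every decimal number in the file, which may include numbers outside the plot data. A more conservative variant, sketched below in base R, rounds only literals that carry a long decimal tail. `round_long_decimals()` and its `min_decimals` threshold are hypothetical names introduced here for illustration; they are not part of plotly or htmlwidgets.

```r
# Hypothetical helper: round numeric literals that carry a long decimal tail.
round_long_decimals <- function(lines, digits = 2, min_decimals = 10) {
  # Match e.g. 0.14999999999999999 but leave short values such as 0.44 alone.
  pattern <- sprintf("[0-9]+\\.[0-9]{%d,}", min_decimals)
  hits <- gregexpr(pattern, lines)
  # Replace each match with its rounded value, written as a plain number.
  regmatches(lines, hits) <- lapply(
    regmatches(lines, hits),
    function(found) as.character(round(as.numeric(found), digits))
  )
  lines
}

sample_json <- '"y":[0.14999999999999999,0.44,0.68000000000000005]'
print(round_long_decimals(sample_json))
```

The same function could be applied to the output of `readLines("plotly_base.html")` before calling `writeLines()`. Values stay unquoted numbers in this sketch, and it is worth opening the resulting file to confirm the plot still renders.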
    
    
  2. When you make a plot from a data.frame, plotly saves the underlying data.frame in the browser, so it doesn’t matter which columns you plot if you don’t remove the unused ones from the underlying data.frame.

    In the example below:

    • For the new plot, notice that once the PreciseValue column is removed, it is no longer passed to your figure.
    • If your real dataset contains many other columns, consider removing those as well.

    library(dplyr)
    library(plotly)
    
    
    # your fig ----------------------------------------------------------------
    RawData <- data.frame(Date = seq(as.Date("2024/1/1"), by = "month", length.out = 12),
                          PreciseValue = c(0.1516270, 0.3542629, 0.8339342, 0.5796813, 0.3933472, 0.2937137, 0.1779205, 0.4285533, 0.6841885, 0.3399411,0.99476560, 0.42941527))
    RawData$RoundValue <- round(RawData$PreciseValue,2)
    fig <- plot_ly(RawData, type = 'scatter', mode = 'lines')%>%
      add_trace(x = ~Date, y = ~PreciseValue, name = 'PreciseValue')
    
    plotly::plotly_data(fig)
    #> # A tibble: 12 × 3
    #>    Date       PreciseValue RoundValue
    #>    <date>            <dbl>      <dbl>
    #>  1 2024-01-01        0.152       0.15
    #>  2 2024-02-01        0.354       0.35
    #>  3 2024-03-01        0.834       0.83
    #>  4 2024-04-01        0.580       0.58
    #>  5 2024-05-01        0.393       0.39
    #>  6 2024-06-01        0.294       0.29
    #>  7 2024-07-01        0.178       0.18
    #>  8 2024-08-01        0.429       0.43
    #>  9 2024-09-01        0.684       0.68
    #> 10 2024-10-01        0.340       0.34
    #> 11 2024-11-01        0.995       0.99
    #> 12 2024-12-01        0.429       0.43
    
    
    # new fig -----------------------------------------------------------------
    RawData <- data.frame(
      Date = seq(as.Date("2024/1/1"), by = "month", length.out = 12),
      PreciseValue = c(0.1516270, 0.3542629, 0.8339342, 0.5796813, 0.3933472, 0.2937137, 0.1779205, 0.4285533, 0.6841885, 0.3399411,0.99476560, 0.42941527)
      ) %>% 
      mutate(
        RoundValue = round(PreciseValue, 2)
      ) %>% 
      select(
        Date, RoundValue
      )
    
    
    fig2 <- plot_ly(RawData, type = 'scatter', mode = 'lines')%>%
      add_trace(x = ~Date, y = ~RoundValue, name = 'RoundValue')
    
    
    plotly::plotly_data(fig2)
    #> # A tibble: 12 × 2
    #>    Date       RoundValue
    #>    <date>          <dbl>
    #>  1 2024-01-01       0.15
    #>  2 2024-02-01       0.35
    #>  3 2024-03-01       0.83
    #>  4 2024-04-01       0.58
    #>  5 2024-05-01       0.39
    #>  6 2024-06-01       0.29
    #>  7 2024-07-01       0.18
    #>  8 2024-08-01       0.43
    #>  9 2024-09-01       0.68
    #> 10 2024-10-01       0.34
    #> 11 2024-11-01       0.99
    #> 12 2024-12-01       0.43
    

    Created on 2024-05-20 with reprex v2.1.0
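
To check whether a change like this actually shrinks the output, it may help to compare the saved files on disk. The sketch below uses only base R; `compare_sizes()` is a hypothetical helper name, and the temporary files are stand-ins so that the example runs on its own. Keep in mind that a file saved with `selfcontained = TRUE` also bundles the JavaScript library, so a 12-point example is unlikely to move the total noticeably.

```r
# Hypothetical helper: report the on-disk size of saved files in kilobytes.
compare_sizes <- function(paths) {
  data.frame(
    file = basename(paths),
    size_kb = round(file.size(paths) / 1024, 1)
  )
}

# Stand-in files so the sketch runs on its own; in practice pass the paths
# written by saveWidget(), e.g. "plotly_base.html" and "plotly_round.html".
demo_dir <- tempfile("size_demo")
dir.create(demo_dir)
big <- file.path(demo_dir, "big.html")
small <- file.path(demo_dir, "small.html")
writeLines(strrep("0.14999999999999999,", 1000), big)
writeLines(strrep("0.15,", 1000), small)

sizes <- compare_sizes(c(big, small))
print(sizes)
```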
