
I am trying to build a container (from a Docker base image) using Apptainer. I have a long list of R packages that need to be installed at specific versions, and I need to install them directly from CRAN/Bioconductor, not from conda. For each package I try to install, I may hit a roadblock due to a missing system library and then have to go back and add it with apt install. This becomes an iterative process and takes a long time. Is there an interactive way to build the container? I tried using a sandbox with fakeroot, but it does not give full write access when installing the system libraries. What other solution could I use?
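
For reference, the sandbox workflow I was attempting looks roughly like this (the directory and file names are just what I use locally):

apptainer build --sandbox r_sandbox/ container.def
apptainer shell --writable --fakeroot r_sandbox/
# ...experiment with apt install / install.packages() interactively...
apptainer build container.sif container.def

My current definition file is below.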

Bootstrap: docker
From: ubuntu:22.04
%labels
    Author author_name
    version v0.1

%post 
    export DEBIAN_FRONTEND=noninteractive
    
    # Update system and install requirements
    apt-get -qq update
    

    apt -y install build-essential gfortran curl cmake pkg-config zip unzip \
                   libglib2.0-0 libpango-1.0-0 libpangocairo-1.0-0 libpaper-utils \
                   libtcl8.6 libtk8.6 libxt6 lmod

    apt -y install libbz2-dev libicu-dev liblzma-dev libopenblas-dev libpq-dev \
                   libpcre2-dev zlib1g-dev libssl-dev libudunits2-dev libncurses-dev libcairo2-dev \
                   libsodium-dev libmariadb-dev libcurl4-openssl-dev libssh2-1-dev libxml2-dev \
                   libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev libfontconfig1-dev \
                   libharfbuzz-dev libfribidi-dev liblapack-dev libgit2-dev libmagick++-dev
    
    
    curl -o /tmp/libicu60_60.2-3ubuntu3.2_amd64.deb http://security.ubuntu.com/ubuntu/pool/main/i/icu/libicu60_60.2-3ubuntu3.2_amd64.deb
    dpkg -i /tmp/libicu60_60.2-3ubuntu3.2_amd64.deb

  
    # Install R with the recommended packages
    export R_VERSION=4.2.2
    curl -o /tmp/r-${R_VERSION}_1_amd64.deb https://cdn.rstudio.com/r/ubuntu-2204/pkgs/r-${R_VERSION}_1_amd64.deb
    dpkg -i /tmp/r-${R_VERSION}_1_amd64.deb

    ln -s /opt/R/${R_VERSION}/bin/R /usr/local/bin/R
    ln -s /opt/R/${R_VERSION}/bin/Rscript /usr/local/bin/Rscript


    # Install the R packages
    
    R -e 'install.packages("remotes", dependencies = TRUE, repos = c("https://cloud.r-project.org/", "http://rforge.net/"))'
    R -e 'remotes::install_version(package = "BiocManager", version = "1.30.22", dependencies = TRUE, repos = c("https://cloud.r-project.org/", "http://rforge.net/"), upgrade = "never")'

    R -e 'remotes::install_version(package = "dplyr", version = "1.1.4", dependencies = TRUE, repos = c("https://cloud.r-project.org/", "http://rforge.net/"), upgrade = "never")' 
    R -e 'remotes::install_version(package = "ggplot2", version = "3.5.1", dependencies = TRUE, repos = c("https://cloud.r-project.org/", "http://rforge.net/"), upgrade = "never")'  ##### downgraded to 3.4.3
    R -e 'remotes::install_version(package = "ggpubr", version = "0.6.0", dependencies = TRUE, repos = c("https://cloud.r-project.org/", "http://rforge.net/"), upgrade = "never")' 

2 Answers


  1. As you’re using Ubuntu, an alternative approach would be to use the Ubuntu binary versions of CRAN packages provided by r2u. This has several advantages, as set out in the docs:

    • Full integration with apt as every binary resolves all its dependencies: No more
      installations (of pre-built archives) only to discover that a shared library is missing. No more
      surprises.
    • Full integration with apt so that an update of a system library cannot break an R package:
      if a (shared) library is used by a CRAN package, the package manager knows, and will not remove it.
      No more (R package) breakage from (system) library updates.
    • Simpler and lighter than some alternatives as only run-time library packages are installed as
      dependencies (instead of generally heavier development packages).
    • Installations are fast, automated and reversible thanks to the package management layer.

    This can be done with three files:

    1. Dockerfile.
    2. install.r.
    3. pkgs_to_install.txt

    Dockerfile

    This looks like:

    FROM eddelbuettel/r2u:noble
    WORKDIR /r
    COPY install.r .
    COPY pkgs_to_install.txt .
    ENV TZ=Europe/London
    RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
    RUN Rscript install.r
    

    You may wish to set the time zone to your local area.

    install.r

    pkgs_to_install <- readLines("./pkgs_to_install.txt")
    install.packages("remotes")
    remotes::install_github("REditorSupport/languageserver") # to prevent vscode warnings
    Sys.setenv(R_INSTALL_STAGED = FALSE) # prevents occasional 00LOCK permissions issues
    
    for (pkg in pkgs_to_install) {
        try(install.packages(pkg))
    }
    

    pkgs_to_install.txt

    In your case this could be:

    dplyr
    ggplot2
    ggpubr
    

    My real setup is slightly more complex. I have an additional R script which runs at the end; it makes a few small changes to the R profile and also writes the output of installed.packages() to a CSV file, so it is clear which version of each package is installed in the container.
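
    A minimal version of that last step might look like this (the output path is just an example):

    Rscript -e 'write.csv(installed.packages()[, c("Package", "Version")], "/r/installed_packages.csv", row.names = FALSE)'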

    The overall point, though, is that it is easy to add whatever features you like with this workflow. It should also avoid the dependency nightmares that seem inevitable with the various C++ libraries that certain packages require. Also, as all the packages are installed from pre-compiled binaries, rather than source, it is much faster to build.
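
    Since the original question is about Apptainer rather than Docker, one option is to build the Docker image as usual and then convert it into a SIF from the local Docker daemon; the image and tag names below are only placeholders:

    docker build -t my_r2u_image .
    apptainer build my_r2u_image.sif docker-daemon://my_r2u_image:latest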

  2. In my experience, the iterative cycle of installing R/Python packages, discovering missing system packages, installing the system package, and trying again is unavoidable!

    I agree it’s easier to develop and test interactively. This is possible with docker run --rm -ti [image name] bash, which gives you a bash shell inside a container started from the built image.

    I believe it’s also possible with docker exec if the container is already running with a different CMD statement.

    Then you can launch the R console with $ R and try the > install.packages() call. If it complains about a missing system package, > quit() and run $ apt-get install ... or whatever is appropriate for your OS, then try again.
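
    As a sketch, one round of that loop inside the container could look like this (libxml2-dev / xml2 are just an example pairing):

    # inside the container shell from `docker run --rm -ti my_image bash`
    R -e 'install.packages("xml2")'   # may fail if the libxml2 headers are missing
    apt-get update && apt-get install -y libxml2-dev
    R -e 'install.packages("xml2")'   # retry once the system library is in place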

    Finally, add all the package install commands you found necessary to your Dockerfile.

    docker build -t my_image .
    docker run --rm -ti my_image bash
    