
I have a docker image that’s primarily set up to have a GitHub Actions pipeline run on it doing a bunch of stuff in R, that I also want to be able to use to run some Julia scripts that are pretty simple. The image builds totally fine and all the R components work well, but for some reason, even though Julia appears to install, I can’t get the packages themselves to be available?

Here’s the Dockerfile:

    FROM ubuntu:jammy
    
    # all the R install stuff 
    ENV R_VERSION=4.2.3
    ENV R_HOME=/usr/local/lib/R
    ENV TZ=Etc/UTC
    
    COPY scripts/install_R_source.sh /temp_scripts/install_R_source.sh
    
    RUN /temp_scripts/install_R_source.sh
    ENV CRAN=https://packagemanager.posit.co/cran/__linux__/jammy/2023-04-20
    ENV LANG=en_US.UTF-8
    
    # set up R 
    COPY scripts/setup_R.sh /temp_scripts/setup_R.sh
    RUN /temp_scripts/setup_R.sh
    
    # do the R things we want
    RUN install2.r devtools remotes
    RUN R -e "Sys.setenv("NOT_CRAN" = TRUE); Sys.setenv("LIBARROW_MINIMAL" = FALSE); Sys.setenv("LIBARROW_BINARY" = FALSE)"
    RUN R -e "devtools::install_github('ropensci/rglobi')"
    
    # set up Julia
    COPY scripts/install_julia.sh /temp_scripts/install_julia.sh
    RUN /temp_scripts/install_julia.sh
    
    CMD ["julia", "R"]
    
    # do the julia things we want
    #RUN julia -e 'using Pkg; Pkg.activate("."); Pkg.instantiate()'
    RUN julia -e 'import Pkg; Pkg.activate("."); Pkg.add("CSV"); Pkg.add("DataFrames"); Pkg.add(Pkg.PackageSpec(name="NCBITaxonomy", rev="main"))'
    #RUN julia --project -e 'import Pkg; Pkg.activate("."); Pkg.add(PackageSpec(name="NCBITaxonomy", rev="main"))'
    #RUN set -eux; 
    #    mkdir "$JULIA_USER_HOME";
    
    RUN julia -e 'using Pkg; Pkg.instantiate();'
    
    # note, R.utils is needed for datatable to work with csv.gz files
    RUN install2.r --error --skipinstalled --ncpus -1 \
        readr \
        taxize \
        magrittr \
        dplyr \
        tidyr \
        RCurl \
        vroom \
        fs \
        zip \
        devtools \
        lubridate \
        yaml \
        R.utils \
        here \
        data.table \
        JuliaCall \
        && rm -rf /tmp/downloaded_packages

The Actions pipeline then runs fine until it reaches the first bit of Julia code, which can’t find the packages. The error is:


     ERROR: LoadError: ArgumentError: Package DataFrames [a93c6f00-e57d-5684-b7b6-d8193f3e46c0] is required but does not seem to be installed:
     - Run `Pkg.instantiate()` to install all recorded dependencies.

I’m hoping this is just a stupid error on my part not setting an environment variable or something of that type?

The install_julia.sh is here, which I adapted from rocker’s version:


    #!/bin/bash
    set -e
    
    ## build ARGs
    NCPUS=${NCPUS:--1}
    
    #JULIA_VERSION=${1:-${JULIA_VERSION:-latest}}
    
    # a function to install apt packages only if they are not installed
    function apt_install() {
        if ! dpkg -s "$@" >/dev/null 2>&1; then
            if [ "$(find /var/lib/apt/lists/* | wc -l)" = "0" ]; then
                apt-get update
            fi
            apt-get install -y --no-install-recommends "$@"
        fi
    }
    
    ARCH_LONG=$(uname -p)
    ARCH_SHORT=$ARCH_LONG
    
    if [ "$ARCH_LONG" = "x86_64" ]; then
        ARCH_SHORT="x64"
    fi
    
    apt_install wget ca-certificates
    
    
    # Download Julia and create a symbolic link.
    wget "https://julialang-s3.julialang.org/bin/linux/x64/1.7/julia-1.7.3-linux-x86_64.tar.gz"
    mkdir /opt/julia
    tar zxvf "julia-1.7.3-linux-x86_64.tar.gz" -C /opt/julia --strip-components 1
    rm -f "julia-1.7.3-linux-x86_64.tar.gz"
    ln -s /opt/julia/bin/julia /usr/local/bin/julia
    
    julia --version
    
    # Clean up
    rm -rf /var/lib/apt/lists/*
    rm -rf /tmp/downloaded_packages
    
    ## Strip binary installed libraries from RSPM
    ## https://github.com/rocker-org/rocker-versioned2/issues/340
    strip /usr/local/lib/R/site-library/*/libs/*.so

2 Answers


  1. Your code mixes up two environments: it installs the packages into a local environment but then instantiates the global one.

    Where you run Pkg.activate("."), you make the current folder the local environment, and then you install the packages into that local environment.
    However, your using Pkg; Pkg.instantiate(); is run against the global environment.

    Ideally, you should place a Project.toml file in the folder where you want your local environment to be and then run: using Pkg; Pkg.activate("."); Pkg.instantiate();.
    Later, when you run programs in that environment, you should also start them with Pkg.activate(".").

    Notes:

    • the dot '.' means the current folder; you might want to provide a full absolute path instead (depending on how your scripts are laid out)
    • the julia executable has the --project option to set the virtual environment
    • for more details, see https://pkgdocs.julialang.org/v1/environments/
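
    The note about --project and absolute paths can be sketched as follows. This is a minimal illustration, not the asker's setup: the ENV_DIR variable and its default (a julia_env folder under the current directory) are assumptions, and the julia calls are skipped when julia is not on the PATH.

    ```shell
    #!/bin/sh
    set -e

    # ENV_DIR is a hypothetical variable; point it at any absolute path
    # (in a Dockerfile this could be something like /opt/julia_env).
    ENV_DIR="${ENV_DIR:-$PWD/julia_env}"
    mkdir -p "$ENV_DIR"

    if command -v julia >/dev/null 2>&1; then
        # Install into the environment rooted at ENV_DIR; Pkg writes
        # Project.toml and Manifest.toml into that folder.
        julia --project="$ENV_DIR" -e 'import Pkg; Pkg.add(["CSV", "DataFrames"])'

        # Later runs must name the same environment to see the packages.
        julia --project="$ENV_DIR" -e 'using DataFrames; println(Base.active_project())'
    else
        echo "julia not found on PATH; skipping" >&2
    fi
    ```

    Because both calls name the same absolute path, it no longer matters which working directory the later command starts in.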
  2. Cause:

    When you do

        RUN julia -e 'import Pkg; Pkg.activate("."); Pkg.add("CSV"); Pkg.add("DataFrames"); Pkg.add(Pkg.PackageSpec(name="NCBITaxonomy", rev="main"))'
    

    the Pkg.activate(".") call activates an environment in the working directory (creating it if it does not exist yet), so CSV, DataFrames, etc. get installed into that environment. Your Action is likely using the global default environment, so it cannot find these packages.
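
    One way to check this is to print which project file Julia treats as active, with and without the --project flag. The sketch below is only a diagnostic aid: the function name show_julia_env is hypothetical, and the calls are skipped when julia is not on the PATH.

    ```shell
    #!/bin/sh
    # show_julia_env is a hypothetical helper name, not part of Julia or Docker.
    show_julia_env() {
        if ! command -v julia >/dev/null 2>&1; then
            echo "julia not found on PATH; skipping"
            return 0
        fi
        # Default environment: what a plain `julia script.jl` uses
        # unless JULIA_PROJECT is set.
        julia -e 'println("default: ", Base.active_project())'
        # Environment rooted at the current directory, which is what
        # Pkg.activate(".") selects.
        julia --project=. -e 'println("local:   ", Base.active_project())'
    }

    show_julia_env
    ```

    If the two printed paths differ, packages added after Pkg.activate(".") are recorded in the second project and are not visible to a script started in the first.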

    Solution:

    For a Docker container used for GitHub Actions, the easiest solution is probably just to remove the Pkg.activate(".") call. Usually, virtual environments are encouraged in Julia since they’re so easy and cheap to create. But in a single-purpose Dockerfile, the simplicity of doing everything in the global environment takes precedence. So I’d suggest replacing

        # do the julia things we want
        #RUN julia -e 'using Pkg; Pkg.activate("."); Pkg.instantiate()'
        RUN julia -e 'import Pkg; Pkg.activate("."); Pkg.add("CSV"); Pkg.add("DataFrames"); Pkg.add(Pkg.PackageSpec(name="NCBITaxonomy", rev="main"))'
        #RUN julia --project -e 'import Pkg; Pkg.activate("."); Pkg.add(PackageSpec(name="NCBITaxonomy", rev="main"))'
        #RUN set -eux; 
        #    mkdir "$JULIA_USER_HOME";
        
        RUN julia -e 'using Pkg; Pkg.instantiate();'
    

    with just

        # do the julia things we want
    
        RUN julia -e 'using Pkg; Pkg.add("CSV"); Pkg.add("DataFrames"); Pkg.add(Pkg.PackageSpec(name="NCBITaxonomy", rev="main"))'
    
    