I’ve created a Docker image using the following Dockerfile:
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /usr/local/src
# Setting up general environment
RUN apt-get -y update \
    && apt-get install -y build-essential \
    && apt-get install -y wget \
    && apt-get install -y hmmer \
    && apt-get install -y git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
## Installing miniconda
ENV CONDA_DIR /opt/conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
    /bin/bash ~/miniconda.sh -b -p /opt/conda
ENV PATH=$CONDA_DIR/bin:$PATH
# Installing NLRexpress
RUN git clone https://github.com/eliza-m/NLRexpress
WORKDIR /usr/local/src/NLRexpress
# Setting up the conda environment and required variables
RUN conda env create -f environment.yml && \
    conda init bash && \
    echo "conda activate nlrexpress" >> ~/.bashrc
ENV PATH /opt/conda/envs/nlrexpress/bin:/usr/local/src/NLRexpress:$PATH
ENV CONDA_DEFAULT_ENV nlrexpress
RUN wget https://nlrexpress.biochim.ro/datasets/models.tar.gz && \
    tar -xf models.tar.gz && \
    rm models.tar.gz
RUN echo "#!/bin/bash \n python nlrexpress.py" > nlrexpress && \
    chmod +x nlrexpress
I avoided using the CMD instruction and made an executable nlrexpress because I want to use this image for hmmsearch too.
The image builds fine, and when I tested docker run nlrexpress:latest nlrexpress I got the expected output:
Usage: nlrexpress.py [OPTIONS]
Try 'nlrexpress.py --help' for help.
Error: Missing option '--input'.
However, when I use the container with nextflow I get the following error: python: can't open file '/path/to/workDir/ea/72bd9e660d0ce79944d8bdde3dd024/nlrexpress.py': [Errno 2] No such file or directory
Here is the nextflow process:
process NLRexpress {
    tag "$sample_id"
    publishDir params.PlantDir
    maxForks 1
    container = 'dthorbur1990/nlrexpress:latest'
    executor = "local"

    input:
    tuple val(sample_id), path(peptides)

    output:
    path "*.short.output.txt"

    script:
    """
    mkdir output
    nlrexpress \
        --input ../${peptides} \
        --outdir ./output \
        --module ${params.NE_Modules}
    mv output/*.short.output.txt ./
    """
}
How can I ensure files in the container's WORKDIR are available when mounting a container? I’ve tried setting ENV variables, but this doesn’t seem to work either. I thought that because WORKDIR is set, the image would always mount to the WORKDIR path and all the files would be available.
I’ve found I can just clone the repo into the Nextflow working directory, but this isn’t an ideal workaround as I would also have to download the models for each process. The same issue goes for the models directory I downloaded into the container.
Edit: Just adding that hmmsearch works absolutely fine with Nextflow and the container.
2 Answers
I'm still getting to grips with how Docker works, but I found a solution in case anyone else has the same problem.
First I tried giving the executable nlrexpress the full path to the Python script, but this ended up executing the command and ignoring the input parameters that followed.
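Both symptoms can be reproduced without Docker. The sketch below is only an illustration under stated assumptions: `tool.sh` is a hypothetical stand-in for nlrexpress.py, the `app` directory stands in for the image's WORKDIR, and `task` stands in for a Nextflow task directory, which (as the error message suggests) becomes the working directory when the process runs. It compares a wrapper that uses a relative path, one that uses an absolute path but does not forward its arguments, and one that uses an absolute path together with `"$@"`.

```shell
#!/bin/sh
# All names below (app, task, tool.sh, wrap_*) are hypothetical stand-ins.
root=$(mktemp -d)
mkdir -p "$root/app" "$root/task"

# Stand-in for nlrexpress.py: it only reports the arguments it received.
cat > "$root/app/tool.sh" <<'EOF'
#!/bin/sh
echo "args: $*"
EOF
chmod +x "$root/app/tool.sh"

# Wrapper 1: relative path, comparable to `python nlrexpress.py`.
printf '#!/bin/sh\nsh tool.sh\n' > "$root/app/wrap_relative"

# Wrapper 2: absolute path, but the wrapper's own arguments are dropped.
printf '#!/bin/sh\nsh "%s/app/tool.sh"\n' "$root" > "$root/app/wrap_noargs"

# Wrapper 3: absolute path plus "$@", which forwards every argument unchanged.
printf '#!/bin/sh\nexec sh "%s/app/tool.sh" "$@"\n' "$root" > "$root/app/wrap_forward"
chmod +x "$root/app"/wrap_*

# Run from a different working directory, as a Nextflow task would.
cd "$root/task"
out_relative=$("$root/app/wrap_relative" --input x.fa 2>/dev/null) || relative_failed=yes
out_noargs=$("$root/app/wrap_noargs" --input x.fa)
out_forward=$("$root/app/wrap_forward" --input x.fa)

echo "relative path failed: ${relative_failed:-no}"
echo "without forwarding:   $out_noargs"
echo "with forwarding:      $out_forward"
```

Applied to the Dockerfile in the question, this would presumably mean a wrapper that calls python with the absolute path /usr/local/src/NLRexpress/nlrexpress.py and ends with `"$@"`, so that options such as --input reach the script.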
Instead, I indicated which interpreter was needed via the shebang, so the Python script itself could be executed without needing to write python script.py.

Creating your own wrapper script, like in your first example, is generally considered a cleaner and more flexible solution. Here’s one way using the continuumio/miniconda3 image: