I want to use a GPU inside a Visual Studio Code Docker container to train a model with TensorFlow. To build the image for my container I use the following Dockerfile:
FROM mcr.microsoft.com/vscode/devcontainers/anaconda:0-3
ARG PROJECT_NAME=fire_rec
ARG NODE_VERSION="none"
RUN if [ "${NODE_VERSION}" != "none" ]; then su vscode -c "umask 0002 && . /usr/local/share/nvm/nvm.sh && nvm install ${NODE_VERSION} 2>&1"; fi
COPY environment.yml* .devcontainer/noop.txt /tmp/conda-tmp/
RUN if [ -f "/tmp/conda-tmp/environment.yml" ]; then umask 0002 && /opt/conda/bin/conda env update -n base -f /tmp/conda-tmp/environment.yml; fi \
    && rm -rf /tmp/conda-tmp
WORKDIR /srv/${PROJECT_NAME}
COPY requirements.txt /srv/${PROJECT_NAME}
RUN apt-get update && apt-get install -y python3-opencv
RUN apt-get update && apt-get install -y pip
RUN python3 -m pip install --no-cache -r requirements.txt
RUN apt-get update && apt-get install -y nvidia-cuda-toolkit
"requirements.txt" consists of:
opencv-python
tensorflow-gpu
numpy
matplotlib
albumentations
tensorflow_addons
I also have a .devcontainer.json file:
{
    "name": "Anaconda (Python 3)",
    "build": {
        "context": "..",
        "dockerfile": "Dockerfile",
        "args": {
            "NODE_VERSION": "none"
        }
    },
    "settings": {
        "python.defaultInterpreterPath": "/opt/conda/bin/python",
        "python.linting.enabled": true,
        "python.linting.pylintEnabled": true,
        "python.formatting.autopep8Path": "/opt/conda/bin/autopep8",
        "python.formatting.yapfPath": "/opt/conda/bin/yapf",
        "python.linting.flake8Path": "/opt/conda/bin/flake8",
        "python.linting.pycodestylePath": "/opt/conda/bin/pycodestyle",
        "python.linting.pydocstylePath": "/opt/conda/bin/pydocstyle",
        "python.linting.pylintPath": "/opt/conda/bin/pylint"
    },
    "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance"
    ],
    "remoteUser": "vscode",
}
I successfully built the image and launched the container. But when I run this code in a Jupyter notebook inside the container:
import tensorflow as tf
tf.config.list_physical_devices('GPU')
I get the following messages:
2022-05-05 14:42:02.712454: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-05-05 14:42:02.712483: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
So this code fails to use the GPU. How can I fix this problem?
4 Answers
Prerequisites:
The machine has a GPU, and the GPU driver is installed;
The GPU environment (CUDA, etc.) is installed;
Persistence mode (the PM attribute) is turned on in nvidia-smi;
The GPU device is specified in the program.
Then run the Python program in the terminal with the command:
CUDA_VISIBLE_DEVICES=0 python filename.py
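Some of these prerequisites can be checked from inside the container before TensorFlow is involved at all. The following is only a minimal sketch, not an official diagnostic: the function name `gpu_visibility_report` is made up for illustration, and it assumes that a working setup exposes device nodes matching /dev/nvidia* (the log in the question complains that /dev/nvidia0 does not exist) and has nvidia-smi on the PATH.

```python
import glob
import shutil


def gpu_visibility_report(device_glob="/dev/nvidia*"):
    """Collect basic facts about NVIDIA GPU visibility (hypothetical helper)."""
    return {
        # Device nodes such as /dev/nvidia0 are expected when the driver is
        # loaded and the container was given access to the GPU.
        "device_nodes": sorted(glob.glob(device_glob)),
        # Full path to nvidia-smi if it is on PATH, otherwise None.
        "nvidia_smi": shutil.which("nvidia-smi"),
    }


if __name__ == "__main__":
    report = gpu_visibility_report()
    print("device nodes:", report["device_nodes"] or "none")
    print("nvidia-smi:", report["nvidia_smi"] or "not found")
    if not report["device_nodes"]:
        print("No /dev/nvidia* nodes found, so TensorFlow cannot see a GPU here.")
```

If the list of device nodes is empty inside the container but not on the host, the container was most likely started without GPU access, and setting CUDA_VISIBLE_DEVICES alone will not help.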
Make sure you have the NVIDIA Container Toolkit installed, then add this to your .devcontainer.json:
Check this to see how you can add more options to your .devcontainer.json.
I have the same problem. I tried many ways of specifying "gpus" in "runArgs" (by ID, by the exact name of the GPU) and none of them works. On the other hand, when I run the container by hand, everything works. To me it looks like a bug in VS Code :/
Just in case, I opened an issue on GitHub: https://github.com/microsoft/vscode-remote-release/issues/6989
Please change the code in your devcontainer.json.