
I am unable to use the GPU in Docker after updating the GPU driver.
When I run nvidia-smi on the host (CentOS), the GPU is recognized.

docker run --gpus all -it -v $(pwd):/home/workspace test /bin/bash

ERRO[0000] error waiting for container: context canceled 
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

From what I have read, updating the GPU driver seems to remove the container settings, so I followed the installation procedure on the NVIDIA Container Toolkit site again, but the error above persists. I also rebooted the system to be sure, but that did not solve the problem.
What should I do?

docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
This produced the same error.

2 Answers


  1. Assuming you followed the installation procedure correctly:

    Next, follow this tutorial on using an NVIDIA GPU with Docker:
    https://www.howtogeek.com/devops/how-to-use-an-nvidia-gpu-with-docker-containers/

    Use the NVIDIA base image instead of your own image, test:

    docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
    

    If the base image runs correctly but your own image does not, the problem is in your test image and it needs further debugging.

  2. What made the difference for me was following Docker's instructions on how to access an NVIDIA GPU:

    1. Follow the instructions to add the nvidia-container-runtime repository to apt
    2. Run: apt-get install nvidia-container-runtime
    3. Ensure that nvidia-container-runtime-hook is accessible from $PATH by running: which nvidia-container-runtime-hook
    4. Reboot
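
The check in step 3 can be sketched as a small shell helper. This is only an illustration, not part of Docker's instructions: the function name check_on_path is hypothetical, and the commented commands assume a systemd-managed Docker daemon and, for the CentOS host in the question, yum in place of apt-get with the NVIDIA repository already configured.

```shell
#!/bin/sh
# Hypothetical helper: report whether a binary is reachable via $PATH.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Step 3: the hook binary Docker looks up when --gpus is used.
check_on_path nvidia-container-runtime-hook ||
    echo "install nvidia-container-runtime first, then restart Docker or reboot"

# Assumed follow-up commands, left commented out so the sketch has no side effects:
#   sudo yum install nvidia-container-runtime   # CentOS; apt-get on Debian/Ubuntu
#   sudo systemctl restart docker               # lighter alternative to a full reboot
#   docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
```

If the helper reports the hook as missing, the package from step 2 is probably not installed or is not on the PATH.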