I installed nvidia-docker2 following the instructions here. When I run the following command, I get the expected output, as shown.

sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi


+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:0B:00.0  On |                  N/A |
| 24%   31C    P8    13W / 250W |    222MiB / 11011MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                           
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

However, running the above command without "sudo" results in the following error for me:

$ docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
docker: Error response from daemon: failed to create shim task: OCI runtime create 
failed: runc create failed: unable to start container process: error during container 
init: error running hook #0: error running hook: exit status 1, stdout: , stderr: 
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: 
cannot open shared object file: no such file or directory: unknown.

Can anyone please help me with how I can solve this problem?

2 Answers


  1. Add your user to the docker group:

    sudo usermod -aG docker your_user
    

    Update:

    See https://github.com/NVIDIA/nvidia-docker/issues/539

    Something in the comments there may help you.

  2. Try adding "sudo" to your docker command,
    e.g. sudo docker-compose …
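
To check whether the group change from answer 1 has taken effect, something like the following sketch may help. It assumes a POSIX shell with `id`, `tr`, and `grep` available; the helper name `in_group` is hypothetical and not part of Docker or the NVIDIA tooling. Note that group membership only governs access to the Docker socket, so it may not by itself resolve the `libnvidia-ml.so.1` error from the question.

```shell
#!/bin/sh
# in_group GROUP [USER]  (hypothetical helper, for illustration only)
# With USER, id reads the system group database; without it, id reports the
# groups of the current process, i.e. what this login session actually has.
in_group() {
    group="$1"
    if [ -n "${2:-}" ]; then
        names=$(id -nG "$2" 2>/dev/null)
    else
        names=$(id -nG)
    fi
    # One group name per line, then an exact fixed-string match.
    printf '%s\n' "$names" | tr ' ' '\n' | grep -qxF -- "$group"
}

me=$(id -un)
if in_group docker; then
    echo "this session already has the docker group"
elif in_group docker "$me"; then
    echo "$me is in the docker group, but this session predates the change"
    echo "log out and back in, or run: newgrp docker"
else
    echo "$me is not in the docker group; try: sudo usermod -aG docker $me"
fi
```

The distinction between the two checks matters because `usermod -aG` updates the group database immediately, while an already running login session keeps its old group list until you log in again.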
