How do I solve cuDNN error CUDNN_STATUS_NOT_INITIALIZED on WSL2 using Torch on Ubuntu?

Amadou
May 15, 2023
133 views
0 votes
2 Answers

I am trying to test deep neural networks with torch 2.0.0+cu117.

I use WSL2 Ubuntu on Windows 10.
I have an NVIDIA T1200 Laptop GPU and have installed a compatible driver version on Windows.
I installed WSL as administrator using wsl --install and wsl --update, along with some required libraries.
On WSL2, when I run nvidia-smi, I get this:
(screenshot: NVIDIA driver info)
I installed CUDA Toolkit version 12.1 from the local repo (.deb) and set export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}} and
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
I also installed cuDNN version 8.9.1.23.
Then I installed Anaconda (Anaconda3-2020.02-Linux-x86_64) and created a conda environment with conda create -n my_env python=3.8 cython. I installed torch version 2.0.0+cu117 and the other required Python libraries.
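
With a setup like this, it can help to check which CUDA and cuDNN builds PyTorch itself reports, since a +cu117 build typically brings its own CUDA 11.7 runtime and cuDNN and these may differ from the system-wide toolkit 12.1 and cuDNN 8.9.1. The following is only a sketch: collect_torch_env is a hypothetical helper name, and it reads standard torch attributes without changing anything.

```python
import importlib


def collect_torch_env():
    """Return a dict describing the CUDA/cuDNN stack that PyTorch reports.

    collect_torch_env is an illustrative helper, not part of PyTorch.
    """
    report = {"torch_installed": False}
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # PyTorch is missing or its shared libraries failed to load.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version the installed build was compiled against (None for CPU builds).
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    try:
        report["cudnn_available"] = torch.backends.cudnn.is_available()
        report["cudnn_version"] = torch.backends.cudnn.version()
    except Exception as exc:
        # Record the failure instead of crashing; the message is a useful clue.
        report["cudnn_error"] = str(exc)
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in collect_torch_env().items():
        print(f"{key}: {value}")
```

If built_with_cuda shows 11.7 while LD_LIBRARY_PATH points at /usr/local/cuda-12.1/lib64, that difference is one thing worth ruling out, although it is not necessarily the cause of the error.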

When I run my script in this environment, I get this error message:

Traceback (most recent call last):
  File "test.py", line 169, in <module>
    tester.infer()
  File "test.py", line 103, in infer
    pred = self.model(pts.cuda(), None)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/c/Users/9822223E/Stage3A/WSL_ressources/model_init/models/networks.py", line 47, in forward
    _, y0 = self.cv_in(x_in, x_in, y_in, 16)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/c/Users/9822223E/Stage3A/WSL_ressources/model_init/models/kernels.py", line 253, in forward
    y_out = self.conv(y_out.transpose(2,1)).view(batch_size, n_pts_out, self.out_features)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

I have tried reinstalling WSL and cuDNN and recreating the conda environment, but the problem persists.
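
To separate the model code from the GPU stack, a single small convolution on the GPU may be enough to reproduce the failure, since the traceback ends in F.conv1d. This is a minimal sketch under that assumption; cudnn_smoke_test is a hypothetical helper name and the layer sizes are arbitrary.

```python
import importlib


def cudnn_smoke_test():
    """Run one small conv1d on the GPU and return (ok, message).

    cudnn_smoke_test is an illustrative helper, not part of PyTorch.
    """
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError) as exc:
        return False, f"PyTorch could not be imported: {exc}"

    if not torch.cuda.is_available():
        return False, "CUDA is not available to PyTorch"

    try:
        # Arbitrary small layer and input; a GPU convolution normally goes
        # through cuDNN when it is enabled.
        conv = torch.nn.Conv1d(4, 8, kernel_size=3).cuda()
        x = torch.randn(2, 4, 16, device="cuda")
        y = conv(x)
        # Wait for the kernel so that asynchronous errors surface here.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        return False, f"GPU convolution failed: {exc}"
    return True, f"conv1d OK, output shape {tuple(y.shape)}"


if __name__ == "__main__":
    ok, message = cudnn_smoke_test()
    print("PASS" if ok else "FAIL", "-", message)
```

If this fails with the same CUDNN_STATUS_NOT_INITIALIZED message, the problem is likely in the driver or library setup rather than in test.py.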

Answers

Chosen as BEST ANSWER
- Amadou
- May 11, 2023 at 9:40 am
- 0 votes
I solved the problem by installing a new NVIDIA driver, the Cudatoolkit and cuDNN on Windows.
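
Under WSL2 the GPU driver is provided by the Windows host, so after updating it on Windows the version visible inside Ubuntu can be confirmed with nvidia-smi. The sketch below wraps that query in Python; query_nvidia_driver is a hypothetical helper name, and it returns an empty list when nvidia-smi is missing or fails.

```python
import shutil
import subprocess


def query_nvidia_driver():
    """Return lines like 'GPU name, driver version', or [] if unavailable.

    query_nvidia_driver is an illustrative helper name.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # nvidia-smi is not on PATH in this environment
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    rows = query_nvidia_driver()
    if rows:
        for row in rows:
            print(row)
    else:
        print("nvidia-smi not found or returned an error")
```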

Could you please share more details? I have a similar case; this is my driver screenshot after running nvidia-smi.

Error output:

$ python3 demo.py --model human-pose-estimation3d.pth --video /mnt/c/Users/Damian/My Documents/OneDrive - Aalborg Universitet/2_MASTER_VGIS/10_Semester_Master_thesis/ThesisVideos/2023-02-22-people/20230222_100905.mp4 -d GPU
#### Cannot load fast pose extraction, switched to legacy slow implementation. ####
Traceback (most recent call last):
  File "/mnt/e/Documents/GitHub/thesis/lightweight-human-pose-estimation-3d-demo.pytorch/demo.py", line 98, in <module>
    inference_result = net.infer(scaled_img)
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/e/Documents/GitHub/thesis/lightweight-human-pose-estimation-3d-demo.pytorch/modules/inference_engine_pytorch.py", line 37, in infer
    features, heatmaps, pafs = self.net(data)
                               ^^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/e/Documents/GitHub/thesis/lightweight-human-pose-estimation-3d-demo.pytorch/models/with_mobilenet.py", line 180, in forward
    model_features = self.model(x)
                     ^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
            ^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
            ^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
