
I am trying to test deep neural networks with PyTorch (torch==2.0.0+cu117).

  • I use WSL2 Ubuntu on Windows 10

  • I have an NVIDIA T1200 Laptop GPU and have installed a compatible driver on Windows

  • I installed WSL as administrator using wsl --install and wsl --update, along with some required libraries

  • On WSL2, when I run nvidia-smi, I get this:
    [image: NVIDIA driver info]

  • I installed CUDA Toolkit 12.1 from the local repo (.deb) and set export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}} and
    export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

  • I also installed cuDNN version 8.9.1.23

  • Then I installed Anaconda (Anaconda3-2020.02-Linux-x86_64) and created a conda environment with conda create -n my_env python=3.8 cython. I installed torch 2.0.0+cu117 and the other required Python libraries.
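
One thing that may be worth checking in a setup like the one listed above: a wheel tagged +cu117 is built against CUDA 11.7 and normally brings its own CUDA runtime and cuDNN libraries, so the versions PyTorch actually loads can differ from the system-wide toolkit and cuDNN. The sketch below only prints what PyTorch reports about itself; describe_torch_build is a hypothetical helper name, not part of the original post.

```python
def describe_torch_build():
    """Collect the version details PyTorch reports about its own build."""
    try:
        import torch
    except ImportError:
        # Nothing more to report if torch is missing from this environment.
        return {"torch_installed": False}

    info = {"torch_installed": True}
    info["torch_version"] = torch.__version__      # e.g. "2.0.0+cu117"
    info["built_for_cuda"] = torch.version.cuda    # None on CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    info["cudnn_available"] = torch.backends.cudnn.is_available()
    # version() returns an integer such as 8500, or None without cuDNN.
    info["cudnn_version"] = (
        torch.backends.cudnn.version() if info["cudnn_available"] else None
    )
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_for_cuda and cudnn_version do not match what was installed system-wide, that by itself is not necessarily an error, but it shows which libraries the failing call is really using.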

When I run my script in this environment, I get this error message:

Traceback (most recent call last):
  File "test.py", line 169, in <module>
    tester.infer()
  File "test.py", line 103, in infer
    pred = self.model(pts.cuda(), None)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/c/Users/9822223E/Stage3A/WSL_ressources/model_init/models/networks.py", line 47, in forward
    _, y0 = self.cv_in(x_in, x_in, y_in, 16)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/c/Users/9822223E/Stage3A/WSL_ressources/model_init/models/kernels.py", line 253, in forward
    y_out = self.conv(y_out.transpose(2,1)).view(batch_size, n_pts_out, self.out_features)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

I have tried reinstalling WSL and cuDNN and recreating the conda environment, but the problem persists.
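
To separate the model code from the environment, the failing call can be reduced to a single small convolution on the GPU. This is a minimal sketch, not the original test.py: try_conv1d and the tensor sizes are made up for illustration. Running it once with cuDNN enabled and once with it disabled shows whether the failure is specific to cuDNN.

```python
def try_conv1d(use_cudnn=True):
    """Run one tiny Conv1d on the GPU and report what happened as a string."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "CUDA not available"

    try:
        conv = torch.nn.Conv1d(in_channels=4, out_channels=8, kernel_size=3).cuda()
        x = torch.randn(2, 4, 16, device="cuda")
        # Temporarily enable or disable cuDNN for this one call.
        with torch.backends.cudnn.flags(enabled=use_cudnn):
            y = conv(x)
        # Wait for the GPU so asynchronous errors surface here.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        return f"failed: {exc}"
    return f"ok: output shape {tuple(y.shape)}"


if __name__ == "__main__":
    print("with cuDNN:   ", try_conv1d(use_cudnn=True))
    print("without cuDNN:", try_conv1d(use_cudnn=False))
```

If the call succeeds only without cuDNN, the cuDNN libraries PyTorch loads (or the driver underneath them) are the likely place to look rather than the network definition.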

2 Answers


  1. Chosen as BEST ANSWER

    I solved the problem by installing a new NVIDIA driver, the CUDA Toolkit, and cuDNN on Windows.


  2. "I solved the problem by installing a new NVIDIA driver, the Cudatoolkit and cuDNN on Windows."

    Could you please share more details? I have a similar case. This is my driver screenshot after running nvidia-smi: [image: nvidia-smi output]

    Error output:

    $ python3 demo.py --model human-pose-estimation3d.pth --video /mnt/c/Users/Damian/My Documents/OneDrive - Aalborg Universitet/2_MASTER_VGIS/10_Semester_Master_thesis/ThesisVideos/2023-02-22-people/20230222_100905.mp4 -d GPU
    #### Cannot load fast pose extraction, switched to legacy slow implementation. ####
    Traceback (most recent call last):
      File "/mnt/e/Documents/GitHub/thesis/lightweight-human-pose-estimation-3d-demo.pytorch/demo.py", line 98, in <module>
        inference_result = net.infer(scaled_img)
                           ^^^^^^^^^^^^^^^^^^^^^
      File "/mnt/e/Documents/GitHub/thesis/lightweight-human-pose-estimation-3d-demo.pytorch/modules/inference_engine_pytorch.py", line 37, in infer
        features, heatmaps, pafs = self.net(data)
                                   ^^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/mnt/e/Documents/GitHub/thesis/lightweight-human-pose-estimation-3d-demo.pytorch/models/with_mobilenet.py", line 180, in forward
        model_features = self.model(x)
                         ^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/container.py", line 217, in forward
        input = module(input)
                ^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/container.py", line 217, in forward
        input = module(input)
                ^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 463, in forward
        return self._conv_forward(input, self.weight, self.bias)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/damian/anaconda3/envs/lightweight-human-pose-estimation-3d-demo.pytorch/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
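
Since the fix reported in the first answer involved the Windows-side driver, it may help to record which driver version WSL actually sees before and after any update. A minimal sketch, assuming nvidia-smi is on the PATH inside WSL; query_driver is a hypothetical helper name.

```python
import shutil
import subprocess


def query_driver():
    """Return a list of (gpu_name, driver_version) tuples, or None on failure."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # nvidia-smi is not visible from this shell
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None  # the tool exists but could not talk to the driver

    rows = []
    for line in result.stdout.splitlines():
        if line.strip():
            rows.append(tuple(part.strip() for part in line.split(",")))
    return rows


if __name__ == "__main__":
    print(query_driver())
```

Posting this output alongside the traceback makes it easier to compare two setups that fail with the same CUDNN_STATUS_NOT_INITIALIZED error.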