I am trying to test deep neural networks with torch==2.0.0+cu117.
- I use WSL2 Ubuntu on Windows 10.
- I have an NVIDIA T1200 Laptop GPU and have installed a compatible driver version on Windows.
- I installed WSL as administrator using wsl --install and wsl --update, plus some required libraries.
- In WSL2, when I run nvidia-smi, I get this: (screenshot: NVIDIA driver info)
- I installed CUDA Toolkit 12.1 from the local repo (.deb) and set:
  export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}}
  export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
- I also installed cuDNN 8.9.1.23.
- Then I installed Anaconda (Anaconda3-2020.02-Linux-x86_64) and created a conda environment with conda create -n my_env python=3.8 cython. I installed torch 2.0.0+cu117 and the other required Python libraries.
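The two export lines from the setup can be sanity-checked from Python. This is only a sketch: the helper name check_cuda_env is made up for illustration, and the directories are the ones quoted in the question (/usr/local/cuda-12.1/bin and /usr/local/cuda-12.1/lib64).

```python
import os
import shutil


def check_cuda_env(env, cuda_root="/usr/local/cuda-12.1"):
    """Report whether the CUDA directories from the question appear in env.

    check_cuda_env is a hypothetical helper, not part of any library.
    env is a plain dict such as dict(os.environ).
    """
    bin_dir = cuda_root + "/bin"
    lib_dir = cuda_root + "/lib64"
    # Linux (and therefore WSL2) separates search-path entries with ":".
    path_entries = [p for p in env.get("PATH", "").split(":") if p]
    lib_entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    return {
        "bin_on_path": bin_dir in path_entries,
        "lib64_on_ld_library_path": lib_dir in lib_entries,
    }


if __name__ == "__main__":
    report = check_cuda_env(dict(os.environ))
    # shutil.which returns None when nvcc cannot be found on PATH.
    report["nvcc"] = shutil.which("nvcc")
    for key, value in report.items():
        print(f"{key}: {value}")
```

If either entry is False in a new shell, the exports were probably not persisted (for example in ~/.bashrc) and only applied to the shell where they were typed.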
When I run my script in this environment, I get this error message:
Traceback (most recent call last):
  File "test.py", line 169, in <module>
    tester.infer()
  File "test.py", line 103, in infer
    pred = self.model(pts.cuda(), None)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/c/Users/9822223E/Stage3A/WSL_ressources/model_init/models/networks.py", line 47, in forward
    _, y0 = self.cv_in(x_in, x_in, y_in, 16)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/c/Users/9822223E/Stage3A/WSL_ressources/model_init/models/kernels.py", line 253, in forward
    y_out = self.conv(y_out.transpose(2,1)).view(batch_size, n_pts_out, self.out_features)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/amadou/anaconda3/envs/cyconvlite/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
I have tried reinstalling WSL and cuDNN and recreating the conda environment, but the problem persists.
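One way to narrow this down is to check which CUDA and cuDNN versions PyTorch itself reports and to run a tiny conv1d on the GPU outside the model. A minimal sketch, assuming torch is importable in the active environment; the function name cudnn_smoke_test and the tensor shapes are arbitrary choices for illustration. Prebuilt wheels such as 2.0.0+cu117 generally ship their own CUDA runtime and cuDNN, so the reported versions may differ from the system-wide 12.1 / 8.9.1.23 install.

```python
def cudnn_smoke_test():
    """Collect version info and try a tiny conv1d on the GPU.

    cudnn_smoke_test is a hypothetical helper name used for illustration.
    """
    info = {}
    try:
        import torch
    except ImportError:
        info["torch"] = None  # torch is not installed in this environment
        return info

    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda          # CUDA version torch was built with
    info["cudnn"] = torch.backends.cudnn.version()   # None if cuDNN is unavailable
    info["cuda_available"] = torch.cuda.is_available()

    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
        try:
            x = torch.randn(1, 4, 16, device="cuda")   # (batch, channels, length)
            w = torch.randn(8, 4, 3, device="cuda")    # (out_ch, in_ch, kernel)
            y = torch.nn.functional.conv1d(x, w)
            torch.cuda.synchronize()                   # surface asynchronous CUDA errors here
            info["conv1d_ok"] = tuple(y.shape) == (1, 8, 14)
        except RuntimeError as exc:
            info["conv1d_ok"] = False
            info["error"] = str(exc)
    return info


if __name__ == "__main__":
    for key, value in cudnn_smoke_test().items():
        print(f"{key}: {value}")
```

If this small convolution fails with the same CUDNN_STATUS_NOT_INITIALIZED, the cause is more likely in the driver, CUDA, or cuDNN setup than in the model code in networks.py or kernels.py.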
2 Answers
I solved the problem by installing a new NVIDIA driver, the CUDA Toolkit, and cuDNN on Windows.
Could you please share more details? I have a similar case; this is my driver screenshot. After running nvidia-smi, error output: