
I have a Python script as shown below:

import torch
from torch.multiprocessing import set_start_method, Pipe, Process


def func(conn):
    data = conn.recv()
    print(data)


if __name__ == "__main__":
    set_start_method('spawn')
    a, b = Pipe()
    data = torch.tensor([1, 2, 3], device='cuda')
    proc = Process(target=func, args=(data,))
    proc.start()
    b.send(data)
    proc.join()

I run this script on WSL2, but it fails with:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 121, in rebuild_cuda_tensor
    storage = storage_cls._new_shared_cuda(
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/storage.py", line 807, in _new_shared_cuda
    return torch.UntypedStorage._new_shared_cuda(*args, **kwargs)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)

My environment is:

  • OS: WSL2 Ubuntu 22.04
  • CUDA: 11.7
  • Python: 3.8
  • PyTorch: 1.13.0+cu117

Any idea on how to solve this issue?

Thanks.

I’ve run the same script on native Ubuntu 22.04 (without WSL2), and it works fine there.

2 Answers


  1. Try moving the data to the GPU inside your function instead of creating it on the GPU directly. That worked for me.

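    A minimal sketch of this suggestion, adapted from the script in the question: keep the tensor on the CPU when sending it through the pipe, and move it to the GPU inside the child process, so no CUDA storage has to be shared across processes. Note this sketch also passes the pipe end `a` to the child (the original snippet passed the tensor itself as the process argument), and it guards the device choice with `torch.cuda.is_available()` so it also runs on CPU-only machines.

    ```python
    import torch
    from torch.multiprocessing import Pipe, Process, set_start_method


    def func(conn):
        # Receive a CPU tensor over the pipe, then move it to the GPU
        # inside the child process (avoids CUDA IPC during pickling).
        data = conn.recv()
        device = 'cuda' if torch.cuda.is_available() else 'cpu'
        data = data.to(device)
        print(data)


    if __name__ == "__main__":
        set_start_method('spawn')
        a, b = Pipe()
        data = torch.tensor([1, 2, 3])          # create on CPU, not 'cuda'
        proc = Process(target=func, args=(a,))  # pass the pipe end, not the tensor
        proc.start()
        b.send(data)                            # a CPU tensor pickles without CUDA IPC
        proc.join()
    ```

    Sending the CPU tensor sidesteps `rebuild_cuda_tensor`/`_new_shared_cuda` entirely, which is the code path that raises the `CUDA error: unknown error` on WSL2.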
  2. This worked for me:

    import warnings
    warnings.filterwarnings('ignore')
    