I have a Python script as shown below:
```python
import torch
from torch.multiprocessing import set_start_method, Pipe, Process

def func(conn):
    data = conn.recv()
    print(data)

if __name__ == "__main__":
    set_start_method('spawn')
    a, b = Pipe()
    data = torch.tensor([1, 2, 3], device='cuda')
    proc = Process(target=func, args=(data,))
    proc.start()
    b.send(data)
    proc.join()
```
When I run this script on WSL2, it fails with:
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 121, in rebuild_cuda_tensor
    storage = storage_cls._new_shared_cuda(
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/storage.py", line 807, in _new_shared_cuda
    return torch.UntypedStorage._new_shared_cuda(*args, **kwargs)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
```
My environment is:
- OS: WSL2 Ubuntu 22.04
- CUDA: 11.7
- Python: 3.8
- PyTorch: 1.13.0+cu117
I've also run this script on native Ubuntu 22.04 (without WSL2), and it works fine there.
Any idea how to solve this issue?
Thanks.
2 Answers
Try moving the data to the GPU inside your function instead of creating it on the GPU directly in the parent process. That worked for me.
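A minimal sketch of what that change could look like, adapted from the script in the question (passing the pipe end `a` to the child and the `.to('cuda')` call inside `func` are my adaptation, not necessarily the answerer's exact code):

```python
# Sketch: create the tensor on the CPU in the parent, send it through the
# pipe, and move it to the GPU inside the child process.
import torch
from torch.multiprocessing import set_start_method, Pipe, Process

def func(conn):
    data = conn.recv()        # receives a plain CPU tensor
    data = data.to('cuda')    # move it to the GPU inside the child
    print(data)

if __name__ == "__main__":
    set_start_method('spawn')
    a, b = Pipe()
    data = torch.tensor([1, 2, 3])            # create on the CPU, not device='cuda'
    proc = Process(target=func, args=(a,))    # pass the pipe end, not the tensor
    proc.start()
    b.send(data)                              # only CPU memory crosses the process boundary
    proc.join()
```

Since the tensor now crosses the process boundary as ordinary CPU memory, no shared CUDA storage has to be set up between the two processes.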
This worked for me: