
I have a Python script as shown below:

import torch
from torch.multiprocessing import set_start_method, Pipe, Process


def func(conn):
    data = conn.recv()
    print(data)


if __name__ == "__main__":
    set_start_method('spawn')
    a, b = Pipe()
    data = torch.tensor([1, 2, 3], device='cuda')
    proc = Process(target=func, args=(data,))
    proc.start()
    b.send(data)
    proc.join()

I run this script on WSL2, but it fails with:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 121, in rebuild_cuda_tensor
    storage = storage_cls._new_shared_cuda(
  File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/storage.py", line 807, in _new_shared_cuda
    return torch.UntypedStorage._new_shared_cuda(*args, **kwargs)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)

My environment is:

  • OS: WSL2 Ubuntu 22.04
  • CUDA: 11.7
  • Python: 3.8
  • PyTorch: 1.13.0+cu117

Any idea on how to solve this issue?

Thanks.

I’ve run the same script on native Ubuntu 22.04 (without WSL2), and it works fine there.

2 Answers


  1. Try moving the data to the GPU inside your function instead of creating it on the GPU directly. That worked for me.

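    A minimal sketch of this suggestion, adapted from the script in the question: keep the tensor on the CPU when sending it through the pipe, and move it to the GPU inside the child process, so no CUDA storage has to be shared across processes. Note this sketch also passes the pipe end `a` to the child (the original snippet passed the tensor itself as the process argument), and it guards the device choice with `torch.cuda.is_available()` so it also runs on CPU-only machines.

    ```python
    import torch
    from torch.multiprocessing import Pipe, Process, set_start_method


    def func(conn):
        # Receive a CPU tensor over the pipe, then move it to the GPU
        # inside the child process (avoids CUDA IPC during pickling).
        data = conn.recv()
        device = 'cuda' if torch.cuda.is_available() else 'cpu'
        data = data.to(device)
        print(data)


    if __name__ == "__main__":
        set_start_method('spawn')
        a, b = Pipe()
        data = torch.tensor([1, 2, 3])          # create on CPU, not 'cuda'
        proc = Process(target=func, args=(a,))  # pass the pipe end, not the tensor
        proc.start()
        b.send(data)                            # a CPU tensor pickles without CUDA IPC
        proc.join()
    ```

    Sending the CPU tensor sidesteps `rebuild_cuda_tensor`/`_new_shared_cuda` entirely, which is the code path that raises the `CUDA error: unknown error` on WSL2.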
  2. This worked for me:

    import warnings
    warnings.filterwarnings('ignore')
    