Ubuntu – PyTorch CPU OOM kills SSH server on Linux
I've run into a problem where PyTorch (tested with 2.0.1+cu117) does not fail gracefully when a CPU out-of-memory (OOM) condition occurs. Specifically, I lose all SSH connections and X server access to the VM or bare-metal machine. I've not tested if this occurs…
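
For reference, here is a minimal sketch of one way the failure could be kept inside the training process, assuming a Linux host. It lowers the soft `RLIMIT_AS` (virtual address space) limit with Python's standard `resource` module, so an oversized allocation fails inside the process instead of pushing the whole machine into OOM. The helper names `address_space_cap` and `try_allocate` and the sizes used are hypothetical and only for illustration. I have not verified how PyTorch itself behaves under such a cap, and since `RLIMIT_AS` counts virtual rather than resident memory, a CUDA build may need a generous value.

```python
import resource
from contextlib import contextmanager


@contextmanager
def address_space_cap(max_bytes):
    """Temporarily lower the soft RLIMIT_AS of the current process (Linux).

    Hypothetical helper: only the soft limit is changed, so it can be
    restored afterwards without extra privileges.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may never exceed the hard limit.
    if hard == resource.RLIM_INFINITY:
        new_soft = max_bytes
    else:
        new_soft = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    try:
        yield
    finally:
        # Put the original soft limit back.
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


def try_allocate(n_bytes):
    """Return True if an n_bytes buffer could be allocated, False on MemoryError."""
    try:
        buf = bytearray(n_bytes)
    except MemoryError:
        return False
    del buf
    return True
```

Under these assumptions, something like `with address_space_cap(1 << 30): try_allocate(2 << 30)` should report a failed allocation in the process itself rather than involving the kernel OOM killer. The cap applies to the whole process, so it would have to be set before the memory-hungry work starts.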