
I created a VM using the GCP Console in the browser.

While creating the VM, I selected the image "c2-deeplearning-pytorch-1-8-cu110-v20210619-debian-10" and a T4 GPU.

The VM is created and started, and it shows a green icon in the browser.

When I connect with "gcloud compute ssh", it asks whether I want to install the Nvidia driver. I answer y, but the installation then hits a dpkg error and the driver is not installed:

This VM requires Nvidia drivers to function correctly. Installation takes ~1 minute.
Would you like to install the Nvidia driver? [y/n] y
Installing Nvidia driver.
install linux headers: linux-headers-4.19.0-16-cloud-amd64
E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem.
Nvidia driver installed.

I checked whether the driver was installed by running this Python code:

import torch
torch.cuda.is_available()  # returns False
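
A slightly fuller check can help tell a missing driver apart from a PyTorch problem. The following is only a sketch: the function name `driver_status` and the result keys are made up for illustration, and it assumes that `nvidia-smi` is the tool shipped with the driver and that it exits with a non-zero code when it cannot reach the driver.

```python
import importlib.util
import shutil
import subprocess


def driver_status():
    """Collect basic facts about the NVIDIA driver and PyTorch's view of CUDA."""
    status = {"nvidia_smi_on_path": False, "nvidia_smi_ok": False, "torch_cuda": None}

    smi = shutil.which("nvidia-smi")
    if smi is not None:
        status["nvidia_smi_on_path"] = True
        try:
            # Assumption: a non-zero exit code means the driver is not usable.
            result = subprocess.run([smi], capture_output=True, text=True, timeout=30)
            status["nvidia_smi_ok"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            status["nvidia_smi_ok"] = False

    # Import torch only if it is installed, so the check also runs without it.
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
            status["torch_cuda"] = bool(torch.cuda.is_available())
        except ImportError:
            status["torch_cuda"] = None

    return status


if __name__ == "__main__":
    for key, value in driver_status().items():
        print(f"{key}: {value}")
```

If `nvidia_smi_ok` is False, the problem is most likely at the driver level rather than in PyTorch.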

Has anybody else faced this issue?

3

Answers


  1. Chosen as BEST ANSWER

    The solution to my problem was:

    • Manually run: sudo dpkg --configure -a
    • Disconnect from the machine.
    • Connect again using SSH and answer y again when asked to install the Nvidia driver.

    The driver then installs and works.
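
The first step can be sketched in Python for anyone scripting the recovery. This is an illustration only: the helper names `needs_dpkg_repair` and `repair_dpkg` are hypothetical, and the string being matched is taken from the error message in the question. By default nothing is executed; the function only returns the command it would run.

```python
import shlex
import subprocess

# Text taken from the installer output quoted in the question.
DPKG_INTERRUPTED = "dpkg was interrupted"


def needs_dpkg_repair(installer_output: str) -> bool:
    """Return True if the installer output reports an interrupted dpkg run."""
    return DPKG_INTERRUPTED in installer_output


def repair_dpkg(dry_run: bool = True) -> str:
    """Finish the interrupted package configuration; returns the command line."""
    command = ["sudo", "dpkg", "--configure", "-a"]
    if not dry_run:
        # check=True raises CalledProcessError if dpkg reports a failure.
        subprocess.run(command, check=True)
    return shlex.join(command)


if __name__ == "__main__":
    log = "E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a'"
    if needs_dpkg_repair(log):
        print("Would run:", repair_dpkg(dry_run=True))
```

After a real run (`dry_run=False`), disconnecting and reconnecting over SSH is still needed to bring the driver prompt back.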


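  2. This is the correct way to install the NVIDIA driver on a GCP instance. First, remove any existing NVIDIA packages (the pattern is quoted so the shell passes it to apt unchanged):

    cd /

    sudo apt purge 'nvidia-*'

    Reboot, then download and run the CUDA installer:

    cd /

    sudo wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
    sudo sh cuda_11.2.2_460.32.03_linux.run

    Answer the options the installer shows in the terminal to suit your configuration.

    Reboot again.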
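
Before downloading, it can be useful to confirm which CUDA and driver versions a given installer carries. The file name above appears to follow the pattern `cuda_<CUDA version>_<driver version>_linux.run`; the sketch below relies on that assumption, and `parse_runfile_name` and `RunfileInfo` are hypothetical names used only for this example.

```python
import re
from typing import NamedTuple


class RunfileInfo(NamedTuple):
    cuda_version: str
    driver_version: str


# Assumed naming pattern: cuda_<CUDA version>_<driver version>_linux.run
_RUNFILE = re.compile(
    r"cuda_(?P<cuda>\d+(?:\.\d+)*)_(?P<driver>\d+(?:\.\d+)*)_linux\.run$"
)


def parse_runfile_name(name_or_url: str) -> RunfileInfo:
    """Extract the CUDA and driver versions from a runfile name or URL."""
    match = _RUNFILE.search(name_or_url)
    if match is None:
        raise ValueError(f"unrecognised runfile name: {name_or_url!r}")
    return RunfileInfo(match["cuda"], match["driver"])


if __name__ == "__main__":
    url = (
        "https://developer.download.nvidia.com/compute/cuda/11.2.2/"
        "local_installers/cuda_11.2.2_460.32.03_linux.run"
    )
    info = parse_runfile_name(url)
    print(f"CUDA {info.cuda_version}, driver {info.driver_version}")
```

Note that the image in the question is tagged cu110, so it is worth checking that the installed versions suit the PyTorch build on the VM.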

  3. Make sure you are running as root. This may sound silly, but on their notebook instances the default user is not root. If you SSH into the instance and run a tool such as gpustat, or your own code, you may get errors saying the NVIDIA drivers are not loaded.

    If your user (called jupyter by default) is in the sudoers file, everything works fine.

    Installing or reinstalling GPU drivers on GCP instances is often complicated, so make sure you actually need to reinstall before you attempt other solutions.
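
A quick way to see which user you are running as, and whether it can reach the GPU device, is sketched below. The group name `sudo` and the device path `/dev/nvidia0` are assumptions (common defaults on Debian with a single GPU), and `privilege_report` is a hypothetical helper name.

```python
import grp
import os


def privilege_report(admin_group: str = "sudo", device: str = "/dev/nvidia0") -> dict:
    """Summarise the current process's privileges and access to the GPU device."""
    groups = set()
    for gid in os.getgroups():
        try:
            groups.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Group id without a name entry; ignore it.
            pass

    euid = os.geteuid()
    return {
        "euid": euid,
        "is_root": euid == 0,
        # Assumption: membership in this group grants sudo rights.
        "in_admin_group": admin_group in groups,
        "device_exists": os.path.exists(device),
        "device_accessible": os.access(device, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    for key, value in privilege_report().items():
        print(f"{key}: {value}")
```

If `device_exists` is False, the driver is probably not loaded at all, and changing users will not help.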
