I am using Google Colab in combination with a Custom GCE VM based on the instructions here. I now need a way to retrieve files from the VM without using the Colab interface due to a bug described in this issue and this issue. I’ve reviewed the answers from this similar question about file storage on hosted instances, but I don’t think it helps me in this case.
I’ve attempted to SSH into the machine to find files, but I can’t locate the /content directory that I expect to see in root. After digging through the file system I found that the /mnt/stateful_partition/var/lib/docker directory is using the amount of disk space I expect to reflect the size of the data, with a file object called colab-vmdisk that looks promising. I’m not sure how to proceed, but given the file path I expect there’s a docker-based solution here that I don’t know.
2 Answers
Google Colab on a GCE VM runs in its own Docker container, as you found. To access the files in the Colab session, run docker ps on the VM and copy the container ID from the bottom row. To copy a file over to the VM, run docker cp (your container id):/path/to/google/colab/folder/ /path/to/gce/
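The two steps above can be scripted. The following is a minimal sketch, assuming it runs on the GCE VM itself with permission to call the Docker CLI. The function names are hypothetical, and the source and destination paths passed to them are placeholders, not paths confirmed by this thread.

```python
import subprocess


def list_container_ids():
    """Return the IDs of running containers, in the order `docker ps` lists them."""
    result = subprocess.run(
        ["docker", "ps", "--quiet"],  # --quiet prints only the container IDs
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.split()


def build_copy_command(container_id, src_in_container, dest_on_host):
    """Build the `docker cp` argument list that copies a path out of a container."""
    return ["docker", "cp", f"{container_id}:{src_in_container}", dest_on_host]


def copy_from_last_container(src_in_container, dest_on_host):
    """Copy a path from the container on the bottom row of `docker ps` to the VM."""
    ids = list_container_ids()
    if not ids:
        raise RuntimeError("docker ps reported no running containers")
    # The answer above takes the container ID from the bottom row.
    cmd = build_copy_command(ids[-1], src_in_container, dest_on_host)
    subprocess.run(cmd, check=True)
    return cmd
```

A call such as copy_from_last_container("/content", "/tmp/colab-content") would be one way to use it. Here /content is only the directory the question expected to find, and /tmp/colab-content is an arbitrary destination on the VM.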
@hidude562’s answer is on point. It worked for me. I’ve been trying to figure out a method with good download speeds (as fast as when I was using Google Drive with Colab on their hosted runtime).
Colab seems to manage the entire thing within a Docker container, as you rightly mentioned @s_go. That also explains how they keep the popular libraries up to date right from the start, including the gdown library. I figured it’s best to use gdown to download large files from Google Drive into Colab, since Google doesn’t let you mount your personal Drive in Colab when using a custom GCE VM runtime, due to some authorisation blockers. This method downloads files into Colab at the full speed Google is capable of (I’ve seen up to ~500 Mbps).
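For readers who have not used gdown, a minimal sketch of the download step follows, meant to be run in a Colab cell. It assumes the Drive file is shared so that anyone with the link can view it. The helper names are hypothetical, and the file ID and output path are placeholders you supply.

```python
def drive_download_url(file_id):
    """Build the Google Drive download URL that gdown accepts for a file ID."""
    return f"https://drive.google.com/uc?id={file_id}"


def download_from_drive(file_id, output_path):
    """Download one Drive file into the Colab runtime and return the saved path."""
    # Imported here so the URL helper works even where gdown is not installed.
    import gdown

    return gdown.download(
        url=drive_download_url(file_id),
        output=output_path,
        quiet=False,  # show the progress bar and transfer speed
    )
```

For example, download_from_drive("YOUR_FILE_ID", "/content/data.zip") would save the file under /content, where YOUR_FILE_ID stands for the ID taken from the file's sharing link.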
Adding on: after extracting the file from the Docker container, I used FileZilla over SFTP to download it to my local machine. It was as fast as expected. A direct download from the SSH session ran at around ~100 kbps for some reason, while FileZilla on the same VM gave me download speeds of up to ~13 Mbps (my Wi-Fi download bandwidth is about ~25 Mbps).
Hope this comment validates @hidude562’s answer for other readers!
Thank you for your question @s_go and your answer @hidude562! :)