I am working on object detection for a school project. To train my CNN model, I am using a Google Cloud server because I do not own a GPU strong enough to train it locally.
The training data consists of images (.jpg files) and annotations (.txt files) spread over about 20 folders. The folders come from different sources, and I do not want to mix pictures from different sources, so I want to keep this directory structure.
My current issue is that I have not found a fast way to upload them to my Google Cloud server.
My workaround was to upload the image folders to Google Drive as .zip files, download them on the cloud server, and unzip them there. This takes far too long because I have to upload many folders, and Google Drive does not have a good API for downloading folders to Linux.
My local computer runs Windows 10 and my cloud server runs Debian.
Therefore, I'd be really grateful if you could suggest a fast and easy way to either upload my images directly to the server or at least upload my zipped folders.
3 Answers
Couldn't you just run an infinite loop that looks for .jpg files and copies each one to your server with scp/sftp as soon as the file is there? On Windows, you can achieve this using WSL.
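A minimal sketch of such a polling loop, written in Python around the `scp` command line tool. The login `user@203.0.113.10`, the target folder `/home/user/data`, and all function names are hypothetical placeholders. The sketch assumes that `scp` is on the PATH (for example inside WSL), that key-based SSH login is set up, and that the same sub-folders already exist on the server, because `scp` does not create missing parent directories.

```python
import subprocess
import time
from pathlib import Path

# Hypothetical values; replace with your own login, host, and target folder.
REMOTE = "user@203.0.113.10"
REMOTE_DIR = "/home/user/data"


def find_new_files(root, seen, suffixes=(".jpg", ".txt")):
    """Return files below root with a wanted suffix that are not yet in seen."""
    return [
        p for p in sorted(Path(root).rglob("*"))
        if p.is_file() and p.suffix.lower() in suffixes and p not in seen
    ]


def scp_command(local, root, remote=REMOTE, remote_dir=REMOTE_DIR):
    """Build an scp call that keeps the path relative to root on the server."""
    rel = Path(local).relative_to(root).as_posix()
    return ["scp", str(local), f"{remote}:{remote_dir}/{rel}"]


def watch(root, interval=5.0):
    """Poll root forever and copy every new file exactly once."""
    seen = set()
    while True:
        for path in find_new_files(root, seen):
            subprocess.run(scp_command(path, root), check=True)
            seen.add(path)
        time.sleep(interval)

# To start watching (runs until interrupted): watch("images")
```

Each file is sent with its own `scp` call, so every file pays for a new SSH connection. That is acceptable for a trickle of new images but is likely to be slow for a large existing dataset.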
(Sorry, this may not be your final answer, but I don't have the reputation to ask you this question.)
I would upload them to a Google Cloud Storage bucket using gsutil with multithreading (the -m flag). This means that multiple files are copied at once, so the only limitation here is your internet speed. gsutil installers are available for both Windows and Linux. You upload from your local machine with a single gsutil copy command, and then on the VM you do exactly the opposite, copying from the bucket to the local disk.
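A sketch of what the upload and the matching download could look like, wrapped in Python so the same script works on Windows and Debian. The bucket name `my-training-data`, the folder names, and the helper names are hypothetical; `gsutil -m cp -r` is the parallel recursive copy. The `run` helper assumes gsutil is installed and on the PATH.

```python
import shutil
import subprocess


def upload_command(local_dir, bucket):
    """gsutil -m cp -r <local_dir> gs://<bucket>/  (run on the local machine)."""
    return ["gsutil", "-m", "cp", "-r", local_dir, f"gs://{bucket}/"]


def download_command(bucket, folder, dest_dir):
    """gsutil -m cp -r gs://<bucket>/<folder> <dest_dir>  (run on the VM)."""
    return ["gsutil", "-m", "cp", "-r", f"gs://{bucket}/{folder}", dest_dir]


def run(cmd):
    """Run a command, resolving the executable first (gsutil is a .cmd on Windows)."""
    exe = shutil.which(cmd[0])
    if exe is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run([exe, *cmd[1:]], check=True)

# On Windows:  run(upload_command("images", "my-training-data"))
# On the VM:   run(download_command("my-training-data", "images", "/home/user/data"))
```

Because the copy is recursive, the sub-folders under `images` are kept as object name prefixes in the bucket and are recreated on the VM, so the per-source directory structure survives without zipping anything.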
This is very fast, and you only pay a small amount for the storage, which is cheap and may even fall within the GCP free tier.
Note: make sure you have write permissions on the storage bucket and the default compute service account (i.e. the VM service account) has read permissions on the storage bucket.
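If the VM service account turns out to lack read access, one way to grant it is `gsutil iam ch`. The sketch below only builds that command; the bucket name and the service account e-mail are hypothetical placeholders, and the helper name is made up for illustration. It assumes you are allowed to change the bucket's IAM policy.

```python
import subprocess


def grant_read_command(bucket, service_account_email):
    """Build a gsutil call that gives a service account read access to a bucket."""
    member = f"serviceAccount:{service_account_email}:objectViewer"
    return ["gsutil", "iam", "ch", member, f"gs://{bucket}"]

# Hypothetical usage, with the default compute service account of your project:
# subprocess.run(grant_read_command(
#     "my-training-data",
#     "123456789012-compute@developer.gserviceaccount.com"), check=True)
```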
The best stack for this use case is gsutil plus a Cloud Storage bucket.
Copy the zip files to the bucket and set up a cron job on the VM that syncs the files down.
Make use of gsutil:
https://cloud.google.com/storage/docs/gsutil
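A sketch of the sync step, assuming `gsutil rsync` is used for it. The helper builds the command and a matching crontab line for the VM; the bucket name, the destination folder, and the 10 minute interval are hypothetical choices. `gsutil rsync` expects the local destination directory to exist already.

```python
import shlex


def rsync_command(bucket, dest_dir):
    """gsutil -m rsync -r gs://<bucket> <dest_dir>: copy only new or changed objects."""
    return ["gsutil", "-m", "rsync", "-r", f"gs://{bucket}", dest_dir]


def crontab_line(bucket, dest_dir, every_minutes=10):
    """Return a crontab entry that runs the sync every every_minutes minutes."""
    cmd = " ".join(shlex.quote(part) for part in rsync_command(bucket, dest_dir))
    return f"*/{every_minutes} * * * * {cmd}"


print(crontab_line("my-training-data", "/home/user/data"))
```

The printed line can be added with `crontab -e` on the VM. cron usually runs with a minimal PATH, so it may be necessary to replace `gsutil` with its full path there. If zip files are synced, they still have to be unzipped on the VM afterwards.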