sumit
12/31/2019 - 12:06 AM

Download large datasets in Google Colab using wget and curl

from google.colab import drive
drive.mount('/content/drive')  # mount Google Drive into the Colab VM
%cd /content/drive/My\ Drive/
!ln -s "/content/drive/My Drive" /gdrive  # short symlink so the space in "My Drive" doesn't need escaping

A pip package for downloading Drive files:

!pip install -q googleDriveFileDownloader
from googleDriveFileDownloader import googleDriveFileDownloader
a = googleDriveFileDownloader()
fileid = "1d2CNNnqO4b6tatbeIGltXaByxgoeOG5H"
a.downloadFile(f'https://drive.google.com/uc?id={fileid}&export=download')
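
The file ID is the long token in the Drive sharing link (the part between /file/d/ and /view). A small sketch for pulling it out of a link, assuming the standard link format (the link below is just an illustration reusing the same ID):

import re
link = "https://drive.google.com/file/d/1d2CNNnqO4b6tatbeIGltXaByxgoeOG5H/view?usp=sharing"  # example sharing link
fileid = re.search(r"/d/([\w-]+)", link).group(1)  # extracts the file ID between /d/ and /view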

Flush all files written on the Colab VM back to Google Drive, then unmount:

drive.flush_and_unmount()
print('All changes made in this colab session should now be visible in Drive.')

#!wget https://medusa.fit.vutbr.cz/traffic/data/2016-ITS-BrnoCompSpeed-full.tar
#!tar -xvf 2016-ITS-BrnoCompSpeed-full.tar.1
!pwd
!curl https://medusa.fit.vutbr.cz/traffic/data/2016-ITS-BrnoCompSpeed-full.tar | tar xv
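
The curl-to-tar pipe extracts on the fly without keeping the tarball on disk. If the connection is flaky, a resumable variant that writes straight into the mounted Drive folder is a reasonable sketch (the target path assumes the mount from above):

!wget -c https://medusa.fit.vutbr.cz/traffic/data/2016-ITS-BrnoCompSpeed-full.tar -P "/content/drive/My Drive/"  # -c resumes a partial download
!tar -xvf "/content/drive/My Drive/2016-ITS-BrnoCompSpeed-full.tar"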

Download big files from AWS S3. The pre-signed URL must be quoted, because its query string contains & characters the shell would otherwise interpret; the AWSAccessKeyId, Signature, and Expires values are left blank here.

!wget --no-check-certificate --no-proxy 'https://s3.eu-west-1.amazonaws.com/mapillary.deepnets/data/mapillary/traffic_sign/mtsd/mtsd_fully_annotated.zip?AWSAccessKeyId=&Signature=&Expires='

Download big files directly from Google Drive with wget: https://medium.com/@acpanjan/download-google-drive-files-using-wget-3c2c025a8b99

Just replace FILEID with your file ID and FILENAME with the desired output filename.

Just remember to share the file first ("Anyone with the link can view").

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=FILEID" -O FILENAME && rm -rf /tmp/cookies.txt
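
The same confirm-token trick can also be done in Python with requests instead of the shell one-liner. A minimal sketch (download_drive_file is just an illustrative helper name, and it assumes the large-file warning still comes back as a download_warning cookie):

import requests

def download_drive_file(file_id, filename):
    session = requests.Session()
    url = "https://docs.google.com/uc?export=download"
    # first request: for large files Drive answers with a virus-scan warning page and a confirm-token cookie
    response = session.get(url, params={"id": file_id}, stream=True)
    token = next((v for k, v in response.cookies.items() if k.startswith("download_warning")), None)
    if token:
        # second request with the confirm token actually streams the file
        response = session.get(url, params={"id": file_id, "confirm": token}, stream=True)
    with open(filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=32768):
            if chunk:
                f.write(chunk)

download_drive_file("FILEID", "FILENAME")  # same placeholders as the wget one-liner above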