
I am trying to use an S3 link that was provided to me, https://ml-cloud-dataset.s3.amazonaws.com/Airlines_data.txt, from a PuTTY terminal so that I can create a table in Hive and load the dataset into it.

I tried to download the dataset using this command:

aws s3 cp https://ml-cloud-dataset.s3.amazonaws.com/Airlines_data.txt /home/hadoop . 

This command gave me an error. I tried several other approaches and still could not get the data.

2 Answers


  1. If you use aws s3 cp, you need to have the AWS CLI installed. If you have it installed, you can upload a file using

    aws s3 cp myfilename.txt s3://mybucketname/mypath/myfilename.txt
    

    To download the file, you can use

    aws s3 cp s3://mybucketname/mypath/myfilename.txt myfilename.txt
    

    Depending on your AWS setup, you need either access keys or SSO to log in. If the machine is an EC2 instance, you can also use IAM roles, which let you authenticate without SSO or access keys.
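
    Applied to the file in the question: the original command passes an HTTPS URL (aws s3 cp expects an s3:// URI for the S3 side) and two destination arguments instead of one. The following is a minimal sketch of turning that kind of URL into an s3:// URI. The helper name https_to_s3_uri is hypothetical, not part of the AWS CLI, and it assumes a virtual-hosted-style URL of the form https://BUCKET.s3.../KEY.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the AWS CLI): convert a virtual-hosted-style
# S3 HTTPS URL into the s3:// URI form that `aws s3 cp` expects.
https_to_s3_uri() {
  local url="$1"
  local rest="${url#https://}"   # strip the scheme
  local host="${rest%%/*}"       # everything before the first slash
  local key="${rest#*/}"         # everything after the first slash
  local bucket="${host%.s3.*}"   # drop the ".s3...amazonaws.com" suffix
  printf 's3://%s/%s\n' "$bucket" "$key"
}

uri="$(https_to_s3_uri "https://ml-cloud-dataset.s3.amazonaws.com/Airlines_data.txt")"
echo "$uri"   # prints s3://ml-cloud-dataset/Airlines_data.txt

# Then download it (run manually; needs the AWS CLI and network access):
#   aws s3 cp "$uri" /home/hadoop/
# If no credentials are configured and the object is publicly readable,
# adding --no-sign-request may be needed.
```

    Note that the download takes exactly one source and one destination, so the trailing "." from the original command is dropped.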

  2. The URL https://ml-cloud-dataset.s3.amazonaws.com/Airlines_data.txt tells you two things:

    • The bucket name is ml-cloud-dataset
    • There is an object called Airlines_data.txt

    Fortunately, it is a publicly accessible bucket, so you can list the contents with the AWS CLI (if you have no credentials configured, add --no-sign-request to the command):

    $ aws s3 ls ml-cloud-dataset
    
    2020-03-06 23:32:55   10237044 Airlines_data.txt
    2020-03-06 23:33:15         84 dept
    2020-03-06 23:33:15        218 employee
    2020-03-06 23:33:15       1666 hive_key.cer
    2020-03-06 23:33:15      22628 u.user
    

    You can copy the object to your own bucket using:

    aws s3 cp s3://ml-cloud-dataset/Airlines_data.txt s3://your-bucket/
    

    To copy ALL the objects, use:

    aws s3 sync s3://ml-cloud-dataset/ s3://your-bucket/
    

    However, if you are using Hive within AWS, you may not need to download the file at all. You could reference the object directly using s3://ml-cloud-dataset/Airlines_data.txt.

    You could also access it from Amazon Athena using that same path.
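
    If you do copy the object to your own bucket as described above, a Hive external table can then point at it. The following is a rough sketch under several assumptions: your-bucket, the airlines/ prefix, the table name airlines_raw, and the file name create_airlines_raw.hql are all placeholders. The column layout of Airlines_data.txt is not given here, so the table holds each raw line in a single STRING column rather than guessing a schema. A Hive table LOCATION refers to a directory (prefix), which is why the file is placed under its own prefix.

```shell
#!/usr/bin/env bash
# Placeholder names (assumptions, replace with your own).
bucket="your-bucket"
prefix="airlines"

# Step 1 (run manually; needs the AWS CLI and network access):
#   aws s3 cp s3://ml-cloud-dataset/Airlines_data.txt "s3://$bucket/$prefix/"

# Step 2: write a HiveQL script that defines an external table over that prefix.
# One STRING column keeps each line as-is, since the real schema is unknown here.
cat > create_airlines_raw.hql <<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS airlines_raw (raw_line STRING)
STORED AS TEXTFILE
LOCATION 's3://$bucket/$prefix/';
EOF

# Show the generated script for review before running it.
cat create_airlines_raw.hql

# Step 3 (run manually on the machine where Hive is installed):
#   hive -f create_airlines_raw.hql
```

    Once the real column layout is known, the single raw_line column can be replaced with proper columns and a ROW FORMAT DELIMITED clause matching the file's delimiter.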
