I’m using boto3 to copy a large object (3.5 GiB) from one bucket to another, with the following code:

boto3.client('s3').copy_object(Bucket=dst_bucket, Key=filepath, CopySource={'Bucket': src_bucket, 'Key': filepath})

It works fine, but it takes ~4-5 minutes. I don’t want to wait around for the copy to finish; I’d rather just initiate the copy and let the script exit.

How can I do that? I thought about launching the copy in a thread and exiting the program after 2 seconds, but that doesn’t feel right. Surely boto3/AWS has a way to do what I’m trying to do?

2 Answers


  1. Chosen as BEST ANSWER

    I've solved the issue using the information in these posts:

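    Turns out boto3's copy_object does the whole copy as one single request, which is slow for an object this size. The transfer manager used below splits a large object into parts and copies them in parallel.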

    Here is the final code for reference:

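        import boto3
        import boto3.s3.transfer as s3transfer
        import botocore.config

        # Number of parallel copy workers (example value, not from the original post).
        COPY_WORKERS = 20

        # Size the HTTP connection pool to match the number of workers.
        botocore_config = botocore.config.Config(max_pool_connections=COPY_WORKERS)
        s3client = boto3.Session().client('s3', config=botocore_config)
        transfer_config = s3transfer.TransferConfig(
            use_threads=True,
            max_concurrency=COPY_WORKERS,
        )
        s3t = s3transfer.create_transfer_manager(s3client, transfer_config)
        # Queue the copy; large objects are copied in parts, in parallel.
        s3t.copy(
            bucket=dst_bucket,
            key=filepath,
            copy_source={'Bucket': src_bucket, 'Key': filepath},
        )
        # Block until the queued copy has finished.
        s3t.shutdown()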
    

    It doesn't return immediately, but it only takes ~20 s, which is acceptable: I can have my script wait that long.

    EDIT: Although I'm satisfied with my final code, it doesn't really answer the question, which was how to launch the copy and not wait around. So I will not mark my post as the answer.
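One caveat about the transfer-manager code in this answer: as far as I can tell, shutdown() waits for the copy to finish but does not re-raise an error if the copy failed. A minimal sketch of a variant that surfaces failures by calling result() on the future returned by copy(). The helper names make_transfer_manager and copy_and_check, and the default of 10 workers, are my own choices for illustration and not part of boto3:

```python
def make_transfer_manager(workers=10):
    """Build a transfer manager like the one in the answer (workers=10 is an assumption)."""
    # Imported here so copy_and_check below can be tried without boto3 installed.
    import boto3
    import boto3.s3.transfer as s3transfer
    import botocore.config

    client = boto3.client(
        "s3", config=botocore.config.Config(max_pool_connections=workers)
    )
    config = s3transfer.TransferConfig(use_threads=True, max_concurrency=workers)
    return s3transfer.create_transfer_manager(client, config)


def copy_and_check(manager, src_bucket, dst_bucket, key):
    """Submit one copy, wait for it, and let any failure propagate."""
    future = manager.copy(
        copy_source={"Bucket": src_bucket, "Key": key},
        bucket=dst_bucket,
        key=key,
    )
    try:
        # result() blocks until the copy is done and raises if it failed.
        future.result()
    finally:
        # Always release the manager's worker threads.
        manager.shutdown()
```

Usage would be something like copy_and_check(make_transfer_manager(), src_bucket, dst_bucket, filepath). It still blocks for the duration of the copy, so it does not solve the "don't wait" part of the question either.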


  2. I would use another thread. The wait is necessary because the call only returns once the copy has completed and its response is available. Have you considered using the terminal instead, with something along the lines of:
    aws s3 cp s3://source-bucket/your-file-path s3://destination-bucket/your-file-path &
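If you want to start that same command from inside the Python script and exit right away, here is a minimal sketch. It assumes a POSIX system with the AWS CLI installed and on the PATH, and credentials already configured. The function names build_cp_command and launch_detached_copy are hypothetical, picked only for this example:

```python
import subprocess


def build_cp_command(src_bucket, dst_bucket, key):
    """Return the argument list for `aws s3 cp` between two buckets (same key)."""
    return [
        "aws", "s3", "cp",
        f"s3://{src_bucket}/{key}",
        f"s3://{dst_bucket}/{key}",
    ]


def launch_detached_copy(src_bucket, dst_bucket, key):
    """Start the copy in a separate process and return without waiting for it."""
    return subprocess.Popen(
        build_cp_command(src_bucket, dst_bucket, key),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        # POSIX only: run the child in its own session so it is not tied
        # to this script's terminal session.
        start_new_session=True,
    )
```

Since the output goes to DEVNULL, a failed copy is silent here. Pointing stdout and stderr at a log file instead is probably the better choice if you need to know whether the copy succeeded.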
