I need to upload a 200 MB file to ADLS using Python.
I’m using the code provided in the official documentation – https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python?tabs=azure-ad
While calling the following function for upload –
def upload_file_to_directory_bulk():
    try:
        file_system_client = service_client.get_file_system_client(file_system="system")
        directory_client = file_system_client.get_directory_client("my-directory")
        file_client = directory_client.get_file_client("uploaded-file.txt")
        # Raw string so that "\f" in the path is not read as a form-feed escape
        local_file = open(r"C:\file-to-upload.txt", 'r')
        file_contents = local_file.read()
        file_client.upload_data(file_contents, overwrite=True)
    except Exception as e:
        print(e)
It works for small files, but when I try to upload larger files (around 200 MB) I get this error:
('Connection aborted.', timeout('The write operation timed out'))
How can I resolve this?
This is most likely related to upload speed. Try increasing the timeout to 60 seconds. Also, if you split the file into chunks, each chunk is sent in a separate request with its own timeout.
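A minimal sketch of passing the timeout, assuming `file_client` is the `DataLakeFileClient` created in the question. The helper name `upload_with_timeout` and its parameters are made up for illustration; `timeout` is a keyword that `upload_data` accepts, given in seconds.

```python
def upload_with_timeout(file_client, local_path, timeout_seconds=60):
    """Upload a local file in one call, passing an explicit timeout.

    file_client is assumed to be an azure.storage.filedatalake.DataLakeFileClient
    (for example, the one returned by directory_client.get_file_client(...)).
    """
    # Open in binary mode and hand the stream to the SDK, so the whole
    # file does not have to be read into memory first.
    with open(local_path, "rb") as local_file:
        file_client.upload_data(
            local_file,
            overwrite=True,
            timeout=timeout_seconds,  # seconds
        )
```

With the question's client this would be called as `upload_with_timeout(file_client, r"C:\file-to-upload.txt")`.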
With chunk size:
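The sketch below shows one way this could look, again assuming `file_client` is the `DataLakeFileClient` from the question. `upload_data` accepts a `chunk_size` keyword in bytes; the 4 MiB value, the helper name `upload_in_chunks`, and its parameters are illustrative assumptions, not recommended settings.

```python
FOUR_MIB = 4 * 1024 * 1024  # illustrative chunk size in bytes, not a tuned value


def upload_in_chunks(file_client, local_path,
                     chunk_size=FOUR_MIB, timeout_seconds=60):
    """Upload a local file, asking the SDK to split it into smaller chunks.

    file_client is assumed to be an azure.storage.filedatalake.DataLakeFileClient.
    """
    with open(local_path, "rb") as local_file:
        file_client.upload_data(
            local_file,
            overwrite=True,
            timeout=timeout_seconds,  # seconds
            chunk_size=chunk_size,    # maximum bytes sent per chunk
        )
```

A smaller chunk means less data has to get through before any single timeout expires, at the cost of more requests.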
You need to increase the timeout value and set the chunk size in your code when uploading large data.
You can raise the timeout to 60 seconds by passing timeout=60 to upload_data,
and you can set the chunk size by passing chunk_size in the same call.
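If you would rather make each chunk's request explicit instead of leaving the splitting to `upload_data`, an alternative is to append the file piece by piece with `create_file`, `append_data`, and `flush_data`. This is a different technique from the one described above, sketched here assuming `file_client` is a `DataLakeFileClient`; the helper name `upload_by_appending`, the 4 MiB chunk size, and the 60-second timeout are illustrative assumptions.

```python
def upload_by_appending(file_client, local_path,
                        chunk_size=4 * 1024 * 1024, timeout_seconds=60):
    """Upload a file by appending fixed-size chunks, then committing them.

    file_client is assumed to be an azure.storage.filedatalake.DataLakeFileClient.
    Returns the total number of bytes uploaded.
    """
    file_client.create_file()
    offset = 0
    with open(local_path, "rb") as local_file:
        while True:
            chunk = local_file.read(chunk_size)
            if not chunk:
                break
            # Each append is its own request, so the timeout applies per chunk.
            file_client.append_data(
                chunk,
                offset=offset,
                length=len(chunk),
                timeout=timeout_seconds,
            )
            offset += len(chunk)
    # Appended data is only committed to the file once it is flushed.
    file_client.flush_data(offset, timeout=timeout_seconds)
    return offset
```

Only one chunk is held in memory at a time, which also avoids reading the whole 200 MB file up front.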