I need to upload a 200 MB file to ADLS (Azure Data Lake Storage) using Python.
I'm using the code provided in the official documentation: https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python?tabs=azure-ad
I call the following function for the upload:
def upload_file_to_directory_bulk():
    try:
        file_system_client = service_client.get_file_system_client(file_system="system")
        directory_client = file_system_client.get_directory_client("my-directory")
        file_client = directory_client.get_file_client("uploaded-file.txt")
        # Raw string so "\f" is not treated as an escape; binary mode for upload
        with open(r"C:\file-to-upload.txt", "rb") as local_file:
            file_contents = local_file.read()
        file_client.upload_data(file_contents, overwrite=True)
    except Exception as e:
        print(e)
This works for small files, but when I try to upload larger files (around 200 MB) I get the error:
('Connection aborted.', timeout('The write operation timed out'))
How can I resolve this?
2 Answers
This is most likely related to your upload speed. Try increasing the timeout to 60 seconds. Also, if you split the file into chunks, a separate connection (with its own timeout) is created for each chunk.
With chunk size:
You need to increase the timeout value and the chunk size in your code when uploading large data. Raise the timeout to 60 seconds, and pass a chunk size on the upload call.