Currently, I am using the list_blobs() function in the Azure Python SDK to list all of the blobs within a container. However, of all the metadata/info on the blobs, I only need their names.
In my container, there are over 1M blobs, and executing the following to access the name of each blob is not very efficient, since list_blobs() has to retrieve a lot of info on each blob (1M+ total) in its response, and this process takes over 15 minutes to complete:
blobs = container_client.list_blobs()
for blob in blobs:
    print(blob.name)
I am looking to decrease the time it takes to execute this block of code, and I was wondering if there is any way to retrieve all of the blobs in the container using list_blobs(), but only retrieving the ‘name’ property of the blobs, rather than retrieving info about every single property of each blob in the response.
2 Answers
It is not possible to retrieve only some of the properties of a blob (such as its name). The list_blobs method is the implementation of the List Blobs REST API operation, which does not support server-side projection.

You can use container_client.list_blob_names() for this; it returns an iterator over the names of the blobs in the container. The names can also be collected into a list.
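A minimal sketch of collecting the names into a list, assuming a reasonably recent azure-storage-blob release that provides list_blob_names(). The helper names collect_blob_names and make_container_client, and the prefix argument, are illustrative and not part of the original answer; the connection string and container name are placeholders you would supply yourself.

```python
def collect_blob_names(container_client, prefix=None):
    """Return all blob names in the container as a list.

    `container_client` is expected to be an azure.storage.blob.ContainerClient
    (or any object exposing a compatible list_blob_names method).
    `prefix` is an optional, illustrative filter passed as name_starts_with.
    """
    kwargs = {}
    if prefix is not None:
        kwargs["name_starts_with"] = prefix
    # list_blob_names() yields plain strings lazily, page by page;
    # list() drains the iterator, so memory grows with the blob count.
    return list(container_client.list_blob_names(**kwargs))


def make_container_client(connection_string, container_name):
    """Hypothetical convenience wrapper for building the client."""
    # Imported here so the helper above can be used without the SDK installed.
    from azure.storage.blob import ContainerClient

    return ContainerClient.from_connection_string(
        connection_string, container_name=container_name
    )
```

With a real client this would be called as names = collect_blob_names(make_container_client(conn_str, "my-container")). For a container with 1M+ blobs, iterating over list_blob_names() directly instead of building a list avoids holding every name in memory at once.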
https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.containerclient?view=azure-python#azure-storage-blob-containerclient-list-blob-names