
I have been trying to find the best way to do the following:
I need to move a large number of JSON files, named in the format "yyyymmdd-hhmmss.json", from one blob container to another container in a different storage account. These files are nested inside several different folders.

I only have to move the files that were created (or are named) before a certain date, for example: move all files that were created/are named before 01/01/2022.

What would be the best way to do so quickly? This is a one-time migration so it won’t be recurring.

2 Answers


  1. You can iterate over every blob in the source container. The folder structure does not matter, because blob folders are only virtual (they are prefixes in the blob name). Parse each blob name against the pattern "yyyymmdd-hhmmss" to get its date. If that date is older than your cutoff, copy the blob to the destination container and then delete it from the source container. I am not sure about PowerShell, but it is easy with any supported programming language.

    Here’s an example of doing this with .NET:

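    using System;
    using System.Globalization;
    using System.IO;
    using Azure.Storage.Blobs;
    using Azure.Storage.Blobs.Models;
    using Azure.Storage.Sas;

    var sourceContainerClient = new BlobContainerClient("<source-connection-string>", "<source-container-name>");
    var destinationContainerClient = new BlobContainerClient("<destination-connection-string>", "<destination-container-name>");
    DateTime cutoff = new DateTime(2022, 1, 1); // move blobs whose name encodes a date before this

    foreach (BlobItem blobItem in sourceContainerClient.GetBlobs())
    {
        // The timestamp is the file name without its virtual folder path or extension.
        string stamp = Path.GetFileNameWithoutExtension(blobItem.Name);
        // "HH" parses the 24-hour clock; names that do not match the pattern are skipped.
        if (!DateTime.TryParseExact(stamp, "yyyyMMdd-HHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime blobDate)
            || blobDate >= cutoff)
            continue;

        try
        {
            BlobClient sourceBlob = sourceContainerClient.GetBlobClient(blobItem.Name);
            // A copy across storage accounts needs a readable source URI, so append a short-lived
            // read-only SAS (this requires a connection string that contains the account key).
            Uri sourceUri = sourceBlob.GenerateSasUri(BlobSasPermissions.Read, DateTimeOffset.UtcNow.AddHours(1));
            // Keep the full blob name so the folder structure and the extension are preserved.
            BlobClient destinationBlob = destinationContainerClient.GetBlobClient(blobItem.Name);
            // The copy runs server-side; wait for it to finish before deleting the source.
            destinationBlob.StartCopyFromUri(sourceUri).WaitForCompletion();
            sourceBlob.Delete();
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine($"Failed to move {blobItem.Name}: {ex.Message}");
        }
    }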
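    The name filter described above can also be sketched in PowerShell. This is a minimal sketch, not a definitive implementation: the function name Test-BlobNameBeforeCutoff and the sample blob names are made up for illustration, and it assumes every file of interest is named exactly "yyyymmdd-hhmmss" plus an extension.

```powershell
# Hypothetical helper: $true when the blob's file name encodes a timestamp before $Cutoff.
function Test-BlobNameBeforeCutoff {
    param(
        [Parameter(Mandatory)] [string]   $BlobName,
        [Parameter(Mandatory)] [datetime] $Cutoff
    )

    # Blob "folders" are only name prefixes, so keep the last path segment.
    $leaf  = ($BlobName -split '/')[-1]
    # Drop the extension to get the timestamp part, e.g. "20211231-235959".
    $stamp = [System.IO.Path]::GetFileNameWithoutExtension($leaf)

    $parsed = [datetime]::MinValue
    $ok = [datetime]::TryParseExact(
        $stamp,
        'yyyyMMdd-HHmmss',   # HH = 24-hour clock
        [System.Globalization.CultureInfo]::InvariantCulture,
        [System.Globalization.DateTimeStyles]::None,
        [ref] $parsed)

    # Names that do not match the pattern are never selected.
    return $ok -and ($parsed -lt $Cutoff)
}

# Sample names, made up for illustration.
$cutoff   = [datetime]'2022-01-01'
$names    = 'a/b/20211231-235959.json', 'a/20220101-000000.json', 'notes/readme.json'
$selected = @($names | Where-Object { Test-BlobNameBeforeCutoff -BlobName $_ -Cutoff $cutoff })
```

    With a cutoff of 1 January 2022, only the first sample name is selected: the second falls exactly on the cutoff and the third does not match the pattern.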
    
  2. To copy a blob from a source container to a destination container in another storage account:

    Connect-AzAccount
    
    Get-AzSubscription 
    Select-AzSubscription -Subscription "My Subscription"
      
    $srcResourceGroupName = "RG-DEMO-WE"
    $srcStorageAccountName = "storageaccountdemowe"
    $srcContainer = "sourcefolder"
    $blobName = "dataDisk.vhd"
    
    $destResourceGroupName = "RG-TRY-ME"
    $destStorageAccountName = "storageaccounttryme"
    $destContainer = "destinationfolder"
     
    # Set Source & Destination Storage Keys and Context
    $srcStorageKey = Get-AzStorageAccountKey -Name $srcStorageAccountName -ResourceGroupName $srcResourceGroupName 
     
    $destStorageKey = Get-AzStorageAccountKey -Name $destStorageAccountName -ResourceGroupName $destResourceGroupName
     
    $srcContext = New-AzStorageContext -StorageAccountName $srcStorageAccountName -StorageAccountKey $srcStorageKey.Value[0]
     
    $destContext = New-AzStorageContext -StorageAccountName $destStorageAccountName -StorageAccountKey $destStorageKey.Value[0]
    
    # Optional step 
    New-AzStorageContainer -Name $destContainer  -Context $destContext   
    
    # The copy operation 
    $copyOperation = Start-AzStorageBlobCopy -SrcBlob $blobName `
                                             -SrcContainer $srcContainer `
                                             -Context $srcContext `
                                             -DestBlob $blobName `
                                             -DestContainer $destContainer `
                                             -DestContext $destContext
     
    

    REF: https://www.jorgebernhardt.com/copy-blob-powershell/

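    Since you need to copy individual files based on date, instead of calling Start-AzStorageBlobCopy directly you can follow the Microsoft documentation for the asynchronous az storage file copy start command (it copies to an Azure file share; az storage blob copy start is the counterpart for a blob destination):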

    az storage file copy start --destination-path
                               --destination-share
                               [--account-key]
                               [--account-name]
                               [--connection-string]
                               [--file-endpoint]
                               [--file-snapshot]
                               [--metadata]
                               [--sas-token]
                               [--source-account-key]
                               [--source-account-name]
                               [--source-blob]
                               [--source-container]
                               [--source-path]
                               [--source-sas]
                               [--source-share]
                               [--source-snapshot]
                               [--source-uri]
                               [--timeout]
    

    REF: https://learn.microsoft.com/en-us/cli/azure/storage/file/copy?view=azure-cli-latest

    The code to loop through the files based on date is left to the reader. For example, this selects local files last modified more than 30 days ago:

    Get-ChildItem | Where-Object {$_.LastWriteTime -lt (Get-Date).AddDays(-30)}
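    For blobs rather than local files, the same idea could look roughly like the sketch below. It assumes the Az.Storage module and the $srcContext, $destContext, $srcContainer and $destContainer variables from the script above. The function names Test-NamedBefore and Move-OldBlobs are hypothetical, and the filter relies on the assumption that zero-padded "yyyymmdd" prefixes sort in date order.

```powershell
# Hypothetical filter: $true when the file name starts with a yyyymmdd day before $CutoffDay.
function Test-NamedBefore {
    param([string] $BlobName, [string] $CutoffDay)   # $CutoffDay like '20220101'

    # Folders are virtual, so look only at the last segment of the blob name.
    $leaf = ($BlobName -split '/')[-1]
    # Ignore anything that is not named yyyymmdd-hhmmss.json.
    if ($leaf -notmatch '^\d{8}-\d{6}\.json$') { return $false }
    # Zero-padded day strings compare in date order.
    return [string]::CompareOrdinal($leaf.Substring(0, 8), $CutoffDay) -lt 0
}

# Hypothetical wrapper around the Az.Storage cmdlets; defined here but not invoked.
function Move-OldBlobs {
    param($SrcContext, $DestContext, [string] $SrcContainer, [string] $DestContainer, [string] $CutoffDay)

    Get-AzStorageBlob -Container $SrcContainer -Context $SrcContext |
        Where-Object { Test-NamedBefore $_.Name $CutoffDay } |
        ForEach-Object {
            $name = $_.Name
            # Start a server-side copy that keeps the full name (virtual folders included).
            Start-AzStorageBlobCopy -SrcBlob $name -SrcContainer $SrcContainer -Context $SrcContext `
                -DestBlob $name -DestContainer $DestContainer -DestContext $DestContext -Force | Out-Null
            # Block until the copy finishes, then delete the source only on success.
            $state = Get-AzStorageBlobCopyState -Blob $name -Container $DestContainer `
                -Context $DestContext -WaitForComplete
            if ($state.Status -eq 'Success') {
                Remove-AzStorageBlob -Blob $name -Container $SrcContainer -Context $SrcContext
            }
        }
}

# Example call (commented out because it touches real storage accounts):
# Move-OldBlobs -SrcContext $srcContext -DestContext $destContext `
#     -SrcContainer $srcContainer -DestContainer $destContainer -CutoffDay '20220101'
```

    Waiting for each copy before deleting keeps the source intact if a copy fails, at the cost of processing the blobs one at a time.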
    