I'm not sure why, but calling DataLakeDirectoryClient.CreateSubDirectoryAsync creates a subdirectory and a zero-byte file of the same name. I only want it to create the subdirectory. Note: I can see the zero-byte file in the Azure portal's Storage Browser, but not in Azure Storage Explorer.
myDirectoryClient is of type Azure.Storage.Files.DataLake.DataLakeDirectoryClient:
DataLakeDirectoryClient mySubDirectoryClient = await myDirectoryClient.CreateSubDirectoryAsync("my_sub_dir");
Now I have the zero byte file, but no directory.
Next, I upload a file to the subdirectory. fileName and mylocalFilePath are set elsewhere with valid values. The new file creation and upload work fine.
var newFileClient = mySubDirectoryClient.GetFileClient(fileName);
await using var uploadFs = File.OpenRead(mylocalFilePath);
var response = await newFileClient.UploadAsync(uploadFs);
Now I have the zero-byte file, the subdirectory, and the file in the subdirectory. The files and folders in the picture have a different name (not "my_sub_dir"), but they were created the same way.
Is there a reason I have the extra zero byte file? Can I prevent this? Or do I just need to delete it afterwards? Or would deleting it be an issue?
I somewhat understand why the empty file is created, which I believe is that it doesn’t treat it like a directory until it contains a file. Kind of like how when you delete all of the files in a folder the folder disappears. I’d like to create the direct
Edit: The code above is only snippets, so I am posting the entire function below for clarity. It is a recursive function meant to copy all the files and subdirectories from a directory. It downloads each file locally, then uploads it to another location on a data lake. As far as my question is concerned, the only code that should matter is in the if (path.IsDirectory ?? false) block.
async Task CopyDirectory(DataLakeDirectoryClient sourceDirectoryClient, DataLakeDirectoryClient targetDirectoryClient)
{
    var pathPages = sourceDirectoryClient.GetPathsAsync();
    var tasks = new List<Task>();
    await foreach (var page in pathPages.AsPages())
    {
        foreach (var path in page.Values)
        {
            // path.Name is the full path; keep only the last segment
            var fileName = path.Name.Split("/").Last();
            if (path.IsDirectory ?? false)
            {
                var sourceSubDirectoryClient = sourceDirectoryClient.GetSubDirectoryClient(fileName);
                var targetSubDirectoryClient = await targetDirectoryClient.CreateSubDirectoryAsync(fileName);
                await CopyDirectory(sourceSubDirectoryClient, targetSubDirectoryClient);
                //this only returns one path that is a directory per directory, not one zero byte file and one directory
                //var x = targetDirectoryClient.GetPathsAsync();
                //await foreach (var y in x.AsPages())
                //{
                //    foreach (var z in y.Values)
                //    {
                //        Console.WriteLine(z);
                //    }
                //}
            }
            else if (true) // filter here, e.g. fileName.Contains("2023")
            {
                var downloadPath = localTempPath + fileName;
                var sourceFileClient = sourceDirectoryClient.GetFileClient(fileName);
                var properties = await sourceFileClient.GetPropertiesAsync();
                // download, upload, then clean up, in that order, as one awaited unit
                tasks.Add(Task.Run(async () =>
                {
                    try
                    {
                        await FileDownloader.DownloadFileAsync(sourceFileClient, downloadPath);
                        await FileUploader.UploadFileAsync(targetDirectoryClient, properties, downloadPath);
                    }
                    finally
                    {
                        await DeleteFilesAsync(downloadPath); // this cleans up the local file
                    }
                }));
            }
        }
    }
    await Task.WhenAll(tasks);
}
2 Answers
This is more of an explanation of "why it didn't work", rather than an answer.
The root cause is that the storage account I was writing to was not Data Lake Storage; it was plain Blob Storage. It was not my account; I was just trying to move some files from my ADLS Gen2 account to theirs. The same code worked when I created subdirectories on my own ADLS Gen2 account. Hopefully this helps someone else.
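If you want to check which kind of account you are writing to before copying anything, something like the following might work. It is only a sketch: it assumes you have a connection string for the target account, and that the installed Azure.Storage.Blobs package is recent enough to expose AccountInfo.IsHierarchicalNamespaceEnabled (please verify against your SDK version). The NamespaceCheck class and IsHierarchicalAsync method are names made up for illustration.

```csharp
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

static class NamespaceCheck
{
    // Hypothetical helper: returns true when the account reports a
    // hierarchical namespace (ADLS Gen2), false for flat Blob Storage.
    public static async Task<bool> IsHierarchicalAsync(string connectionString)
    {
        var blobService = new BlobServiceClient(connectionString);

        // Response<AccountInfo> converts implicitly to AccountInfo.
        AccountInfo info = await blobService.GetAccountInfoAsync();

        // Assumption: this property is available in the installed SDK version.
        return info.IsHierarchicalNamespaceEnabled;
    }
}
```

If this returns false, the target is flat Blob Storage, which is the situation described above where the extra zero-byte entry showed up.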
I reproduced this in my environment; the expected results are below.
Before creating the main and sub folder:
To create a main folder and a sub folder with the same name, you can follow the code below:
Output:
If you want to create a sub folder with the same name in an existing folder, follow the code below:
Output:
This is how I create folders and subfolders in a storage account without getting extra zero-byte files.
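The code from this answer did not survive in the text here, so the following is not the original, just a minimal sketch of what creating a main folder and a same-named sub folder could look like with Azure.Storage.Files.DataLake. It assumes an account with hierarchical namespace enabled and shared-key authentication; the account name, key, file system name, and the "my_dir" folder name are all placeholders.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage;
using Azure.Storage.Files.DataLake;

class Program
{
    static async Task Main()
    {
        // Placeholders (hypothetical): replace with your own values.
        var accountName = "<account-name>";
        var credential = new StorageSharedKeyCredential(accountName, "<account-key>");

        // Data Lake clients talk to the dfs endpoint of the account.
        var service = new DataLakeServiceClient(
            new Uri($"https://{accountName}.dfs.core.windows.net"), credential);
        DataLakeFileSystemClient fileSystem = service.GetFileSystemClient("<file-system>");

        // Create the main folder, then a sub folder with the same name inside it.
        DataLakeDirectoryClient mainDir = await fileSystem.CreateDirectoryAsync("my_dir");
        DataLakeDirectoryClient subDir = await mainDir.CreateSubDirectoryAsync("my_dir");

        // Print the path of the newly created sub folder.
        Console.WriteLine(subDir.Path);
    }
}
```

To add a sub folder to a folder that already exists, fileSystem.GetDirectoryClient("my_dir") can be used in place of the CreateDirectoryAsync call.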