
Background

I’m working with an ASP.NET Core 9 Web API that takes a (potentially large) file upload from one of our client apps, streams it to a temporary file on the server, does some processing, and then uploads a re-packaged version of the file to blob storage and sends some metadata about it to a database. These all run in Azure (Azure Container Apps, Azure Blob Storage, Azure SQL DB). The request is Content-Type: multipart/form-data, with a single section for the file:

Content-Disposition: form-data; name=""; filename="<some_file_name>"
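
For reference, the client sends something roughly equivalent to this sketch (names and URL are illustrative; note that MultipartFormDataContent requires a non-empty part name, so "file" stands in here for the empty name above):

using System.Net.Http.Headers;

// Illustrative client-side sketch of an upload with this shape.
using var http = new HttpClient();
await using var fileStream = File.OpenRead("large-file.bin");

using var content = new MultipartFormDataContent();
var filePart = new StreamContent(fileStream);
filePart.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
content.Add(filePart, "file", "large-file.bin");

using var response = await http.PostAsync("https://example.invalid/api/documents/test", content);
response.EnsureSuccessStatusCode();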

Observation

The problem I observe is that the memory usage of the container roughly tracks the size of the file being uploaded, causing the container to run out of memory (see screenshot). I was under the impression that streaming the upload directly to file storage should avoid using much more memory than the stream needs for buffering.

[Screenshot: memory following file upload size]

Attempted Solutions

The code mostly follows the example from Upload files in ASP.NET Core, except that 1) there is less case-checking here, to keep it simple for testing, and 2) I am constrained to working with the upload as a stream, since the real code will pass the stream along to a client library that inflates it, processes it, etc. This code causes the observed memory problem.

/// <summary>
/// Adds a new Document
/// </summary>
[HttpPost("test", Name = nameof(AddDocumentAsync))]
[DisableFormValueModelBinding]
[DisableRequestSizeLimit]
[ProducesResponseType(StatusCodes.Status201Created)]
[ProducesResponseType(StatusCodes.Status400BadRequest)]
public async Task<ActionResult> AddDocumentAsync()
{
    if ( !HttpContext.Request.HasFormContentType )
        return BadRequest("No file uploaded.");

    string boundary = HttpContext.Request.GetMultipartBoundary();
    if ( string.IsNullOrEmpty(boundary) )
        return BadRequest("Invalid multipart form-data request.");

    MultipartReader multipartReader = new MultipartReader(boundary, HttpContext.Request.Body);
    MultipartSection? section = await multipartReader.ReadNextSectionAsync();

    if ( section == null )
        return BadRequest("No file found in request body.");

    FileMultipartSection? fileSection = section.AsFileSection();

    if ( fileSection?.FileStream == null )
        return BadRequest("Invalid file.");

    string tempDirectory = Path.GetTempPath();
    string tmpPath = Path.Combine(tempDirectory, Path.GetRandomFileName());

    using ( FileStream fs = new FileStream(tmpPath, FileMode.Create) )
        await fileSection.FileStream.CopyToAsync(fs);

    return Created();
}
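
(For reference, [DisableFormValueModelBinding] is not built into ASP.NET Core; it is roughly the attribute from the docs sample:)

using Microsoft.AspNetCore.Mvc.Filters;
using Microsoft.AspNetCore.Mvc.ModelBinding;

// Stops MVC from reading (and buffering) the form during model binding,
// leaving the request body untouched for MultipartReader.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class DisableFormValueModelBindingAttribute : Attribute, IResourceFilter
{
    public void OnResourceExecuting(ResourceExecutingContext context)
    {
        var factories = context.ValueProviderFactories;
        factories.RemoveType<FormValueProviderFactory>();
        factories.RemoveType<FormFileValueProviderFactory>();
        factories.RemoveType<JQueryFormValueProviderFactory>();
    }

    public void OnResourceExecuted(ResourceExecutedContext context) { }
}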

I observed the file growing in /tmp, but, unfortunately, the memory usage grew at roughly the same rate.

If I change the destination so the file is streamed from the fileSection.FileStream to blob storage instead of to a local file, I do not observe the memory issues.
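
That variant looked roughly like this (a sketch; it assumes an injected BlobContainerClient field _blobContainerClient, like the one in the update below):

// Stream the upload section straight to Blob Storage instead of to disk.
BlobClient blobClient = _blobContainerClient.GetBlobClient(fileSection.FileName);
await blobClient.UploadAsync(fileSection.FileStream, overwrite: true);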

I also tried using a minimal API with model binding to IFormFile. I saw from here that, by default, files over 64 KB are buffered to disk, which is what I would want. I again noticed the file growing in /tmp, but unfortunately the memory usage grew at the same rate with this approach as well.
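
The minimal API attempt looked roughly like this (a sketch; the route and names are illustrative):

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// IFormFile binding buffers uploads over ~64 KB to a temp file on disk.
// DisableAntiforgery is needed for minimal API form binding on .NET 8+
// unless antiforgery is otherwise configured.
app.MapPost("/test-upload", async (IFormFile file) =>
{
    string tmpPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
    await using (FileStream fs = new FileStream(tmpPath, FileMode.Create))
        await file.CopyToAsync(fs);
    return Results.Ok();
})
.DisableAntiforgery();

app.Run();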

I also tried mounting a storage volume for the container, since I wondered if the container was using memory because no storage volume was mounted. I mounted an Azure Files share at /blah and changed the destination for the temporary file from /tmp to /blah. The file was correctly streamed into the Azure Files share, but the memory issue persisted in this case as well.

Finally, I ran this same code (the snippet posted above) in an Azure App Service web app and did not observe the memory growth. Similarly, when I ran the application locally, neither system nor process memory increased the way it did in the Azure Container App.

UPDATE: In response to comments, I also tried downloading a file from blob storage to the container app. This, too, causes the memory usage of the container to increase in proportion to the size of the file being downloaded. The code snippet below was used.

[HttpGet("test", Name = nameof(TestDocumentAsync))]
[ProducesResponseType(StatusCodes.Status200OK)]
public async Task<ActionResult> TestDocumentAsync()
{
    string tempDirectory = Path.GetTempPath();
    string tmpPath = Path.Combine(tempDirectory, Path.GetRandomFileName());

    BlobClient blobClient = _blobContainerClient.GetBlobClient("c1f04a61-5ec3-43a8-b7ad-de51ae5185bb.tmp");
    using ( FileStream fs = new FileStream(tmpPath, FileMode.Create) )
        await blobClient.DownloadToAsync(fs);

    return Ok();
}

Question

I assume I am misunderstanding or mis-using something here. What is the proper way to stream a large (1GB to ?GB) multipart/form-data file upload to temporary storage for processing and subsequent deletion when dealing with Azure Container Apps and ASP.NET? Or how can the memory usage be explained, even with a simple download from blob storage?

2 Answers


  1. One thing you could try is to flush the stream each time a buffer's worth of data is written to disk:

    public async Task WriteStreamToFileWithFlushAsync(Stream inputStream, string filePath)
    {
        byte[] buffer = new byte[8192];
        int bytesRead;

        using (FileStream outputStream = new FileStream(filePath, FileMode.Create, FileAccess.Write))
        {
            while ((bytesRead = await inputStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                await outputStream.WriteAsync(buffer, 0, bytesRead);

                // FlushAsync pushes the FileStream's internal buffer to the OS;
                // Flush(flushToDisk: true) would additionally ask the OS to
                // write its own cache through to disk.
                await outputStream.FlushAsync();
            }
        }
    }
    
  2. Thanks @Magnus. As Magnus suggested, streaming directly to Azure Blob Storage or saving to temporary local storage with chunked processing is a good approach, but here you want to open a SQLite connection to the file.

    You can’t directly open an SQLite connection to a file stored in Azure Blob Storage as if it were a local file.

    • Download the SQLite file from Blob Storage to the local file system (inside the container or VM), open it, and process it:
    using Azure.Storage.Blobs;
    using System.Data.SQLite;
    using System.IO;
    
    // Define your Blob Storage connection details
    string connectionString = "<your_connection_string>";
    string containerName = "<container_name>";
    string blobName = "<sqlite_file_name>";
    
    // Create the BlobClient to download the SQLite file
    BlobClient blobClient = new BlobClient(connectionString, containerName, blobName);
    
    // Define a temporary local path to store the downloaded SQLite file
    string tempFilePath = Path.Combine(Path.GetTempPath(), blobName);
    
    // Download the SQLite file to the local temp path
    await blobClient.DownloadToAsync(tempFilePath);
    
    // Open the SQLite connection to the downloaded file
    using (SQLiteConnection conn = new SQLiteConnection($"Data Source={tempFilePath}"))
    {
        await conn.OpenAsync();
    
        // Perform operations on the SQLite database
        // Example: Execute a query
        using (SQLiteCommand command = new SQLiteCommand("SELECT * FROM YourTable", conn))
        {
            using (var reader = await command.ExecuteReaderAsync()) // DbCommand's async API returns DbDataReader
            {
                while (await reader.ReadAsync())
                {
                    Console.WriteLine(reader.GetString(0));  // Example: read first column as string
                }
            }
        }
    }
    
    • Next steps include cleaning up resources, uploading the processed file to a storage solution (e.g., Blob Storage), and possibly updating the database or notifying clients about the result of the file processing.
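
    A rough sketch of that cleanup step (blob names and variables are illustrative):

    // Upload the processed file back to Blob Storage, then delete the
    // local temp copy; names here are illustrative.
    BlobContainerClient containerClient = new BlobContainerClient(connectionString, containerName);
    BlobClient outputBlob = containerClient.GetBlobClient("processed-" + blobName);

    using (FileStream processed = File.OpenRead(tempFilePath))
    {
        await outputBlob.UploadAsync(processed, overwrite: true);
    }

    File.Delete(tempFilePath); // free local storage once the upload succeeds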

    The controller below handles large file uploads and streams them directly to a temporary file.

    Complete code:

    using Microsoft.AspNetCore.Http.Features;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.Net.Http.Headers;
    using Microsoft.AspNetCore.WebUtilities;
    
    namespace LargeFileUploadApi.Controllers
    {
        [ApiController]
        [Route("api/files")]
        public class FileUploadController : ControllerBase
        {
            private const int BufferSize = 81920; // 80KB buffer size for CopyToAsync
            private readonly ILogger<FileUploadController> _logger;
    
            public FileUploadController(ILogger<FileUploadController> logger)
            {
                _logger = logger;
            }
    
            [HttpPost("upload")]
            [DisableFormValueModelBinding] // Prevent ASP.NET from buffering form values
            [RequestSizeLimit(long.MaxValue)] // Allow very large uploads
            public async Task<IActionResult> UploadLargeFileAsync()
            {
                _logger.LogInformation("File upload started...");
    
                // Ensure the request has multipart content type
                if (!Request.HasFormContentType || !MediaTypeHeaderValue.TryParse(Request.ContentType, out var contentType))
                {
                    return BadRequest("Invalid Content-Type. Expected multipart/form-data.");
                }
    
                // Extract the boundary
                var boundary = MultipartRequestHelper.GetBoundary(contentType, int.MaxValue);
                var multipartReader = new MultipartReader(boundary, Request.Body);
    
                MultipartSection section;
                while ((section = await multipartReader.ReadNextSectionAsync()) != null)
                {
                    // Check if the section contains a file
                    var contentDisposition = section.GetContentDispositionHeader();
                    if (contentDisposition == null || !contentDisposition.DispositionType.Equals("form-data") || string.IsNullOrEmpty(contentDisposition.FileName.Value))
                        continue;
    
                    // Create a temporary file path
                    var tempFilePath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
    
                    try
                    {
                        // Stream the file directly to disk
                        await using (var targetStream = System.IO.File.Create(tempFilePath))
                        {
                            _logger.LogInformation($"Writing file to temporary path: {tempFilePath}");
                            await section.Body.CopyToAsync(targetStream, BufferSize);
                        }
    
                        // Process the file (example: simulate processing time)
                        await ProcessFileAsync(tempFilePath);
    
                        _logger.LogInformation("File processed successfully.");
    
                        // Clean up: delete the temporary file
                        System.IO.File.Delete(tempFilePath);
                        _logger.LogInformation("Temporary file deleted.");
                    }
                    catch (Exception ex)
                    {
                        _logger.LogError(ex, "Error during file upload or processing.");
                        return StatusCode(500, "An error occurred while processing the file.");
                    }
                }
    
                return Ok(new { message = "File uploaded and processed successfully." });
            }
    
            private async Task ProcessFileAsync(string filePath)
            {
                _logger.LogInformation($"Processing file: {filePath}");
    
                // Simulate file processing (replace this with real logic)
                await Task.Delay(2000); // Simulating some processing time
    
                _logger.LogInformation($"File {filePath} processing complete.");
            }
        }
    
        // Helper class to get multipart content disposition
        internal static class MultipartRequestHelper
        {
            public static string GetBoundary(MediaTypeHeaderValue contentType, int lengthLimit)
            {
                var boundary = HeaderUtilities.RemoveQuotes(contentType.Boundary).Value;
                if (string.IsNullOrWhiteSpace(boundary))
                {
                    throw new InvalidDataException("Missing content-type boundary.");
                }

                // Enforce the caller-supplied boundary length limit.
                if (boundary.Length > lengthLimit)
                {
                    throw new InvalidDataException($"Multipart boundary length limit {lengthLimit} exceeded.");
                }

                return boundary;
            }
    
            public static ContentDispositionHeaderValue? GetContentDispositionHeader(this MultipartSection section)
            {
                if (ContentDispositionHeaderValue.TryParse(section.ContentDisposition, out var contentDisposition))
                {
                    return contentDisposition;
                }
    
                return null;
            }
        }
    }
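
    For multi-gigabyte uploads you may also need to raise the server-side request body limits; a minimal Program.cs sketch (assuming the default template):

    using Microsoft.AspNetCore.Http.Features;

    var builder = WebApplication.CreateBuilder(args);

    // Remove Kestrel's default ~30 MB request body cap; per-endpoint
    // attributes such as [RequestSizeLimit] still apply.
    builder.WebHost.ConfigureKestrel(options => options.Limits.MaxRequestBodySize = null);

    // Raise the multipart limit that applies whenever form parsing is involved.
    builder.Services.Configure<FormOptions>(options =>
    {
        options.MultipartBodyLengthLimit = long.MaxValue;
    });

    builder.Services.AddControllers();

    var app = builder.Build();
    app.MapControllers();
    app.Run();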
    