
I’m getting a java.lang.OutOfMemoryError when I try to download large files (>200MB) from the web application I’m working on.

The download flow is as follows:

1.- Main method:

public byte[] getFileBytes(@RequestBody ZeusRequestVO<String> request) {
    return documentService.downloadFileByChunks(request).toByteArray();
}

2.- Download logic:

public ByteArrayOutputStream downloadFileByChunks(String blobName) {
    long file_size = 0;
    long chunkSize = 10 * 1024 * 1024;
    CloudBlockBlob blob = connectAndgetCloudBlockBlob(blobName);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try {
        if (blob.exists()) {
            blob.downloadAttributes();
            file_size = blob.getProperties().getLength();
            for (long i = 0; i < file_size; i += chunkSize) {
                blob.downloadRange(i, file_size, baos);
            }
        }
    } catch (StorageException e) {
        throw new GenericException(e, BussinesErrorEnum.AZURE_BLOB_STORAGE_EXCEPTION);
    }
    return baos;
}

I already added -Xms and -Xmx settings to my app, and that works as long as files don't exceed 200MB. In fact, initially the web app couldn't download files larger than 30MB until the -Xms and -Xmx configuration was added.

I saw a solution here, but I'm not able to update or add libraries beyond the existing ones (company policy).

Any advice?

2 Answers


  1. Chosen as BEST ANSWER

    Finally solved it. My solution was a mix of Ramprasad's answer and this answer from another question.

    Basically, the solution downloads the file to a physical location, in this case on the cluster where the app runs.

    I changed this main method:

    public byte[] getFileBytes(@RequestBody ZeusRequestVO<String> request) {
        return documentService.downloadFileByChunks(request).toByteArray();
    }
    

    To this:

    public void getFileBytes(@RequestBody ZeusRequestVO<String> request, HttpServletResponse response) {
        FileInputStream fis = null;
        String fileName = null;
        try {
            // The service now writes the blob to a local file and returns its path
            fileName = documentService.getFileBytes(request);
            fis = new FileInputStream(fileName);
            // Stream the file to the client instead of loading it all into memory
            IOUtils.copy(fis, response.getOutputStream());
            response.flushBuffer();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (ValidationUtil.isNotNull(fis)) {
                    fis.close();
                }
                // Delete the temporary file once it has been sent
                if (fileName != null) {
                    File file = new File(fileName);
                    if (file.exists()) {
                        file.delete();
                    }
                }
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }
    }
    

    The logic is to download the file to a physical location, stream it to the browser from there (so the whole file is never held in memory), and finally delete the created file.
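
    For reference, the service side (documentService.getFileBytes) isn't shown above. A rough sketch of what it can look like, reusing the connectAndgetCloudBlockBlob helper from the question, is below; the parameter is simplified to the blob name, and the temp-file handling and the downloadToFile call are illustrative, not my exact code:

    public String getFileBytes(String blobName) {
        try {
            CloudBlockBlob blob = connectAndgetCloudBlockBlob(blobName);
            // Write the blob straight to a temporary file on local disk,
            // so the heap never has to hold the whole content
            File tmp = File.createTempFile("download-", ".tmp");
            blob.downloadToFile(tmp.getAbsolutePath());
            return tmp.getAbsolutePath();
        } catch (StorageException | IOException e) {
            throw new GenericException(e, BussinesErrorEnum.AZURE_BLOB_STORAGE_EXCEPTION);
        }
    }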


  2. One way to avoid this error is to download the file in smaller chunks and write it directly to disk instead of keeping it all in memory. You can do this by replacing the ByteArrayOutputStream with a FileOutputStream that writes to a temporary file on disk. Here’s an example:

    public void downloadFileByChunks(String blobName, String filePath) {
        long chunkSize = 10 * 1024 * 1024;
        CloudBlockBlob blob = connectAndgetCloudBlockBlob(blobName);
        try (FileOutputStream fos = new FileOutputStream(filePath)) {
            if (blob.exists()) {
                blob.downloadAttributes();
                long fileSize = blob.getProperties().getLength();
                for (long i = 0; i < fileSize; i += chunkSize) {
                    blob.downloadRange(i, Math.min(chunkSize, fileSize - i), fos);
                }
            }
        } catch (StorageException | IOException e) {
            throw new GenericException(e, BussinesErrorEnum.AZURE_BLOB_STORAGE_EXCEPTION);
        }
    }
    

    This method takes in an additional parameter filePath, which specifies where on disk to save the downloaded file. The method downloads the file in chunks and writes each chunk directly to disk using a FileOutputStream. This way, you can avoid keeping the entire file in memory and reduce the risk of running into an OutOfMemoryError.
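
    The caller can then stream the temporary file back to the client and delete it afterwards. A sketch of such a caller (the method name and temp-file naming are just examples) could be:

    public void sendBlobToClient(String blobName, HttpServletResponse response) throws IOException {
        File tmp = File.createTempFile("blob-", ".bin");
        try {
            downloadFileByChunks(blobName, tmp.getAbsolutePath());
            try (InputStream in = new FileInputStream(tmp);
                 OutputStream out = response.getOutputStream()) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    // Copy in small buffers so the whole file is never held in memory
                    out.write(buffer, 0, read);
                }
            }
        } finally {
            tmp.delete();
        }
    }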
