On request, my ASP.NET server should convert an HTML file to PDF using a headless Chrome instance and return the resulting PDF.
CMD command:
chrome --headless --disable-gpu --print-to-pdf-no-header --print-to-pdf="[pdf-file-path]" --no-margins "[html-file-path]"
The PDF file is awkward to deal with. The server has to clean up the PDF file left over from the previous request, detect when the new PDF has been created, and then read the file into memory. All of this is just too slow.
Is there a better solution to this? Could I get the file directly into memory somehow? Or manage the PDF file better?
2 Answers
Stop driving Chrome through the command-line interface and use a browser automation library from C# instead, such as Selenium or Puppeteer (PuppeteerSharp on .NET). For Selenium, use the following NuGet package:
https://www.nuget.org/packages/Selenium.WebDriver/4.0.0-rc2
Then you can print your HTML into PDF using the following code:
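A minimal sketch of what that could look like, assuming Selenium 4's `Print` API on the driver and a matching chromedriver available on the machine. The class name `HtmlToPdf` and the method `Convert` are made up for illustration. The PDF comes back as a byte array, so no intermediate PDF file is written to disk:

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

// Hypothetical helper; the names are illustrative only.
public static class HtmlToPdf
{
    public static byte[] Convert(string htmlFilePath)
    {
        var options = new ChromeOptions();
        // Run Chrome without a visible window.
        options.AddArgument("--headless");
        options.AddArgument("--disable-gpu");

        using (var driver = new ChromeDriver(options))
        {
            // Turn the local path into a file:// URL and load it.
            driver.Navigate().GoToUrl(new Uri(htmlFilePath).AbsoluteUri);

            // Default print settings; page size and margins can be
            // adjusted on PrintOptions if needed.
            var printOptions = new PrintOptions();
            PrintDocument document = driver.Print(printOptions);

            // The PDF bytes are returned directly in memory.
            return document.AsByteArray;
        }
    }
}
```

Starting a new browser on every request is likely to be slow as well, so in a real server it is probably worth keeping one driver alive and reusing it, with some locking around it since a driver instance should not be shared between concurrent requests.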
I would consider several options.
Print the output to a PostScript printer.
Then take the PostScript and use, say, Ghostscript to convert it to a PDF.
Probably even better: use the .NET PDFsharp library, plus some code that renders HTML on top of that library.
Consider this:
https://www.nuget.org/packages/HtmlRenderer.PdfSharp/1.5.1-beta1
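A rough sketch of how that package might be used, assuming its `PdfGenerator.GeneratePdf` entry point and an A4 page size. The class name `HtmlRendererPdf` and the method `Convert` are hypothetical. Note that, as far as I know, HTML Renderer only supports a subset of HTML and CSS, so the output may not match what Chrome produces for complex pages:

```csharp
using System.IO;
using PdfSharp;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;

// Hypothetical helper; the names are illustrative only.
public static class HtmlRendererPdf
{
    public static byte[] Convert(string htmlFilePath)
    {
        // Read the HTML source; no browser process is involved.
        string html = File.ReadAllText(htmlFilePath);

        using (PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4))
        using (var stream = new MemoryStream())
        {
            // Write the PDF into memory instead of a file on disk;
            // 'false' leaves the stream open so it can be read back.
            pdf.Save(stream, false);
            return stream.ToArray();
        }
    }
}
```

The returned bytes can then be sent back from the ASP.NET action as the response body with the `application/pdf` content type.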