Upon request, my ASP.NET server should convert an HTML file to PDF using a headless Chrome instance and return the resulting PDF.

CMD command:

chrome --headless --disable-gpu --print-to-pdf-no-header --print-to-pdf="[pdf-file-path]" --no-margins "[html-file-path]"

Dealing with the PDF file is not trivial. The server has to clean up the PDF file from the previous request, detect when the new PDF has been created, and then read the file into memory. All of this is just too slow.

Is there a better solution to this? Could I get the file directly into memory somehow? Or manage the PDF file better?

2 Answers


  1. Chosen as BEST ANSWER

    Stop driving Chrome through the command-line interface and use a browser automation library from C# instead, such as Selenium WebDriver or Puppeteer. For Selenium, install the following NuGet package:

    https://www.nuget.org/packages/Selenium.WebDriver/4.0.0-rc2

    You can then print your HTML to PDF with the following code:

    // Required namespaces
    using System;
    using System.Collections.Generic;
    using System.Text;
    using OpenQA.Selenium.Chrome;

    // Base64-encode the HTML (`html` holds the markup to print)
    var textBytes = Encoding.UTF8.GetBytes(html);
    var b64Html = Convert.ToBase64String(textBytes);
    
    // Create a headless Chrome driver (`webdriverPath` is the directory containing chromedriver)
    var chromeOptions = new ChromeOptions();
    chromeOptions.AddArguments(new List<string> { "no-sandbox", "headless", "disable-gpu" });
    using var driver = new ChromeDriver(webdriverPath, chromeOptions);
    // Load the HTML through a data URL. Refer to: https://stackoverflow.com/a/52498445/7279624
    driver.Navigate().GoToUrl("data:text/html;base64," + b64Html);
    
    // Print; Page.printToPDF expects paper dimensions in inches (A4 is 210 mm x 297 mm)
    var printOptions = new Dictionary<string, object> {
        // Docs: https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF
        { "paperWidth", 210 / 25.4 },
        { "paperHeight", 297 / 25.4 },
    };
    // The response carries the PDF as a Base64 string under "data"; decode it to bytes in memory
    var printOutput = driver.ExecuteChromeCommandWithResult("Page.printToPDF", printOptions) as Dictionary<string, object>;
    var document = Convert.FromBase64String(printOutput["data"] as string);
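
    The command line in the question also asks for no margins and no header. As a rough sketch, those flags could be mirrored through the documented `Page.printToPDF` parameters `marginTop`, `marginBottom`, `marginLeft`, `marginRight` (all in inches) and `displayHeaderFooter`. The class and method names below (`PrintToPdfOptions`, `A4NoMarginsNoHeader`) are hypothetical and only serve the illustration; they are not part of Selenium.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class PrintToPdfOptions
    {
        private const double MillimetersPerInch = 25.4;

        // Hypothetical helper: builds Page.printToPDF parameters for an A4 page
        // with zero margins and no header/footer. Sizes and margins are in inches.
        public static Dictionary<string, object> A4NoMarginsNoHeader()
        {
            return new Dictionary<string, object>
            {
                { "paperWidth", 210 / MillimetersPerInch },
                { "paperHeight", 297 / MillimetersPerInch },
                { "marginTop", 0.0 },
                { "marginBottom", 0.0 },
                { "marginLeft", 0.0 },
                { "marginRight", 0.0 },
                { "displayHeaderFooter", false },
            };
        }
    }
    ```

    Passing `PrintToPdfOptions.A4NoMarginsNoHeader()` in place of `printOptions` should then give output closer to what `--no-margins` and `--print-to-pdf-no-header` produced on the command line, assuming the driver forwards the dictionary to the DevTools command unchanged.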
    

  2. I would consider several options.

    Print the output to a PostScript printer.

    Then use Ghostscript to convert the PostScript to a PDF.

    Probably even better: use the .NET PDFsharp library, together with some code that renders HTML on top of that library.

    Consider this:

    https://www.nuget.org/packages/HtmlRenderer.PdfSharp/1.5.1-beta1
