
I have what seems like it should be a simple problem – I would like to be able to fill out PDF forms programmatically in JavaScript.

My first attempt was with pdf-lib which has a nice API for filling out forms – however, the form I am trying to fill out has fields like this:

{
...
"employment.employment": [
  {
    id: '243R',
    value: 'Off',
    defaultValue: null,
    exportValues: 'YES',
    editable: true,
    name: 'employment',
    rect: [ 264.12, 529.496, 274.23, 539.604 ],
    hidden: false,
    actions: null,
    page: -1,
    strokeColor: null,
    fillColor: null,
    rotation: 0,
    type: 'checkbox'
  },
  {
    id: '244R',
    value: 'Off',
    defaultValue: null,
    exportValues: 'NO',
    editable: true,
    name: 'employment',
    rect: [ 307.971, 529.138, 318.081, 539.246 ],
    hidden: false,
    actions: null,
    page: -1,
    strokeColor: null,
    fillColor: null,
    rotation: 0,
    type: 'checkbox'
  }
]
}

which pdf-lib fails to parse properly. It will only allow me to set the value of 243R, treating 244R as if it doesn’t exist (I assume because the names are not unique). That library also seems abandoned. C’est la vie.
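For reference, the two entries in the dump above appear to be two widgets of the same field that differ only in `exportValues`. Here is a minimal sketch of picking one of them by its export value from an object shaped like that dump. `findWidget` and `sampleFields` are hypothetical names used for illustration and are not part of pdf-lib or pdf.js.

```javascript
// Hypothetical helper: look up one widget of a multi-widget field by export value.
// `fields` is assumed to have the shape shown in the dump above
// (qualified field name -> array of widget descriptions).
function findWidget(fields, qualifiedName, exportValue) {
  const widgets = fields[qualifiedName] || []
  return widgets.find((w) => w.exportValues === exportValue) || null
}

// Sample data trimmed from the dump above.
const sampleFields = {
  'employment.employment': [
    { id: '243R', value: 'Off', exportValues: 'YES', type: 'checkbox' },
    { id: '244R', value: 'Off', exportValues: 'NO', type: 'checkbox' }
  ]
}

const noBox = findWidget(sampleFields, 'employment.employment', 'NO')
console.log(noBox.id) // 244R
```

Looking widgets up by export value rather than by array position keeps the code readable when a field has several boxes.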

Onward to pdf.js then. I can load the doc and set the value, but calling saveDocument or getData only returns the original, non-modified doc. How can I save the modified document?

const run = async () => {
  const loading = pdfjs.getDocument('form-cms1500.pdf')
  const pdf = await loading.promise
  const fields = await pdf.getFieldObjects()

  console.log(fields['employment.employment'])
  fields['employment.employment'][0].value = 'On'
  console.log(fields['employment.employment'])
  // saveDocument prints: Warning: saveDocument called while `annotationStorage` is empty, please use the getData-method instead.
  fs.writeFileSync('test.pdf', await pdf.saveDocument())
}
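The warning hints at why nothing changes: the objects returned by getFieldObjects seem to be plain descriptions, and saveDocument only writes values that were recorded in the document's annotationStorage. Below is a minimal sketch under that assumption. `storeAndSave` is a hypothetical helper, `pdf` is the loaded document from the snippet above, and the `{ value: true }` shape for a checkbox reflects my understanding of pdf.js rather than something confirmed against this form.

```javascript
// Hypothetical helper: record values in pdf.js's annotationStorage, then save.
// `updates` maps annotation ids (e.g. '243R') to the value to store.
async function storeAndSave(pdf, updates) {
  for (const [id, value] of Object.entries(updates)) {
    // Assumption: checkboxes take a boolean, text fields a string.
    pdf.annotationStorage.setValue(id, { value })
  }
  return pdf.saveDocument() // resolves to the bytes of the modified PDF
}

// Usage, inside the run() function above (untested against this form):
//   const data = await storeAndSave(pdf, { '243R': true })
//   fs.writeFileSync('test.pdf', data)
```

Whether the saved file shows the box as checked may also depend on the form's appearance streams, so treat this as a starting point only.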

2 Answers


  1. Chosen as BEST ANSWER

    OK, I solved it! And by "I solved it", I mean I found a GitHub issue where someone had a very similar problem in pdf-lib, and I adapted what I learned from it to resolve mine. Essentially, you can grab the requisite field from the lower-level acroForm API.

    Ultimate solution:

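      const fs = require('fs')
      const { PDFDocument } = require('pdf-lib')

      const run = async () => {
        const bytes = fs.readFileSync('form-cms1500.pdf')
        const pdfDoc = await PDFDocument.load(bytes)
        const form = pdfDoc.getForm()
        // getAllFields() returns [field, ref] pairs, so [32][0] is the low-level field
        const allfields = form.acroForm.getAllFields()
        allfields[32][0].setValue(allfields[32][0].getOnValue())
        const pdfBytes = await pdfDoc.save()
        fs.writeFileSync('test.pdf', pdfBytes)
      }
      run()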
    

    where field #32 is the checkbox I wanted to tick. I figured out it was #32 just by printing the field names and their indexes.
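A rough sketch of how the index listing can be produced. `listFieldIndexes` is a hypothetical helper, and it assumes each entry returned by `form.acroForm.getAllFields()` is a `[field, ref]` pair whose field exposes `getFullyQualifiedName()`, which is my reading of pdf-lib's lower-level API.

```javascript
// Hypothetical helper: label each entry returned by form.acroForm.getAllFields()
// with its index. Each entry is assumed to be a [field, ref] pair.
function listFieldIndexes(allFields) {
  return allFields.map(
    ([field], index) => `${index}: ${field.getFullyQualifiedName()}`
  )
}

// Usage with pdf-lib (assumed to be installed), after loading pdfDoc as above:
//   const form = pdfDoc.getForm()
//   listFieldIndexes(form.acroForm.getAllFields()).forEach((line) => console.log(line))
```

Because the index depends on the order of fields in a particular file, it is worth re-running the listing whenever the form template changes.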

    It turns out pdf.js is extremely unfriendly to updating fields so the best bet is just not to use it for that.


  2. I’m also trying to use PDF.js to fill in PDF forms (but they are government forms that use XFA, so you have to modify the XML data as opposed to the PDF elements), so I hope this question generates more replies.

    I did find a Mozilla blog post saying that PDF.js began supporting PDF files with XFA in October of 2021 (https://blog.mozilla.org/attack-and-defense/2021/10/14/implementing-form-filling-and-accessibility-in-the-firefox-pdf-viewer/).

    One of the developers, Brendan Dahl, who wrote this post, has also published some PDF.js utilities: one that displays the structure of a PDF, another that displays the structure of a PDF with XFA, and a third (maker) that modifies a PDF to use the font from another PDF and saves it (https://github.com/brendandahl/pdf.js.utils/blob/master/README.md). It looks like the code that saves the modified PDF document is in the function create(data, fontRef, content) in the file maker.js. The relevant code for saving in this example seems to be:

    // line 55
    var maker = (function () {
    
      function PDFOut() {
        this.output = '';
      }
      PDFOut.prototype = {
        write: function (data) {
          this.output += data;
        }
      };
    
    // line 224
     function create(data, fontRef, content) {
    
    // build PDF header, body and trailer
    
    // line 284
        var out = new PDFOut();
        createHeader(out);
        createBody(catalogRef.ref, refManager, out);
        var xrefOffset = createXref(refManager, out);
        createTrailer(refManager.offsetCount, catalogRef.ref, xrefOffset, out);
    
        return out.output;
      }
    
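To show how the header, body, xref and trailer pieces fit together, here is a self-contained sketch of the same layout. It is not the maker.js implementation: `buildPdf` is a hypothetical function, it handles ASCII content only (so string length equals byte offset), and the sample objects describe a structurally minimal file with no pages.

```javascript
// Minimal sketch of the header / body / xref / trailer layout.
// Assumes ASCII-only content, so string length equals byte offset.
function buildPdf(objects) {
  let out = '%PDF-1.4\n' // header
  const offsets = []
  objects.forEach((body, i) => {
    offsets.push(out.length) // byte offset where object i + 1 starts
    out += `${i + 1} 0 obj\n${body}\nendobj\n`
  })
  const xrefOffset = out.length
  out += `xref\n0 ${objects.length + 1}\n`
  out += '0000000000 65535 f \n' // entry 0 is the head of the free list
  for (const offset of offsets) {
    out += `${String(offset).padStart(10, '0')} 00000 n \n` // 20 bytes per entry
  }
  out += `trailer\n<< /Size ${objects.length + 1} /Root 1 0 R >>\n`
  out += `startxref\n${xrefOffset}\n%%EOF\n`
  return out
}

const pdfText = buildPdf([
  '<< /Type /Catalog /Pages 2 0 R >>',
  '<< /Type /Pages /Kids [] /Count 0 >>'
])
```

The key point is that the xref table stores the byte offset of every object, which is why the writer has to track how much it has emitted so far.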

    I hope this is helpful as a starting point if nobody else jumps in with a better answer. Please let us know how it goes for you.
