Reading PDF Form Fields with VBA

I’ve written about VBA and Acrobat JavaScript before, and I’ve also mentioned that you can combine VBA and JavaScript to access PDF form fields, but I still owe a sample for that. I had to answer another question today about how to exactly do that, so I whipped up a quick sample program that demonstrates the use of the JavaScript Object (JSO) to read and write AcroForm fields.

We start the same way as in my old VBA sample to create a VBA program that references the Acrobat TLB and to add a button to a document. When we now use the following script as the button handler, we can work with form fields:

Private Sub CommandButton1_Click()
    Dim AcroApp As Acrobat.CAcroApp
    Dim theForm As Acrobat.CAcroPDDoc
    Dim jso As Object
    Dim text1, text2 As String

    Set AcroApp = CreateObject("AcroExch.App")
    Set theForm = CreateObject("AcroExch.PDDoc")
    theForm.Open ("C:\temp\sampleForm.pdf")
    Set jso = theForm.GetJSObject

    ' get the information from the form fields Text1 and Text2
    text1 = jso.getField("Text1").Value
    text2 = jso.getField("Text2").Value

    MsgBox "Values read from PDF: " & text1 & " " & text2

    ' set a text field
    Dim field2 As Object
    Set field2 = jso.getField("Text2")

    field2.Value = 13   ' assign the number 13 to the fields value

    ' get the information from the form fields Text1 and Text2
    text1 = jso.getField("Text1").Value
    text2 = jso.getField("Text2").Value

    MsgBox "Values read from PDF: " & text1 & " " & text2

    theForm.Close

    AcroApp.Exit
    Set AcroApp = Nothing
    Set theForm = Nothing

    MsgBox "Done"
End Sub

This program requires a PDF file with text fields called “Text1″ and “Text2″ to be stored as C:\temp\sampleForm.pdf. With the explanation in the previous two blog posts, it should not be hard to understand what’s going on here. The only new command introduced is the getField() function, which returns a form field. The form field object has a property “value” which contains the actual value that’s assigned to the field. Give it a try and let me know how it works for you. The updated form field is not saved (because the document does not get saved) – I’ll leave that up to the reader to figure out.

Also, this program will not work with XFA forms (the ones you create in Designer). For those, you need to use the XFA DOM to access the form data. For anybody interested in XFA forms, the LifeCycle Designer ES Scripting Reference is a must read.

This entry was posted in Acrobat, JavaScript, PDF, Programming and tagged , , , , . Bookmark the permalink.

14 Responses to Reading PDF Form Fields with VBA

  1. Wayne says:

    Hi there,

    This is exactly what I am looking for. Thanks so much.
    However, my assignments are a bit different. I am wondering if you can give me a hand.

    The data source:
    1. An excel spreadsheet storing the raw data
    2. A PDF file with an interactive form used to store the data input by the user according to the above excel spreadsheet

    My assignments:
    1. input the raw data from excel spreadsheet to the PDF interactive form
    2. double check if the data input in the PDF interactive form is correct.

    I am not allowed to convert the Excel spreadsheet to the PDF file directly as the PDF file is the template with precise paragraphing and wording embedded. It is a heavy job when there are over hundreds of number. I am thinking if the excel VBA can do both assignments automatically or at least double check my input.

    Thanks.
    Wayne

  2. Wayne says:

    Hi khk,
    I have adobe 9.1 professional and excel 2007, adobe TLB added.

    F8 stepinto F8 found “Runtime error ‘424’, object required”
    debug at this code ” text1 = jso.getField(“Text1″).Value”

    don’t know why?

    Thanks
    Wayne

  3. Wayne says:

    oh, “t” not “T”, i got it.
    btw, we can use the call function to save the updates.
    call theForm.Save(PDSaveFull,”C:\temp\sampleForm.pdf).

  4. Matt Conklin says:

    How do you test to make sure that the field exists in the form. I don’t want an error message when the user selects the wrong .PDF form to import into the spreadsheet.

    “On Error goto 0″ does not trap the error.

    Is there a method such as FieldExists(“text1″) that I could use to trap that error before using the getField() method.

  5. admin says:

    I don’t know about VB (remember, I said that I am not working in VB). You can do test if a field exists in JavaScript by testing the value returned by getField():

    var f = this.getField("test");
    if (f == null)
    {
    app.alert("f does not exist");
    }

  6. Daryl says:

    In Excel 2010 VBA, I am trying to create a series of PDF forms each individually filled out but then combine them into a single PDF. I’ve tried a number of things but I keep getting stuck on the single PDF creation. Here’s what I currently have. The individual forms get created and saved with the correct data. However when I try to build the single PDF, it has the correct number of forms but the data is all filled in identically. This is the same problem that happens if you try to combine like forms into a single PDF in Adobe.

    In Adobe Acrobat you can avoid that problem by “flattening” the individual PDF files so that the form field names are removed and the data in the form cannot be edited. After that is completed, you can combine the individual PDF files and the field data is not overwritten. Is there a way to mimic that in Excel VBA? I actually do not even need to save the individual forms — only the single “combined” PDF. I only did that in this code because I thought it might help (it didn’t).

    I’d certainly appreciate any help or insight on this.

    Sub Process_Manual_Input()

    bProcessSuccessful = False

    sDefName = “ABC Fund – Forms 62 (xx.xx.xx).pdf”
    vFileName = Application.GetSaveAsFilename(sDefName, “PDF Files (*.pdf), *.pdf”)

    If vFileName = False Then
    End
    Else
    sPath = Left(vFileName, Len(vFileName) – 4)
    sFile = Right(vFileName, Len(vFileName) – InStrRev(vFileName, “\”))
    MkDir sPath
    End If

    Set oAcroApp = CreateObject(“AcroExch.App”)
    Set oEmbeddedPDF = CreateObject(“AcroExch.PDDoc”)
    Set oManipPDF = CreateObject(“AcroExch.PDDoc”)
    Set oTempDoc = CreateObject(“AcroExch.PDDoc”)
    Set oFullPDF = CreateObject(“AcroExch.PDDoc”)

    ‘ Open Form 62 template

    szO = oEmbeddedPDF.Open(wsRequest.Range(“File62″).Value)
    ‘ Set oManipAV = oEmbeddedPDF.OpenAVDoc(“Form 62″)

    With wsManual
    lRow = .Range(“HeadingRow”).Row + 1
    lEndRow = .UsedRange.Rows.Count
    lNumPages = 0
    lNumFullPages = 0

    Do Until lRow > lEndRow

    szO = oManipPDF.Create
    szO = oManipPDF.InsertPages(-1, oEmbeddedPDF, 0, 3, 1)

    Set oManipAV = oManipPDF.OpenAVDoc(“Form 62″)

    Set oAcroForm = CreateObject(“AFormAut.App”)
    Set oAcroFields = oAcroForm.Fields

    oAcroFields.Item(“sFundName62″).Value = sFName
    oAcroFields.Item(“sStreet62″).Value = sStreet
    oAcroFields.Item(“sCityStateZip62″).Value = sCitStZip
    oAcroFields.Item(“sEIN62″).Value = sEIN

    For Each oAcroField In oAcroFields
    oAcroField.IsReadOnly = True
    Next

    sSecName = Trim(Replace(.Cells(lRow, mSecName), “/”, “-“))

    s62FileName = sPath & “\(” & sSecName & _
    ” – ” & Trim(CStr(.Cells(lRow, mSecID))) & “) ” & sFile

    szO = oManipPDF.Save(1, s62FileName)

    oManipPDF.Close

    oTempDoc.Open (s62FileName)

    lNumPages = oTempDoc.GetNumPages()

    ‘ szO = oManipPDF.Save(1, sPath & “\(” & sSecName & _
    ‘ ” – ” & Trim(CStr(.Cells(lRow, mSecID))) & “) ” & sFile)

    If lNumFullPages > 0 Then
    szF = oFullPDF.InsertPages(lNumFullPages – 1, oTempDoc, 0, 3, 1)
    lNumFullPages = oFullPDF.GetNumPages()
    ‘ lNumFullPages = lNumFullPages + lNumPages
    ElseIf lNumFullPages = 0 Then
    Set oFullAV = oFullPDF.OpenAVDoc(“Full 62″)
    szF = oFullPDF.Create
    szF = oFullPDF.InsertPages(-1, oTempDoc, 0, 3, 1)
    lNumFullPages = oFullPDF.GetNumPages()
    ‘ lNumFullPages = lNumPages
    Else
    MsgBox “Error Writing to Full PDF”
    Exit Sub
    End If

    oTempDoc.Close

    lRow = lRow + 1
    Loop

    If oFullPDF.Save(1, sPath & “\” & sFName & ” 62s.pdf”) = False Then
    MsgBox “Save Failed”
    End If

    oFullPDF.Close
    bProcessSuccessful = True

    End With

    Exit Sub

    ER_Handler:
    MsgBox Error$
    On Error Resume Next
    ‘Resume
    oAcroApp.CloseAllDocs
    oAcroApp.Exit
    Set oAcroApp = Nothing
    Set oManipPDF = Nothing
    Set oEmbeddedPDF = Nothing
    Set oManipAV = Nothing
    wbManual.Close False
    Set wsManual = Nothing

    Application.ScreenUpdating = True

    End Sub

  7. Karl Heinz Kremer says:

    Yes, you can flatten with VBA – but you need to use the JSObject interface and then call jso.flattenPages(). Take a look at this post, which explains how the JSO gets used: http://khkonsulting.com/2009/03/acrobat-javascript-and-vb-walk-into-a-bar/

    Here is information about the flattenPages method: http://livedocs.adobe.com/acrobat_sdk/11/Acrobat11_HTMLHelp/JS_API_AcroJS.89.472.html

  8. Charles says:

    Does anybody know how to read the document restricions summary by excel vba?

    I get an error when certain document restictions are set, but I cannot figure out if they are set.

    For example, I am looking for how to read:
    “Printing” is “allowed”
    “Changing the document” is “allowed”
    and Document assembly, Content copying, content copying for accessibility, page extraction, commenting, filling the form fields, signing, creation of template pages

  9. jaime says:

    Using an excel macro, I need code to find text strings in the pdf and hilite the text or add comments/annotations to multiple strings that are listed in column A.

  10. Karl Heinz Kremer says:

    This is impossible to do in VBA, you would need to use the JSObject to do most of the work in JavaScript, and even there, it would be a very complex task.

  11. Pingback: Read Pdf fields

  12. Kelsey says:

    Hi, I am using VBA to add a combobox to a pdf document. Do you know of a way to add the properties to the combobox? I would appreciate any help!

    Dim App As CAcroApp
    Dim PDDoc As CAcroPDDoc
    Dim jso As Object
    Dim i As Long
    Dim FileName As String
    Dim field As Object
    Dim rect(3) As Integer

    Set field = jso.addField(“Performance”, “combobox”, 0, rect)

    Set jso = PDDoc.GetJSObject

    rect(0) = 182 ‘ x lower left
    rect(1) = 762 ‘ y lower left
    rect(2) = 297 ‘ x upper right
    rect(3) = 742 ‘ y upper right

    i = PDDoc.Save(PDSaveIncremental, FileName)
    End If
    PDDoc.Close

  13. Karl Heinz Kremer says:

    Kelsey,

    there are a number of problems with your code. You are using rect and so before they are defined. There is an “End If” without an “if”, and there are things missing at the beginning to open a PDDoc.

    Try something like this:

    Dim App As Acrobat.CAcroApp
    Dim PDDoc As Acrobat.CAcroPDDoc
    Dim jso As Object
    Dim i As Long
    Dim FileName As String
    Dim field As Object
    Dim rect(3) As Integer
    Dim items(3) As String

    App = CreateObject("AcroExch.App")
    PDDoc = CreateObject("AcroExch.PDDoc")
    PDDoc.Open("C:\temp\test.pdf")

    jso = PDDoc.GetJSObject

    rect(0) = 182 ' x lower left
    rect(1) = 762 ' y lower left
    rect(2) = 297 ' x upper right
    rect(3) = 742 ' y upper right

    field = jso.addField("Performance", "combobox", 0, rect)

    field.strokeColor = jso.color.blue
    field.fillCOlor = jso.color.yellow

    items(0) = "One"
    items(1) = "Two"
    items(2) = "Three"
    field.setItems(items)

    i = PDDoc.Save(1, "C:\temp\test-out.pdf")
    PDDoc.Close()

  14. Kelsey says:

    Thank you for your help!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>