• Tired of filling in a PDF report every day?
• Or tired of collecting information from a PDF every day?

Let’s ease your pain! This post shows how to fill in, and retrieve data from, a PDF file that has form fields.

The article is divided into 4 sections:

    • Preparing our Application
    • Extracting the PDF Form Field names
    • Filling PDF Form Fields
    • Retrieving data from PDF Form Fields

 


Preparing our Application


Create a new application and add Buttons, TextBoxes and 1 OpenFileDialog control as shown below.

Create an Excel file. Add data to it as shown below and save it as C:\PDF_FORM_DATA.xlsx.

At the end of the post, I have included all the files, among them the PDF file which I created specifically for this demonstration. Save that file as C:\Sample.Pdf. The PDF file looks like this:

Next, add a reference to the Excel Object Library. See this link for more information. We will also add a reference to itextsharp.dll.

iText is a PDF library that lets you create, adapt, inspect and maintain documents in the Portable Document Format (PDF). itextsharp.dll, its .NET port, is freely available on the web. Check the license of the version you use before shipping a commercial application (iTextSharp 5.x, for example, is distributed under the AGPL). I have included this DLL in the downloadable project below. Download the file and save it at a location of your choice. Then click the menu Project | Add Reference, go to the Browse tab, select the DLL and click OK.

Paste this code at the top of the form's code file.

Imports iTextSharp
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.xml
Imports System.IO
Imports Excel = Microsoft.Office.Interop.Excel

Public Class Form1
   
End Class

Double click on the Browse button which we will use to select the PDF template file and paste the following code:

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        With OpenFileDialog1
            .DefaultExt = ".PDF"
            .DereferenceLinks = True
            .Filter = "PDF files (*.PDF)|*.PDF"
            .Multiselect = False
            .Title = "Select a PDF file to open"
            .ValidateNames = True

            If .ShowDialog = Windows.Forms.DialogResult.OK Then
                '~~> Show the selected path in the textbox
                TextBox1.Text = .FileName
            End If
        End With
    End Sub

Double click on the Browse button which we will use to select the Excel file and paste this code:

    Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
        With OpenFileDialog1
            .DefaultExt = ".XLSX"
            .DereferenceLinks = True
            .Filter = "Excel files (*.XLSX)|*.XLSX"
            .Multiselect = False
            .Title = "Select an Excel file to open"
            .ValidateNames = True

            If .ShowDialog = Windows.Forms.DialogResult.OK Then
                '~~> Show the selected path in the textbox
                TextBox2.Text = .FileName
            End If
        End With
    End Sub

Double click on the last Browse button which we will use to select the pdf file to read data from and paste this code:

    Private Sub Button5_Click(sender As Object, e As EventArgs) Handles Button5.Click
        With OpenFileDialog1
            .DefaultExt = ".PDF"
            .DereferenceLinks = True
            .Filter = "PDF files (*.PDF)|*.PDF"
            .Multiselect = False
            .Title = "Select a PDF file to open"
            .ValidateNames = True

            If .ShowDialog = Windows.Forms.DialogResult.OK Then
                '~~> Show the selected path in the textbox
                TextBox7.Text = .FileName
            End If
        End With
    End Sub

 


Extracting the PDF Form Field names


Before we can write to a PDF form field, we need to know the names of the form fields.

Double click on the Get PDF Fields button and paste this code:

    Private Sub Button3_Click(sender As Object, e As EventArgs) Handles Button3.Click
        Dim pdfTemplate As String = TextBox1.Text
        Dim readerPDF As New PdfReader(pdfTemplate)

        '~~> Start with an empty list so repeated clicks do not append duplicates
        TextBox3.Clear()

        Try
            '~~> Each key of AcroFields.Fields is a form field name
            For Each fieldName As String In readerPDF.AcroFields.Fields.Keys
                If TextBox3.Text = "" Then
                    TextBox3.Text = fieldName
                Else
                    TextBox3.Text = TextBox3.Text & Environment.NewLine & fieldName
                End If
            Next
        Finally
            '~~> Release the file handle held by the reader
            readerPDF.Close()
        End Try

        TextBox3.SelectionStart = 0
    End Sub

Run the application, select the PDF template and click on the Get PDF Fields button. The above code reads the PDF file and lists the form field names in the textbox below that button. We will use these names to write to the fields.
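Knowing a field's type as well as its name can help when deciding what value to write to it. The following is a minimal sketch, assuming iTextSharp 5.x, where AcroFields exposes GetFieldType and the FIELD_TYPE_* constants. The module PdfFieldInspector and the function DescribeFieldType are hypothetical names, not part of the sample project.

```vbnet
Imports iTextSharp.text.pdf

Module PdfFieldInspector
    '~~> Hypothetical helper, not part of the original project.
    '~~> Maps the numeric field type reported by iTextSharp to a readable label.
    Public Function DescribeFieldType(fields As AcroFields, fieldName As String) As String
        Select Case fields.GetFieldType(fieldName)
            Case AcroFields.FIELD_TYPE_TEXT
                Return "Text"
            Case AcroFields.FIELD_TYPE_CHECKBOX
                Return "Check box"
            Case AcroFields.FIELD_TYPE_RADIOBUTTON
                Return "Radio button"
            Case AcroFields.FIELD_TYPE_COMBO
                Return "Combo box"
            Case AcroFields.FIELD_TYPE_LIST
                Return "List"
            Case Else
                Return "Other"
        End Select
    End Function
End Module
```

Calling DescribeFieldType(readerPDF.AcroFields, "Applicant") while the reader is still open would return the label for the Applicant field of the sample form.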


Filling PDF Form Fields


Double click on the Update PDF Fields button and paste this code:

    Private Sub Button4_Click(sender As Object, e As EventArgs) Handles Button4.Click
        Dim pdfTemplate As String = TextBox1.Text

        '~~> Define your Excel Objects
        Dim xlApp As New Excel.Application
        Dim xlWorkBook As Excel.Workbook
        Dim xlWorkSheet As Excel.Worksheet

        '~~> Show/Hide Excel
        xlApp.Visible = True

        '~~> Open an existing Workbook
        xlWorkBook = xlApp.Workbooks.Open(TextBox2.Text)

        '~~> Set the relevant sheet that we want to work with
        xlWorkSheet = xlWorkBook.Sheets("Sheet1")

        With xlWorkSheet
            Dim lastrow As Integer = .Range("A" & .Rows.Count).End(Excel.XlDirection.xlUp).Row

            For i As Integer = 2 To lastrow
                Dim name As String = .Range("A" & i).Value & " " &
                                     .Range("B" & i).Value & " " &
                                     .Range("C" & i).Value

                '~~> Change the Output FileName here
                Dim PDFUpdatedFile As String = "C:\" & name & ".pdf"

                Dim readerPDF As New PdfReader(pdfTemplate)
                Dim stamperPDF As New PdfStamper(readerPDF,
                New FileStream(PDFUpdatedFile, FileMode.Create))

                Dim pdfFormFields As AcroFields = stamperPDF.AcroFields

                Dim interviewDate As String = .Range("D" & i).Text
                Dim interviewTime As String = .Range("E" & i).Text

                '~~> Update pdf FormFields
                pdfFormFields.SetField(TextBox4.Text, name)
                pdfFormFields.SetField(TextBox5.Text, interviewDate)
                pdfFormFields.SetField(TextBox6.Text, interviewTime)

                '~~> To remove editing options from the output form, set it to True
                '~~> To leave the fields editable in the output form, set it to False
                stamperPDF.FormFlattening = True

                '~~> Close the stamper (this writes the file), then the reader
                stamperPDF.Close()
                readerPDF.Close()
            Next
        End With

        '~~> Close the Excel file without saving
        xlWorkBook.Close(False)
        '~~> Quit the Excel Application
        xlApp.Quit()

        '~~> Clean up: release child objects before the application object
        releaseObject(xlWorkSheet)
        releaseObject(xlWorkBook)
        releaseObject(xlApp)
    End Sub

Also paste this code for clean up

 
    '~~> Release the objects
    Private Sub releaseObject(ByVal obj As Object)
        Try
            System.Runtime.InteropServices.Marshal.ReleaseComObject(obj)
            obj = Nothing
        Catch ex As Exception
            obj = Nothing
        Finally
            GC.Collect()
        End Try
    End Sub

Run the application and click on the Get PDF Fields button. Now we know the names of the form fields. Let’s use them to fill the PDF. Copy those names into the text boxes on the right.

Click on the Update PDF Fields button. The code loops through the records in the Excel file and generates one PDF file per record. In our case we get 2 PDF files, as there are only 2 records in the Excel file. If you open one of the PDF files, you will see that the data has been filled in.
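The output file name is built directly from the Excel cells, so a name containing a character such as / or : would make the FileStream call fail. A minimal sketch of one way to guard against that is shown here. The module FileNameHelper and the functions MakeSafeFileName and BuildOutputPath are hypothetical helpers, not part of the sample project.

```vbnet
Imports System.IO
Imports System.Linq

Module FileNameHelper
    '~~> Hypothetical helper, not part of the original project.
    '~~> Replaces every character that is not allowed in a file name with "_".
    Public Function MakeSafeFileName(rawName As String) As String
        Dim invalidChars As Char() = Path.GetInvalidFileNameChars()
        Dim cleaned As Char() = rawName.Trim().
            Select(Function(c) If(invalidChars.Contains(c), "_"c, c)).
            ToArray()
        Return New String(cleaned)
    End Function

    '~~> Builds the full output path inside a folder of your choice.
    Public Function BuildOutputPath(folder As String, applicantName As String) As String
        Return Path.Combine(folder, MakeSafeFileName(applicantName) & ".pdf")
    End Function
End Module
```

In the update code, the output path could then be built with BuildOutputPath("C:\", name). Passing a folder you have write access to may also be safer than the root of C:, which often requires elevated rights.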


Retrieving data from PDF Form Fields


Open Sample.Pdf, which has the form fields, and enter some data in them. Then save it as C:\Sample With Data.Pdf. I entered some data and it looks like this.

Double click on the Get Data button and paste this code:

 
    Private Sub Button6_Click(sender As Object, e As EventArgs) Handles Button6.Click
        Dim pdfTemplate As String = TextBox7.Text
        Dim readerPDF As New PdfReader(pdfTemplate)
        Dim pdfFormFields As AcroFields = readerPDF.AcroFields

        Dim name As String = pdfFormFields.GetField("Applicant")
        Dim dateOfInterview As String = pdfFormFields.GetField("Date")
        Dim timeOfInterview As String = pdfFormFields.GetField("Time")

        '~~> The values are now in memory, so the reader can be closed
        readerPDF.Close()

        MessageBox.Show("Applicant : " & name & vbNewLine &
                        "Date of Interview : " & dateOfInterview & vbNewLine &
                        "Time of Interview : " & timeOfInterview)
    End Sub

Click on the Browse button and select C:\Sample With Data.Pdf. Click on the Get Data button and you will get a pop-up with the relevant data.
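The Get Data code reads three fields whose names are hard-coded. For a form whose field names are not known in advance, the name listing and GetField can be combined. This is a minimal sketch, assuming the same iTextSharp calls used earlier in this post. The module PdfFormReader and the function ReadAllFields are hypothetical names.

```vbnet
Imports System.Collections.Generic
Imports iTextSharp.text.pdf

Module PdfFormReader
    '~~> Hypothetical helper, not part of the original project.
    '~~> Returns every form field name together with its current value.
    Public Function ReadAllFields(pdfPath As String) As Dictionary(Of String, String)
        Dim values As New Dictionary(Of String, String)
        Dim readerPDF As New PdfReader(pdfPath)

        Try
            Dim pdfFormFields As AcroFields = readerPDF.AcroFields

            '~~> Look up the value of each field by its name
            For Each fieldName As String In pdfFormFields.Fields.Keys
                values(fieldName) = pdfFormFields.GetField(fieldName)
            Next
        Finally
            '~~> Release the file handle even if reading fails
            readerPDF.Close()
        End Try

        Return values
    End Function
End Module
```

Calling ReadAllFields(TextBox7.Text) on C:\Sample With Data.Pdf should give a dictionary with the keys Applicant, Date and Time, which can then be shown in a message box or written back to Excel.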


You may download the sample files from this link.