Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Using iTextSharp, how can I merge multiple PDFs into one PDF without losing the Form Fields and their properties in each individual PDF?

(I would prefer an example using streams from a database but file system is ok as well)

I found this code that works but it flattens out my PDFs so I can't use it.

UPDATE

@Mark Storer - This is the code I am using now based on your feedback (see below) but it gives me a corrupt document after the save. I tested each of the code parts separately and it seems to be failing in the MergePdfForms function shown below. I obviously don't want to use the renameFields part of your example because I need the field names to remain "as is".

Public Sub MergePdfForms(ByVal pdfFiles As ArrayList, ByVal outputPath As String)
    Dim ms As New IO.MemoryStream()
    Dim copier As New PdfCopyFields(ms)
    For Each pfile As String In pdfFiles
        Dim reader As New PdfReader(pfile)
        copier.AddDocument(reader)
    Next
    SaveMemoryStream(ms, outputPath)
    copier.Close()
End Sub

Public Sub SaveMemoryStream(ms As IO.MemoryStream, FileName As String)
    Dim outStream As IO.FileStream = IO.File.OpenWrite(FileName)
    ms.WriteTo(outStream)
    outStream.Flush()
    outStream.Close()
End Sub


1 Answer

Fields in PDFs have an Unusual Property: All fields with the same name are the same field. They share a value. This is handy when the form refers to the same person and you have a nice naming scheme across forms. It's Not Handy when you want to put 20 instances of a single form into a single PDF.
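
To make the "same name, same field" rule concrete, here is a toy model in plain Java. It is not iText API: a form is just a map from field name to value, and SharedFieldDemo and mergeByName are made-up names for illustration. Which value actually wins in a real PDF is not modeled here; the point is only that two same-named fields collapse into one.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: a "form" is a map from field name to value. Not iText API. */
final class SharedFieldDemo {

    /** Merges forms by field name, so same-named fields end up as a single entry. */
    static Map<String, String> mergeByName(Iterable<Map<String, String>> forms) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> form : forms) {
            // A same-named field from a later form replaces the earlier entry:
            // only one field (and one value) survives per name.
            merged.putAll(form);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("applicant.name", "Alice");
        Map<String, String> second = new LinkedHashMap<>();
        second.put("applicant.name", "Bob");

        // Two copies of the same form went in, but only one "applicant.name" comes out.
        System.out.println(mergeByName(Arrays.asList(first, second)));
    }
}
```

That is exactly what bites you when 20 copies of one form go into a single PDF.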

This makes merging multiple forms challenging, to say the least. The most common option (thanks to iText) is to flatten the forms prior to merging them, at which point you're no longer merging forms, and the problem Goes Away.

The other option is to rename your fields prior to merging them. This can make data extraction difficult later, can break scripts, and is generally a PITA. That's why flattening is so much more popular.

There's a class in iText called PdfCopyFields, and it will correctly copy fields from one document to another... it will also merge fields with the same name correctly, such that they really share a single value and Acrobat/Reader doesn't have to do a bunch of extra work on the file to get it that way before displaying it to a user.

However, PdfCopyFields will not rename fields for you. To do that, you need to get the AcroFields object from the PdfReader in question, and call renameField(String, String) on Each And Every Field prior to merging the documents with PdfCopyFields.

All this is for "AcroForm"-based PDF forms. If you're dealing with XFA forms (forms from LiveCycle Designer), all bets are off. You have to muck with the XML, A Lot.

And heaven help you if you have to combine forms from both.

So ass-u-me-ing that you're working with AcroForm fields, the code might look something like this (forgive my Java):

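// iText 5 package names; iText 2.x uses com.lowagie.text.* instead.
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfCopyFields;
import com.itextpdf.text.pdf.PdfReader;

public void mergeForms(String outpath, String[] inPaths) throws IOException, DocumentException {
  PdfCopyFields copier = new PdfCopyFields(new FileOutputStream(outpath));
  for (String curInPath : inPaths) {
    PdfReader reader = new PdfReader(curInPath);
    // Give this document's fields a unique prefix so they don't merge with same-named fields.
    renameFields(reader.getAcroFields());

    copier.addDocument(reader);
  }
  // The merged PDF is only complete once the copier is closed.
  copier.close();
}

private static int counter = 0;
private void renameFields(AcroFields fields) {
  // Copy the names first: renaming modifies the underlying map while we iterate over it.
  List<String> fieldNames = new ArrayList<String>(fields.getFields().keySet());
  String prepend = String.format("_%d.", counter++);

  for (String fieldName : fieldNames) {
    // Caveat (as far as I know): renameField only changes the last part of a dotted name and
    // returns false otherwise, so a prefix containing '.' may be rejected. Check the result.
    fields.renameField(fieldName, prepend + fieldName);
  }
}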

Ideally, renameFields would also create a generic field object named with prepend's value (minus the trailing dot) and make all the other fields in the document its children. This would make Acrobat/Reader's life easier and avoid an apparently unnecessary "save changes?" request when closing the resulting PDF from Acrobat.
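
For what that parent/child arrangement buys you: in an AcroForm, a field's fully qualified name is its own partial name plus those of its ancestors, joined with periods. A rough sketch in plain Java (FieldNode is a made-up class for illustration, not an iText type):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative stand-in for a form field with an optional parent. Not iText API. */
final class FieldNode {
    final String partialName; // this field's own name, without any dots
    final FieldNode parent;   // null for a top-level field

    FieldNode(String partialName, FieldNode parent) {
        this.partialName = partialName;
        this.parent = parent;
    }

    /** Walks up the parent chain and joins the partial names with '.'. */
    String fullyQualifiedName() {
        Deque<String> parts = new ArrayDeque<>();
        for (FieldNode node = this; node != null; node = node.parent) {
            parts.addFirst(node.partialName);
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        FieldNode perDocumentParent = new FieldNode("_0", null);
        FieldNode name = new FieldNode("name", perDocumentParent);
        System.out.println(name.fullyQualifiedName());
    }
}
```

So one parent per source document gives every child a distinct full name without touching the child's own partial name.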

Yes, that's why Acrobat will sometimes ask you to save changes when You Didn't Do Anything! Acrobat did something behind the scenes.
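
On the UPDATE in the question: as far as I know, PdfCopyFields only writes the finished document when it is closed, so calling SaveMemoryStream before copier.Close() saves an incomplete file, which would explain the corruption. Close the copier first. In iTextSharp, Close() also closes the MemoryStream by default, so grab the bytes afterwards with ms.ToArray(), which still works on a closed MemoryStream. The same ordering problem shows up with any writer that finalizes on close. Below is a stdlib-only Java analog using ZipOutputStream (not iText; CloseBeforeSaveDemo is a made-up name):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Analog only: shows why the buffer must be copied after the writer is closed. */
final class CloseBeforeSaveDemo {

    /** Returns { bytes captured before close, bytes captured after close }. */
    static byte[][] snapshots() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            ZipOutputStream zip = new ZipOutputStream(buffer);
            zip.putNextEntry(new ZipEntry("form.txt"));
            zip.write("field data".getBytes(StandardCharsets.UTF_8));

            // Like saving the MemoryStream before copier.Close(): the archive is unfinished.
            byte[] tooEarly = buffer.toByteArray();

            // Closing finishes the entry and writes the trailing directory structures.
            zip.close();

            // The in-memory buffer still holds its bytes after close, so this copy is complete.
            byte[] complete = buffer.toByteArray();

            return new byte[][] { tooEarly, complete };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[][] result = snapshots();
        System.out.println("before close: " + result[0].length + " bytes");
        System.out.println("after close:  " + result[1].length + " bytes");
    }
}
```

The early snapshot is always shorter than the finished one, and a file written from it is missing whatever the writer emits on close.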

