Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


I created a sample PDF form with one image field. I'm trying to set an image to the field using PDFBox.

I see that PDFBox treats such field as an instance of PDPushButton but I don't see this class' interface exposes methods to deal with images...

The sample PDF can be downloaded using the URL in comment.

How can it be done?


EDIT:

Here is what I'm doing so far:

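// "form.pdf" is a placeholder path for the sample form (PDFBox 2.x loading API).
PDDocument pdfDocument = PDDocument.load(new File("form.pdf"));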
PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null) {
    PDPushButton field = (PDPushButton) acroForm.getField("test");

    PDImageXObject pdImageXObject = PDImageXObject.createFromFile("my_img.png", pdfDocument);

    List<PDAnnotationWidget> widgets = field.getWidgets();

    /*
     * The field may appear multiple times in the document, I would like to repeat that for every widget (occurrence).
     */
    for(PDAnnotationWidget widget : widgets) {
        PDRectangle rectangle = widget.getRectangle();

        //PDAppearanceDictionary appearanceDict = widget.getAppearance();
        /*
         * In my case, when the image is not set with Acrobat DC, appearanceDict is null.
         */

        /*
         * Create the appearance stream and fill it with the image.
         */
        PDAppearanceStream pdAppearanceStream = new PDAppearanceStream(pdfDocument);
        pdAppearanceStream.setResources(new PDResources());
        try (PDPageContentStream pdPageContentStream = new PDPageContentStream(pdfDocument, pdAppearanceStream)) {
            pdPageContentStream.drawImage(pdImageXObject, rectangle.getLowerLeftX(), rectangle.getLowerLeftY(), pdImageXObject.getWidth(), pdImageXObject.getHeight());
        }
        pdAppearanceStream.setBBox(new PDRectangle(rectangle.getWidth(), rectangle.getHeight()));

        /*
         * Create the appearance dict with only one appearance (default) and set the appearance to the widget.
         */
        PDAppearanceDictionary appearanceDict = new PDAppearanceDictionary();
        appearanceDict.setNormalAppearance(pdAppearanceStream);
        widget.setAppearance(appearanceDict);
    }
}

ByteArrayOutputStream outStr = new ByteArrayOutputStream();
pdfDocument.save(outStr);
pdfDocument.close();

However, the generated PDF doesn't show any image with Acrobat Reader.

My goal is to start with this PDF and use PDFBox to get this PDF.


1 Answer

To start with, the PDF standard ISO 32000-2 does not specify "image fields" at all. Certain proprietary PDF generators and editors (Adobe products in particular) use JavaScript to make push button fields behave like image form fields in a GUI, in particular in their own PDF viewers. Nonetheless, those push buttons are push buttons, not image form fields. Thus, it is to be expected that, as you observed,

"PDFBox treats such field as an instance of PDPushButton but I don't see this class' interface exposes methods to deal with images..."

If you really need something like an image field, though, there is no need to look for a different way to emulate one; you can simply follow Adobe's lead. Just be aware that you are only emulating an image field.

To fill the push button field with an image, one can use the AcroFormPopulator Renat Gatin presents in his answer to his own question "How to insert image programmatically in to AcroForm field using java PDFBox?".

Beware, though: applied to your sample file, it reveals a bug in the PDFBox form flattening code. Thus, you should deactivate form flattening in the AcroFormPopulator, i.e. remove the acroForm.flatten() call from it.

The bug in question is due to a missing transformation: if an XObject is used as the appearance of a form field, the specification requires that everything in its bounding box is automatically mapped into the annotation rectangle:

1. The appearance’s bounding box (specified by its BBox entry) shall be transformed, using Matrix, to produce a quadrilateral with arbitrary orientation. The transformed appearance box is the smallest upright rectangle that encompasses this quadrilateral.

2. A matrix A shall be computed that scales and translates the transformed appearance box to align with the edges of the annotation’s rectangle (specified by the Rect entry). A maps the lower-left corner (the corner with the smallest x and y coordinates) and the upper-right corner (the corner with the greatest x and y coordinates) of the transformed appearance box to the corresponding corners of the annotation’s rectangle.

3. Matrix shall be concatenated with A to form a matrix AA that maps from the appearance’s coordinate system to the annotation’s rectangle in default user space

(ISO 32000-2, section 12.5.5 Appearance streams, Algorithm: appearance streams)
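To make the three quoted steps concrete, here is a minimal sketch in plain Java using java.awt.geom.AffineTransform. It is not PDFBox code; the class name AppearanceMatrixSketch, the method computeAA and the numbers in main are made up for illustration, and the sketch assumes the transformed appearance box has non-zero width and height.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

// Illustrative only: names and values are hypothetical, not taken from PDFBox.
final class AppearanceMatrixSketch {

    /**
     * Computes the matrix AA that maps the appearance coordinate system
     * to the annotation rectangle, following the three quoted steps.
     */
    static AffineTransform computeAA(Rectangle2D bbox, AffineTransform matrix, Rectangle2D rect) {
        // Step 1: transform BBox by Matrix and take the smallest upright
        // rectangle that encloses the resulting quadrilateral.
        Rectangle2D transformed = matrix.createTransformedShape(bbox).getBounds2D();

        // Step 2: A scales and translates the transformed box onto Rect.
        // Assumes the transformed box is not degenerate (non-zero size).
        double sx = rect.getWidth() / transformed.getWidth();
        double sy = rect.getHeight() / transformed.getHeight();
        AffineTransform a = new AffineTransform();
        a.translate(rect.getMinX(), rect.getMinY());
        a.scale(sx, sy);
        a.translate(-transformed.getMinX(), -transformed.getMinY());

        // Step 3: AA applies Matrix first and A afterwards.
        AffineTransform aa = new AffineTransform(a);
        aa.concatenate(matrix);
        return aa;
    }

    public static void main(String[] args) {
        // Hypothetical values: a bounding box that does NOT start at the origin.
        Rectangle2D bbox = new Rectangle2D.Double(200, 300, 100, 50);
        Rectangle2D rect = new Rectangle2D.Double(20, 40, 200, 100);
        AffineTransform aa = computeAA(bbox, new AffineTransform(), rect);

        Point2D lowerLeft = aa.transform(new Point2D.Double(200, 300), null);
        Point2D upperRight = aa.transform(new Point2D.Double(300, 350), null);
        System.out.println(lowerLeft + " " + upperRight);
    }
}
```

With these made-up values the lower-left corner of the bounding box, (200, 300), is mapped to the lower-left corner of the rectangle, (20, 40), and the upper-right corner, (300, 350), to (220, 140). A viewer does this implicitly for annotation appearances; a flattener has to write an equivalent matrix into the page content itself.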

When said XObject is referenced from the page content after flattening, this transformation AA is no longer determined and applied automatically by the viewer, so the form flattener has to add it explicitly.

Apparently PDFBox form flattening does not create that matrix AA; in particular, it assumes that the lower-left corner of the bounding box is always at the origin of the coordinate system.

For your example PDF this is not the case, so flattening here effectively moves the flattened image field button off-screen.

PS: The situation is even weirder than expected: PDAcroForm.flatten(List<PDField>, boolean) does try to determine whether translation or scaling of a former appearance XObject is necessary, and it adds a transformation if it considers one to be required, but

1. when checking the need for translation in PDAcroForm.resolveNeedsTranslation(PDAppearanceStream), it actually checks the form XObject resources of the appearance XObject; if and only if there is an XObject among them with a bounding box with neither anchor coordinate being 0, it is assumed that no translation is required. — This is a very weird test, a proper test would have to check the bounding box of the appearance XObject itself, not of form XObjects it contains. In the sample document the appearance XObject does not contain any form XObjects, so translation is automatically assumed to be required.

2. when adding a translation transformation, it again ignores the bounding box of the appearance XObject and translates as if the appearance XObject anchor was at the coordinate system origin. In the sample document this is completely inadequate, as the bounding box anchor is already located that far out, causing it to be relocated to twice the required distance from the origin.

3. when checking the need for scaling in PDAcroForm.resolveNeedsScaling(PDAppearanceStream), it actually checks for the presence of contained arbitrary XObjects and assumes that scaling is required if there is such a contained XObject. In the sample document there is an image XObject, so scaling is assumed to be necessary... weird.

These 3 details make no sense at all. (Well, there may have been some sample documents for which they by chance give rise to the desired result, but in general this is nonsense.)
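
The effect described in item 2 can be reproduced with a few lines of plain Java. This is only an illustration of the arithmetic, not the actual PDFBox implementation; the class FlattenTranslationSketch, its methods and the coordinates are hypothetical, chosen so that the bounding box anchor coincides with the annotation rectangle's lower-left corner.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

// Illustrative only: this mimics the described behaviour, it is not PDFBox code.
final class FlattenTranslationSketch {

    /** Translation that treats the bounding box as if it were anchored at the origin. */
    static AffineTransform naiveTranslation(Rectangle2D rect) {
        return AffineTransform.getTranslateInstance(rect.getMinX(), rect.getMinY());
    }

    /** Translation that takes the bounding box anchor into account. */
    static AffineTransform bboxAwareTranslation(Rectangle2D bbox, Rectangle2D rect) {
        return AffineTransform.getTranslateInstance(
                rect.getMinX() - bbox.getMinX(),
                rect.getMinY() - bbox.getMinY());
    }

    public static void main(String[] args) {
        // Hypothetical values: the bounding box already sits at the rectangle position.
        Rectangle2D bbox = new Rectangle2D.Double(100, 200, 80, 60);
        Rectangle2D rect = new Rectangle2D.Double(100, 200, 80, 60);
        Point2D anchor = new Point2D.Double(bbox.getMinX(), bbox.getMinY());

        System.out.println("naive:      " + naiveTranslation(rect).transform(anchor, null));
        System.out.println("bbox-aware: " + bboxAwareTranslation(bbox, rect).transform(anchor, null));
    }
}
```

With these values the naive translation puts the anchor at (200, 400), twice as far from the origin as the rectangle, while the bounding-box-aware translation leaves it at (100, 200), where the annotation rectangle actually is.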

