Table 151 Outline Item dictionary Dest entry supports dictionary value? #509

Closed
crlf0710 opened this issue Dec 26, 2024 · 6 comments
Labels: wontfix (This issue did not result in any spec changes)

@crlf0710

Describe the bug
I found that quite a few well-known implementations accept a dictionary value for the Dest entry in Table 151, provided the dictionary contains a valid D entry, similar to what named destinations support.

I wonder if this is intentional?

Provide a recommendation for correction:

Maybe also allow dictionary values for the Table 151 Dest entry?

@crlf0710 added the bug (Something isn't correct) label on Dec 26, 2024
@petervwyatt changed the title from "Table 151 Dest entry supports dictionary value?" to "Table 151 Outline Item dictionary Dest entry supports dictionary value?" on Jan 2, 2025
@petervwyatt added the question (Further information is requested) and enhancement (New feature or request) labels and removed the bug (Something isn't correct) label on Jan 2, 2025
@petervwyatt (Member)

Table 151 = Entries in an outline item dictionary.

The only mention of a dictionary anywhere in the 12.3.2 subclauses is under Named Destinations in the paragraph above NOTE 2:

> ... a dictionary with a D entry whose value is such an array. In PDF 2.0, this dictionary may also optionally contain an SD entry.

I also checked various Adobe legacy PDF reference documents and none of them make any mention of such an extension in their "Compatibility and Implementation Notes" appendix. I also checked Adobe DVA and that does NOT permit a dictionary (only name, string and array as per ISO 32000).

Please provide example PDF files, creator information, and a definition of what the dictionary in question is. Otherwise it is likely to be considered an invalid PDF and thus beyond the scope of ISO 32000; processors can then handle it via proprietary fix-up code for whatever malformations they recognise and choose to support.

@crlf0710 (Author)

crlf0710 commented Jan 9, 2025

> Please provide example PDF files, creator information, and a definition of what the dictionary in question is.

OK, disclaimer: the following description is meant to provide a single case that illustrates the motivation here; it is not meant to criticize anyone.

The example file is available at https://github.com/aspose-pdf/Aspose.PDF-for-.NET/blob/28db0323e072bbc4464159093f32ee325ba4ede5/Examples/Data/AsposePDF/Bookmarks/GetChildBookmarks.pdf

Creator information: I think it is the Aspose.PDF for .NET SDK. This particular file was created by version 8.0.0.

It has this object as its first outline entry:

78 0 obj
<</Title(\376\377\000P\000a\000r\000e\000n\000t\000 \000O\000u\000t\000l\000i\000n\000e)/Parent 80 0 R/Last 79 0 R/Dest<</S/GoTo/D[1/Fit]>>/Count -1/First 79 0 R/F 3>>
endobj

We can see that the /Dest entry here is a dictionary. If this dictionary were defined in the name tree of the names dictionary and referenced here as a named destination, it would be fine.

> The value of this entry shall be a dictionary in which each key is a destination name and the corresponding value is either an array defining the destination, using the syntax shown in "Table 149 — Destination syntax", or a dictionary with a D entry whose value is such an array and may optionally contain an SD entry as defined in "Table 201 — Action types".

But in this file the dictionary is used directly as the Dest value of the outline entry, which is inconsistent with the ISO 32000 description.

The problem is that it seems many implementations (Acrobat included) have been silently supporting this form, and I fear there has already been a large portion of PDF documents exercising this. At this point changing the spec itself might be a better choice for compatibility concerns, as it matches the status quo better.
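To make the distinction concrete, here is a minimal Python sketch that classifies a Dest value against the types mentioned in this thread for an outline item (name, byte string, array). The `Name` class and the `dest_entry_kind` function are hypothetical helpers invented for illustration; PDF objects are modelled as plain Python values and nothing here parses a real file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """Hypothetical stand-in for a PDF name object such as /Fit."""
    value: str


def dest_entry_kind(dest):
    """Classify an outline item Dest value (illustrative model only)."""
    if isinstance(dest, Name):
        return "name"
    if isinstance(dest, bytes):
        return "byte string"
    if isinstance(dest, list):
        return "array"
    if isinstance(dest, dict):
        # The form found in the example file; not one of the listed types.
        return "dictionary (not permitted for an outline item Dest)"
    return "unexpected type"


# Value of /Dest in object 78 of the example file: <</S/GoTo/D[1/Fit]>>
aspose_dest = {"S": Name("GoTo"), "D": [1, Name("Fit")]}
print(dest_entry_kind(aspose_dest))
```

A strict checker built along these lines would flag object 78, while the same dictionary stored as a named destination value would pass.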

@mkl-public

> The problem is that it seems many implementations (Acrobat included) have been silently supporting this form, and I fear there has already been a large portion of PDF documents exercising this. At this point changing the spec itself might be a better choice for compatibility concerns, as it matches the status quo better.

There are many invalid structures according to spec that most PDF viewers accept without any immediate problem. Nonetheless, I don't think one should update the spec to allow all of them.

On one hand this likely would result in a hodgepodge specification with more exceptions than rules and without any structure.

On the other hand, are you sure those many implementations (Acrobat included) really completely support that structure? Or do they only support it in some use cases and not in others?

To give an example of what I mean, the file you link to also has an error in the cross reference table. Normally neither Acrobat nor most other implementations complain about this. But if you sign this file twice (both times in incremental updates), Acrobat will start claiming the first signature is invalid, and this is caused by that cross reference table issue.

So if one starts to allow currently forbidden structures because apparently many implementations (Acrobat included) have been silently supporting them, then all those implementations would have to run many tests covering all kinds of workflows to check whether they really, completely support those structures.

Nonetheless, one can of course consider allowing certain such structures, but the reason in each case should be better than that many PDF processors apparently support it.

@petervwyatt (Member)

Adobe Acrobat Pro DC preflight "Syntax Check" fails this PDF for this specific reason. The PDF also cannot be saved as PDF/A, nor does Acrobat ask to save the file on exit, so Acrobat is not auto-repairing this PDF. It has a broken cross-reference table, as @mkl-public indicates, so this is highly contrary to what Acrobat normally does; there are thus two (and possibly more) issues at play.

This doesn't make this syntax valid PDF: such things are independent business decisions for each vendor to make if they wish to add silent malformation support, potentially at the expense of other functionality (e.g. repair on exit and save).

In the next PDF TWG meeting we can seek input from other implementers, but I would contend this is inappropriate for extending the definition of what is a valid PDF that has existed for many years. It is simply a malformed PDF and out-of-scope for ISO 32000.

@petervwyatt added the proposed solution (Proposed solution is ready for review) label and removed the question (Further information is requested) label on Jan 10, 2025
@petervwyatt self-assigned this on Jan 10, 2025
@johnwhitington

johnwhitington commented Jan 10, 2025

The fact that many PDF-reading programs support this is probably no more than an accident of pattern-matching: when reading a destination, if we find a dictionary, we treat it as a destination. If we find a string, we treat it as a key in the /Dests name tree and, upon retrieving the value, run the same destination-reading function again on that. It's simply the natural implementation.

I don't see any need for a change, especially in an era when formal PDF models and verifiers are coming into their own.
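The pattern-matching described above can be sketched roughly as follows. This is an illustrative Python model, not code from any particular PDF library: `read_destination` is a hypothetical function, the /Dests name tree is assumed to be flattened into a plain dict, PDF names are modelled as `str`, byte strings as `bytes`, and the depth limit is an arbitrary guard added for the sketch.

```python
def read_destination(obj, named_dests, depth=0):
    """Return an explicit destination array for obj, or None.

    named_dests is a flat dict standing in for the /Dests name tree.
    """
    if depth > 8:
        # Arbitrary guard against names that refer to each other in a cycle.
        return None
    if isinstance(obj, list):
        # Already an explicit destination array.
        return obj
    if isinstance(obj, dict):
        # A dictionary: read its D entry with the same function.
        return read_destination(obj.get("D"), named_dests, depth + 1)
    if isinstance(obj, (bytes, str)):
        # A string or name: look it up, then read the result the same way.
        return read_destination(named_dests.get(obj), named_dests, depth + 1)
    return None


# The out-of-spec outline Dest from the example file resolves without any
# special handling, because the dictionary branch is shared with named
# destinations.
direct = read_destination({"S": "GoTo", "D": [1, "Fit"]}, {})
via_name = read_destination(b"intro", {b"intro": {"D": [1, "Fit"]}})
print(direct, via_name)
```

Because one function handles every case, a reader written this way accepts a dictionary at the outline level as a side effect rather than by design.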

@petervwyatt (Member)

PDF TWG agree that this should not be standardised.

@petervwyatt added the wontfix (This issue did not result in any spec changes) label and removed the enhancement (New feature or request) and proposed solution (Proposed solution is ready for review) labels on Jan 17, 2025