Clarification Request: Cross-Reference Stream Behavior in ISO 32000-2:2020 #500
Comments
(Answering as I read the specification; may differ from an 'official' reading.)
Please remember that cross-reference streams are merely an alternative way to store cross-reference information, with the option to store new entry types. Thus, the requirements in the cross-reference table section that are not specific to the structure of the xref table apply to cross-reference streams as well. And there you find:
Thus, if you have free object numbers in the object number range, you need to have free object number entries, even if you use cross-reference streams. In particular, you cannot simply omit them altogether. OK, that was the strict answer...
No, there are no technical issues, but it is nonetheless a violation of the spec, which a PDF processor need not accept. While the above is true (the spec requires cross-reference entries for all object numbers in the document's object number range), some PDF producers create sparse cross references, i.e. they omit free object entries. Thus, any PDF processor that wants to handle real-world PDFs must somehow cope with such sparse tables, usually by assuming that missing entries represent free object numbers, or at least by failing gently and rejecting the document as broken.
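To make the "sparse cross reference" case concrete, here is a minimal sketch of how a processor might detect omitted entries. It assumes a single cross-reference section (no /Prev chain) whose /Index array and /Size value have already been parsed; the function names are hypothetical and not taken from any library.

```python
def covered_object_numbers(index):
    """Expand an /Index array [first1, count1, first2, count2, ...] into a set."""
    covered = set()
    for first, count in zip(index[0::2], index[1::2]):
        covered.update(range(first, first + count))
    return covered


def missing_object_numbers(index, size):
    """Return the object numbers in 0..size-1 that have no cross-reference entry."""
    return sorted(set(range(size)) - covered_object_numbers(index))


# Hypothetical sparse file: /Size 8, but entries only for objects 0-2 and 5-6.
missing = missing_object_numbers([0, 3, 5, 2], 8)
print(missing)
```

A lenient processor could treat every number in `missing` as a free object, while a strict one could reject the file. Either choice is a repair strategy rather than something the specification prescribes.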
Here I think that the prohibition of
@mkl-public Regarding Question 1: I now understand that I missed the following important point:
This clears up why Type 0 entries for free objects cannot be omitted, as they are required by the broader cross-reference table rules. For Question 2: I understand now that the second field in the W array cannot be omitted. However, I am still unclear on one point: Thank you again for your assistance and insights. I greatly appreciate your expertise!
Well, as the second field in the W array cannot be omitted, there is always a value for field 2 in type 1 entries. Thus, in valid PDFs the default never kicks in. One can of course wonder about invalid PDFs. But in that case we are essentially talking about repair strategies, which are not the topic of the PDF specification (at least not in the case of cross-reference tables and streams).
I would like to clarify the following point once more: The specification states: Based on these statements, can we conclude that the default value of 0 for Field 2 of Type 1 (as outlined in Table 18, "Entries in a cross-reference stream") is unnecessary? If so, would it not be more appropriate to remove the unnecessary default value of 0 from the specification? Or am I misunderstanding something?
That's also how I perceive this. @petervwyatt I think you can consider the removal of that unnecessary default as the proposed solution for this issue.
Proposed solution as per above: delete "Default value: 0" from the Description cell for Type=1, Field=2 in Table 18.
I am reaching out to request clarification regarding certain behaviors of cross-reference streams in the PDF specification, as detailed in ISO 32000-2, Second edition, Section 7.5.8.2. Specifically, I am investigating the W array (Table 17, page 67) in cross-reference stream dictionaries. Below are the detailed scenarios and questions for which clarification would be greatly appreciated.
1. Default Behavior for Type Field Omission
According to the specification, "If the first element [of the W array] is zero, the type field shall not be present, and shall default to Type 1." My understanding is that this behavior eliminates the need to explicitly store Type 1 objects, which represent regular, non-compressed objects.
2. Prohibition of Zero for the Second Element in the W Array
The specification states, "A value of zero shall not be used for the second element of the W array." The second field holds the byte offset for Type 1 entries and the object number of the containing object stream for Type 2 entries. This requirement ensures that every in-use entry provides a value for locating the object or its containing stream.
However, Field 2 of Type 1 (offset) includes "default value: 0," which seems contradictory to the prohibition.
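The interaction between the W array and the defaults can be illustrated with a small decoder. This is only a sketch of my reading of Tables 17 and 18, assuming the stream data is already decompressed; `decode_xref_entries` is a hypothetical name, and the sample bytes are made up for illustration.

```python
def decode_xref_entries(data, w):
    """Decode decompressed cross-reference stream data into (type, field2, field3) tuples."""
    if len(w) != 3:
        raise ValueError("W must have exactly three elements")
    if w[1] == 0:
        # The spec prohibits a zero width for the second element.
        raise ValueError("second element of W shall not be zero")
    row = sum(w)
    if len(data) % row != 0:
        raise ValueError("stream length is not a multiple of the entry size")

    entries = []
    for pos in range(0, len(data), row):
        fields = []
        offset = pos
        for width in w:
            # A zero width means the field is absent from the stream.
            chunk = data[offset:offset + width]
            fields.append(int.from_bytes(chunk, "big") if width else None)
            offset += width
        # An absent type field defaults to type 1.
        entry_type = 1 if fields[0] is None else fields[0]
        field3 = fields[2]
        if field3 is None and entry_type == 1:
            # Type 1 generation number defaults to 0 (assumption: other types have no default).
            field3 = 0
        entries.append((entry_type, fields[1], field3))
    return entries


# W = [0 2 0]: no type field and no generation field, two-byte offsets only.
compact = decode_xref_entries(bytes.fromhex("00110100"), [0, 2, 0])

# W = [1 2 1]: one free entry, one in-use entry, one compressed entry.
full = decode_xref_entries(bytes([0, 0, 0, 255, 1, 0, 17, 0, 2, 0, 5, 3]), [1, 2, 1])
```

Because a zero width for the second element is rejected up front, field 2 is always read from the stream in this sketch, so a default for it would never be consulted.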
We greatly value the expertise and insights of the PDF Association and would deeply appreciate any clarifications or references to further documentation that might address these points.