InvalidOperationException calling GetPages() / GetPage() #973

Open
strivitech opened this issue Jan 16, 2025 · 2 comments

@strivitech

strivitech commented Jan 16, 2025

PdfPig throws an exception when iterating to the last page of my PDFs using foreach (var page in pdfDocument.GetPages()). The same issue occurs when directly attempting to retrieve the last page with var page = pdfDocument.GetPage(lastPageNumber).

I am not an expert in PDF internals, but this might be due to invalid operators within these PDFs. Despite this, the PDFs open without any issues in web browsers (Microsoft Edge or Chrome) and can be parsed successfully using other libraries like Aspose.PDF, IronPDF, or iTextSharp.LGPLv2.Core.

Code:

using var pdfDocument = PdfDocument.Open(pdfPath);
foreach (var page in pdfDocument.GetPages()) // here is the exception on last iteration
{
    textBuilder.AppendLine(page.Text);
}
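
Until the root cause is known, one possible workaround is to read pages one at a time and skip any page that throws, so a single bad content stream does not abort the whole extraction. The sketch below is only an illustration: `SafeTextExtraction` and `ExtractText` are hypothetical names, not part of PdfPig, and the page access is passed in as a delegate so the helper can be run without a PDF. It assumes 1-based page numbers, as used by `GetPage`.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SafeTextExtraction
{
    // Hypothetical helper (not a PdfPig API): reads pages 1..pageCount through
    // the supplied delegate and records pages that throw instead of aborting.
    public static string ExtractText(
        int pageCount,
        Func<int, string> getPageText,
        List<int> failedPages)
    {
        var textBuilder = new StringBuilder();
        for (var pageNumber = 1; pageNumber <= pageCount; pageNumber++)
        {
            try
            {
                textBuilder.AppendLine(getPageText(pageNumber));
            }
            catch (InvalidOperationException)
            {
                // For example: "Cannot execute a pop of the graphics state stack..."
                failedPages.Add(pageNumber);
            }
        }
        return textBuilder.ToString();
    }
}

// Possible usage with PdfPig (requires the PdfPig package and a real file):
//   using var pdfDocument = PdfDocument.Open(pdfPath);
//   var failedPages = new List<int>();
//   var text = SafeTextExtraction.ExtractText(
//       pdfDocument.NumberOfPages,
//       n => pdfDocument.GetPage(n).Text,
//       failedPages);
```

This loses the text of any page that fails, so it is a stopgap for batch jobs rather than a fix for the parsing error itself.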

Exception:

Unhandled exception. System.InvalidOperationException: Cannot execute a pop of the graphics state stack, it would leave the stack empty.
   at UglyToad.PdfPig.Graphics.Operations.SpecialGraphicsState.Pop.Run(IOperationContext operationContext)
   at UglyToad.PdfPig.Graphics.BaseStreamProcessor`1.ProcessOperations(IReadOnlyList`1 operations)
   at UglyToad.PdfPig.Graphics.ContentStreamProcessor.Process(Int32 pageNumberCurrent, IReadOnlyList`1 operations)
   at UglyToad.PdfPig.Parser.PageFactory.ProcessPage(Int32 pageNumber, DictionaryToken dictionary, NamedDestinations namedDestinations, MediaBox mediaBox, CropBox cropBox, UserSpaceUnit userSpaceUnit, PageRotationDegrees rotation, TransformationMatrix initialMatrix, IReadOnlyList`1 operations)
   at UglyToad.PdfPig.Content.BasePageFactory`1.ProcessPageInternal(Int32 pageNumber, DictionaryToken dictionary, NamedDestinations namedDestinations, MediaBox mediaBox, CropBox cropBox, UserSpaceUnit userSpaceUnit, PageRotationDegrees rotation, TransformationMatrix& initialMatrix, ReadOnlyMemory`1 contentBytes)
   at UglyToad.PdfPig.Content.BasePageFactory`1.Create(Int32 number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, NamedDestinations namedDestinations)
   at UglyToad.PdfPig.Content.Pages.GetPage[TPage](IPageFactory`1 pageFactory, Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
   at UglyToad.PdfPig.Content.Pages.GetPage(Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
   at UglyToad.PdfPig.PdfDocument.GetPage(Int32 pageNumber)
   at UglyToad.PdfPig.PdfDocument.GetPages()+MoveNext()
   at Program.<Main>$(String[] args) in E:\Projects\Temporary\ConsoleApp10\ConsoleApp10\Program.cs:line 15

Ghostscript output for the PDF:

Page 2

The following warnings were encountered at least once while processing this file:
        encountered more Q than q
        encountered more q than Q
        invalid operator used in text block

   **** This file had errors that were repaired or ignored.
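
For context on the first two warnings: in a PDF content stream the `q` operator saves the graphics state and `Q` restores it, so a `Q` without a preceding `q` asks the processor to pop a stack that has nothing left to restore, which matches the exception message above. The sketch below shows how such an imbalance can be detected. It is a simplification that works on a list of already tokenized operator names, `GraphicsStateBalance` is a hypothetical name, and it does not reflect how PdfPig or Ghostscript implement this.

```csharp
using System;
using System.Collections.Generic;

public static class GraphicsStateBalance
{
    // Walks operator names in order and reports how many Q (restore) operators
    // had no matching q (save), and how many q operators were never closed.
    public static (int UnmatchedRestores, int UnclosedSaves) Check(
        IEnumerable<string> operators)
    {
        var depth = 0;
        var unmatchedRestores = 0;
        foreach (var op in operators)
        {
            if (op == "q")
            {
                depth++;
            }
            else if (op == "Q")
            {
                if (depth == 0)
                {
                    // A strict processor could throw here; a lenient one ignores the Q.
                    unmatchedRestores++;
                }
                else
                {
                    depth--;
                }
            }
        }
        return (unmatchedRestores, depth);
    }
}
```

A single stream can trigger both warnings: for the sequence `q Q Q q`, the third operator is an unmatched restore and the final `q` is never closed.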
@BobLd
Collaborator

BobLd commented Jan 18, 2025

@strivitech thanks for creating the issue. Can you share a sample pdf document? Without the document, it is going to be tricky to help you.

@strivitech
Author

@strivitech thanks for creating the issue. Can you share a sample pdf document? Without the document, it is going to be tricky to help you.

@BobLd Thanks for your reply! Here's a sample PDF document for your reference. Let me know if you need anything else.

JD5008.pdf
