Issue Adding Text to Imported PDF Pages Using PdfPig's PdfDocumentBuilder - PdfDocumentFormatException #950

Open
darbid opened this issue Dec 7, 2024 · 0 comments

darbid commented Dec 7, 2024

I'm encountering an issue with PdfPig while trying to add text to pages imported from an existing PDF document. Here's the code I'm using:

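using UglyToad.PdfPig;
using UglyToad.PdfPig.Core;
using UglyToad.PdfPig.Fonts.Standard14Fonts;
using UglyToad.PdfPig.Writer;

// Caller: open the source PDF, augment it, then reopen the result.
using var rawDocument = PdfDocument.Open(path);
using var document = PdfDocument.Open(Augment(rawDocument));

// Copies every page into a new builder and writes a short text marker next to each image.
private byte[] Augment(PdfDocument document)
{
    PdfDocumentBuilder builder = new();
    var font = builder.AddStandard14Font(Standard14Font.Helvetica);

    for (int i = 1; i <= document.NumberOfPages; i++)
    {
        var page = document.GetPage(i);
        var images = page.GetImages();
        // Import page i from the source document into the builder.
        var pageBuilder = builder.AddPage(document, i);
        foreach (var image in images)
        {
            // Anchor the text at the left edge of the image, vertically centred.
            var bounds = image.Bounds;
            var point = new PdfPoint(bounds.BottomLeft.X, (bounds.TopLeft.Y + bounds.BottomLeft.Y) / 2);
            // var imageIndex = AddImage(image);
            pageBuilder.AddText("<<image-.png>>", 8, point, font);
        }
    }

    return builder.Build();
}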

Context:

  • I'm using builder.AddPage(document, i) to import pages from an existing PDF into a new PdfDocumentBuilder.
  • My goal is to add text annotations near images on the pages.
  • The AddImage method is not relevant to the issue (it's commented out).

Problem:

When I run this code, I get the following PdfDocumentFormatException when attempting to open the augmented PDF:

UglyToad.PdfPig.Core.PdfDocumentFormatException: 'Could not find the object number 50 0 with type StreamToken instead, it was found with type ObjectToken.'  

The exception is raised while a page is being created: BasePageFactory.Create calls DirectObjectFinder.Get<StreamToken>, and that call throws. The relevant PdfPig code:

public TPage Create(int number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, NamedDestinations namedDestinations)  
{  
    // ...  
    var contentStream = DirectObjectFinder.Get<StreamToken>(obj, PdfScanner);  
    // ...  
}  
   
public static T? Get<T>(IndirectReference reference, IPdfTokenScanner scanner) where T : class, IToken  
{  
    var temp = scanner.Get(reference);  
    if (temp is null || temp.Data is NullToken)  
    {  
        return null;  
    }  
  
    if (temp.Data is T locatedResult)  
    {  
        return locatedResult;  
    }  
  
    if (temp.Data is IndirectReferenceToken nestedReference)  
    {  
        return Get<T>(nestedReference, scanner);  
    }  
  
    if (temp.Data is ArrayToken array && array.Data.Count == 1)  
    {  
        var arrayElement = array.Data[0];  
  
        if (arrayElement is IndirectReferenceToken arrayReference)  
        {  
            return Get<T>(arrayReference, scanner);  
        }  
  
        if (arrayElement is T arrayToken)  
        {  
            return arrayToken;  
        }  
    }  
  
    throw new PdfDocumentFormatException($"Could not find the object number {reference} with type {typeof(T).Name} instead, it was found with type {temp.GetType().Name}.");  
}  
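
To make the failure path easier to reason about, here is a minimal, self-contained sketch of the branch structure of Get<T> as quoted above. All type names (ITok, StreamTok, RefTok, ArrayTok, ObjectTok, Finder) are hypothetical stand-ins and not PdfPig classes, and the null check is left out for brevity. It shows two things that can be read off the quoted method: a referenced array with more than one element falls through to the throw, and the message reports the type of the wrapper (temp) rather than the type of temp.Data. Whether my failing PDFs actually take this path is an assumption I have not confirmed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the token types; these are NOT PdfPig classes.
public interface ITok { }
public sealed class StreamTok : ITok { }
public sealed class RefTok : ITok
{
    public int Number { get; }
    public RefTok(int number) { Number = number; }
}
public sealed class ArrayTok : ITok
{
    public IReadOnlyList<ITok> Data { get; }
    public ArrayTok(params ITok[] items) { Data = items; }
}
// Wrapper around a resolved object; plays the role of "temp" in the quoted method.
public sealed class ObjectTok
{
    public ITok Data { get; }
    public ObjectTok(ITok data) { Data = data; }
}

public static class Finder
{
    public static T Get<T>(int number, IReadOnlyDictionary<int, ObjectTok> objects)
        where T : class, ITok
    {
        var temp = objects[number];
        if (temp.Data is T located) return located;
        if (temp.Data is RefTok nested) return Get<T>(nested.Number, objects);
        // Only single-element arrays are unwrapped, as in the quoted method.
        if (temp.Data is ArrayTok array && array.Data.Count == 1)
        {
            if (array.Data[0] is RefTok r) return Get<T>(r.Number, objects);
            if (array.Data[0] is T t) return t;
        }
        // Same shape as the quoted message: it names the wrapper type, not temp.Data's type.
        throw new InvalidOperationException(
            $"Could not find the object number {number} with type {typeof(T).Name} " +
            $"instead, it was found with type {temp.GetType().Name}.");
    }
}

public static class Program
{
    public static void Main()
    {
        var objects = new Dictionary<int, ObjectTok>
        {
            [1] = new ObjectTok(new StreamTok()),
            [2] = new ObjectTok(new StreamTok()),
            // Object 50 holds an array of two stream references.
            [50] = new ObjectTok(new ArrayTok(new RefTok(1), new RefTok(2))),
        };

        try
        {
            Finder.Get<StreamTok>(50, objects);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

In this model a single-element array resolves normally, so only the two-element case reaches the throw.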

Observations:

  • If I remove the pageBuilder.AddText line, the code executes without errors, and the augmented PDF opens correctly.
  • This suggests that adding text to the imported pages is causing the issue.
  • This happens only with a small number of PDFs; most files work fine. I can provide an example PDF that fails, but I would prefer to send it to a maintainer directly rather than post it here.
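
Since only some files fail, a small check like the following could help separate good inputs from bad ones and keep the failing output for inspection. This is a sketch under assumptions: AugmentCheck, TryReadAllPages and dumpPath are hypothetical names of mine, and augmentedBytes stands for the result of Augment(...) above. It relies only on PdfDocument.Open(byte[]), NumberOfPages and GetPage.

```csharp
using System;
using System.IO;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Core;

public static class AugmentCheck
{
    // Hypothetical helper: returns true when every page of the augmented PDF can be read.
    public static bool TryReadAllPages(byte[] augmentedBytes, string dumpPath)
    {
        try
        {
            using var document = PdfDocument.Open(augmentedBytes);
            for (int i = 1; i <= document.NumberOfPages; i++)
            {
                // Request each page so that page creation runs for all of them.
                _ = document.GetPage(i);
            }
            return true;
        }
        catch (PdfDocumentFormatException ex)
        {
            // Keep the failing output on disk so it can be inspected or shared privately.
            File.WriteAllBytes(dumpPath, augmentedBytes);
            Console.Error.WriteLine($"Augmented PDF failed to parse: {ex.Message}");
            return false;
        }
    }
}
```

Running this over a folder of inputs, once with and once without the AddText call, should show whether the failing set is stable.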

Question:

Is it possible to use PdfPig's PdfDocumentBuilder to add text to pages imported from an existing PDF without encountering the PdfDocumentFormatException error? If so, how can I modify my code to achieve this?
