Support p7m signed PDFs #849

Open
r-Larch opened this issue Jun 15, 2024 · 0 comments
Labels
document-reading Related to reading documents enhancement


r-Larch commented Jun 15, 2024

Hi,

My customer has a lot of signed PDFs. They are signed with PKCS#7 and have the extension `.pdf.p7m`.
PdfPig can load some of these files, but not all.

To support all of them I use the following code:

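// if file is p7m: unwrap the PKCS#7 (CMS) envelope with BouncyCastle first
await using var fs = File.OpenRead(file);
var signedFile = new Org.BouncyCastle.Cms.CmsSignedData(fs);
// copy the embedded PDF bytes into a seekable stream for PdfPig
using var ms = new MemoryStream();
signedFile.SignedContent.Write(ms);
ms.Position = 0;
using var doc = PdfDocument.Open(ms);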

Maybe this could be implemented in the library, since many other PDF libraries seem to load these files without problems.
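
For the `// if file is p7m` check in the workaround, a minimal sketch could look at the first bytes of the stream instead of trusting the file extension. `FileKind`, `P7mSniffer`, and `Classify` are hypothetical names and not part of PdfPig or BouncyCastle. The `0x30` test is only a heuristic: it assumes a BER/DER-encoded envelope, whose outer structure is an ASN.1 SEQUENCE, and it would not recognise Base64/PEM-encoded `.p7m` files.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical result type for the sniffing heuristic below.
public enum FileKind { Unknown, Pdf, CmsEnvelope }

public static class P7mSniffer
{
    private static readonly byte[] PdfMarker = Encoding.ASCII.GetBytes("%PDF-");

    // Peeks at the first bytes of a seekable stream and restores its position.
    public static FileKind Classify(Stream stream)
    {
        if (!stream.CanSeek)
        {
            throw new ArgumentException("Stream must be seekable.", nameof(stream));
        }

        var start = stream.Position;
        var buffer = new byte[PdfMarker.Length];
        var read = 0;
        while (read < buffer.Length)
        {
            var n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) break; // end of stream
            read += n;
        }
        stream.Position = start;

        // A plain PDF starts with "%PDF-" at offset 0.
        if (read == PdfMarker.Length)
        {
            var isPdf = true;
            for (var i = 0; i < PdfMarker.Length; i++)
            {
                if (buffer[i] != PdfMarker[i]) { isPdf = false; break; }
            }
            if (isPdf) return FileKind.Pdf;
        }

        // Assumption: a binary CMS/PKCS#7 envelope starts with the ASN.1 SEQUENCE tag 0x30.
        if (read >= 1 && buffer[0] == 0x30) return FileKind.CmsEnvelope;

        return FileKind.Unknown;
    }
}
```

With a helper like this, the unwrapping step would only run when `Classify` returns `FileKind.CmsEnvelope`, and the stream could be passed straight to `PdfDocument.Open` when it returns `FileKind.Pdf`.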

**Some Errors `PdfPig` Shows**
PdfP7M:  @"\pdf_p7m\Allegato 5 - Criteri valutazione tecnica.pdf.p7m"
 - System.NullReferenceException: Object reference not set to an instance of an object.
   at UglyToad.PdfPig.PdfExtensions.TryGet[T](DictionaryToken dictionary, NameToken name, IPdfTokenScanner tokenScanner, T& token)
   at UglyToad.PdfPig.Content.PagesFactory.CheckIfIsPage(DictionaryToken nodeDictionary, IndirectReference parentReference, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing)
   at UglyToad.PdfPig.Content.PagesFactory.ProcessPagesNode(IndirectReference referenceInput, DictionaryToken nodeDictionaryInput, IndirectReference parentReferenceInput, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing, PageCounter pageNumber)
   at UglyToad.PdfPig.Content.PagesFactory.Create(IndirectReference pagesReference, DictionaryToken pagesDictionary, IPdfTokenScanner scanner, IPageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.CatalogFactory.Create(IndirectReference rootReference, DictionaryToken dictionary, IPdfTokenScanner scanner, PageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\AMSA. Appalto integrato - Disciplinare di gara 29.06.2022.pdf.p7m"
 - System.NullReferenceException: Object reference not set to an instance of an object.
   at UglyToad.PdfPig.PdfExtensions.TryGet[T](DictionaryToken dictionary, NameToken name, IPdfTokenScanner tokenScanner, T& token)
   at UglyToad.PdfPig.Content.PagesFactory.CheckIfIsPage(DictionaryToken nodeDictionary, IndirectReference parentReference, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing)
   at UglyToad.PdfPig.Content.PagesFactory.ProcessPagesNode(IndirectReference referenceInput, DictionaryToken nodeDictionaryInput, IndirectReference parentReferenceInput, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing, PageCounter pageNumber)
   at UglyToad.PdfPig.Content.PagesFactory.Create(IndirectReference pagesReference, DictionaryToken pagesDictionary, IPdfTokenScanner scanner, IPageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.CatalogFactory.Create(IndirectReference rootReference, DictionaryToken dictionary, IPdfTokenScanner scanner, PageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\AMSA. Appalto integrato - lettera-di-invito.pdf.p7m"
 - System.NullReferenceException: Object reference not set to an instance of an object.
   at UglyToad.PdfPig.PdfExtensions.TryGet[T](DictionaryToken dictionary, NameToken name, IPdfTokenScanner tokenScanner, T& token)
   at UglyToad.PdfPig.Content.PagesFactory.CheckIfIsPage(DictionaryToken nodeDictionary, IndirectReference parentReference, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing)
   at UglyToad.PdfPig.Content.PagesFactory.ProcessPagesNode(IndirectReference referenceInput, DictionaryToken nodeDictionaryInput, IndirectReference parentReferenceInput, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing, PageCounter pageNumber)
   at UglyToad.PdfPig.Content.PagesFactory.Create(IndirectReference pagesReference, DictionaryToken pagesDictionary, IPdfTokenScanner scanner, IPageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.CatalogFactory.Create(IndirectReference rootReference, DictionaryToken dictionary, IPdfTokenScanner scanner, PageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\AMSA. Appalto integrato. Capitolato speciale lavori.pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: The type of the catalog dictionary was not Catalog: <Type, /Cat\x04\x03�alog>, <Pages, 382 0>, <PageMode, /UseNone>, <OutputIntents, [ 504 0 ]>, <Outlines, 207 0>, <Lang, (en-US)>, <Metadata, 105 0>.
   at UglyToad.PdfPig.Parser.CatalogFactory.Create(IndirectReference rootReference, DictionaryToken dictionary, IPdfTokenScanner scanner, PageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\AMSA. Appalto integrato. Capitolato speciale prog. esecutivo.pdf.p7m"
 - System.NullReferenceException: Object reference not set to an instance of an object.
   at UglyToad.PdfPig.PdfExtensions.TryGet[T](DictionaryToken dictionary, NameToken name, IPdfTokenScanner tokenScanner, T& token)
   at UglyToad.PdfPig.Content.PagesFactory.CheckIfIsPage(DictionaryToken nodeDictionary, IndirectReference parentReference, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing)
   at UglyToad.PdfPig.Content.PagesFactory.ProcessPagesNode(IndirectReference referenceInput, DictionaryToken nodeDictionaryInput, IndirectReference parentReferenceInput, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing, PageCounter pageNumber)
   at UglyToad.PdfPig.Content.PagesFactory.Create(IndirectReference pagesReference, DictionaryToken pagesDictionary, IPdfTokenScanner scanner, IPageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.CatalogFactory.Create(IndirectReference rootReference, DictionaryToken dictionary, IPdfTokenScanner scanner, PageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\PD.D.A.GEN.1040 - RELAZIONE TECNICA IMPIANTI.pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: The dictionary did not contain a number with the key /Size. Dictionary way: .
   at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetInt(DictionaryToken token, NameToken name)
   at UglyToad.PdfPig.CrossReference.TrailerDictionary..ctor(DictionaryToken dictionary)
   at UglyToad.PdfPig.CrossReference.CrossReferenceTableBuilder.Build(Int64 firstCrossReferenceOffset, Int64 offsetCorrection, ILog log)
   at UglyToad.PdfPig.Parser.FileStructure.CrossReferenceParser.Parse(IInputBytes bytes, Boolean isLenientParsing, Int64 crossReferenceLocation, Int64 offsetCorrection, IPdfTokenScanner pdfScanner, ISeekableTokenScanner tokenScanner)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\PD.D.A.GEN.1041 - RELAZIONE CONSUMI ENERGETICI - EX LEGGE 10.91 .pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: The dictionary did not contain a number with the key /Size. Dictionary way: .
   at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetInt(DictionaryToken token, NameToken name)
   at UglyToad.PdfPig.CrossReference.TrailerDictionary..ctor(DictionaryToken dictionary)
   at UglyToad.PdfPig.CrossReference.CrossReferenceTableBuilder.Build(Int64 firstCrossReferenceOffset, Int64 offsetCorrection, ILog log)
   at UglyToad.PdfPig.Parser.FileStructure.CrossReferenceParser.Parse(IInputBytes bytes, Boolean isLenientParsing, Int64 crossReferenceLocation, Int64 offsetCorrection, IPdfTokenScanner pdfScanner, ISeekableTokenScanner tokenScanner)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\PD.D.A.GEN.1090 - RELAZIONE AUTORIZZAZIONE ALLO SCARICO.pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: The dictionary did not contain a number with the key /Size. Dictionary way: .
   at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetInt(DictionaryToken token, NameToken name)
   at UglyToad.PdfPig.CrossReference.TrailerDictionary..ctor(DictionaryToken dictionary)
   at UglyToad.PdfPig.CrossReference.CrossReferenceTableBuilder.Build(Int64 firstCrossReferenceOffset, Int64 offsetCorrection, ILog log)
   at UglyToad.PdfPig.Parser.FileStructure.CrossReferenceParser.Parse(IInputBytes bytes, Boolean isLenientParsing, Int64 crossReferenceLocation, Int64 offsetCorrection, IPdfTokenScanner pdfScanner, ISeekableTokenScanner tokenScanner)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\PD.GEO.01 RELAZIONE GEOLOGICA GEOTECNICA E DI COMPATIBILITA_.pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: The dictionary did not contain a number with the key /Size. Dictionary way: .
   at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetInt(DictionaryToken token, NameToken name)
   at UglyToad.PdfPig.CrossReference.TrailerDictionary..ctor(DictionaryToken dictionary)
   at UglyToad.PdfPig.CrossReference.CrossReferenceTableBuilder.Build(Int64 firstCrossReferenceOffset, Int64 offsetCorrection, ILog log)
   at UglyToad.PdfPig.Parser.FileStructure.CrossReferenceParser.Parse(IInputBytes bytes, Boolean isLenientParsing, Int64 crossReferenceLocation, Int64 offsetCorrection, IPdfTokenScanner pdfScanner, ISeekableTokenScanner tokenScanner)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\PD.GEO.02 ASSEVERAZIONE TIZIANA BAMPI.pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: The dictionary did not contain a number with the key /Size. Dictionary way: .
   at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetInt(DictionaryToken token, NameToken name)
   at UglyToad.PdfPig.CrossReference.TrailerDictionary..ctor(DictionaryToken dictionary)
   at UglyToad.PdfPig.CrossReference.CrossReferenceTableBuilder.Build(Int64 firstCrossReferenceOffset, Int64 offsetCorrection, ILog log)
   at UglyToad.PdfPig.Parser.FileStructure.CrossReferenceParser.Parse(IInputBytes bytes, Boolean isLenientParsing, Int64 crossReferenceLocation, Int64 offsetCorrection, IPdfTokenScanner pdfScanner, ISeekableTokenScanner tokenScanner)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209
PdfP7M:  @"\pdf_p7m\Relazione def.pdf.p7m"
 - UglyToad.PdfPig.Core.PdfDocumentFormatException: Could not find dictionary associated with reference in pages kids array: 35 0.
   at UglyToad.PdfPig.Content.PagesFactory.ProcessPagesNode(IndirectReference referenceInput, DictionaryToken nodeDictionaryInput, IndirectReference parentReferenceInput, Boolean isRoot, IPdfTokenScanner pdfTokenScanner, Boolean isLenientParsing, PageCounter pageNumber)
   at UglyToad.PdfPig.Content.PagesFactory.Create(IndirectReference pagesReference, DictionaryToken pagesDictionary, IPdfTokenScanner scanner, IPageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.CatalogFactory.Create(IndirectReference rootReference, DictionaryToken dictionary, IPdfTokenScanner scanner, PageFactory pageFactory, ILog log, Boolean isLenientParsing)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(Stream stream, ParsingOptions options)
   at UglyToad.PdfPig.PdfDocument.Open(Stream stream, ParsingOptions options)
   at JurisMatic.Files.Tests.FileDetectorTests.Find_bad_pdf() in C:\Projects\LarchSys\JurisMatic\Libraries\JurisMatic.Files.Tests\FileDetectorTests.cs:line 209

I could send you some of these files on a private channel.

Kind regards
René

@EliotJones EliotJones added enhancement document-reading Related to reading documents labels Sep 29, 2024