Note: this is the same issue as #215 and is solved by PR #216. It is simply very slow; if you keep it running long enough, it will finish. Once PR #216 has been merged, it will finish much sooner.
FYI this occurs when the array path_rects in pymupdf4llm/pymupdf4llm/helpers/multi_column.py becomes very large. For your PDF, the length of this array per page:
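A rough way to obtain such per-page counts yourself is to count the vector drawings on each page. This is only a sketch: it assumes that the number of items returned by PyMuPDF's `Page.get_drawings()` is a reasonable proxy for the size of `path_rects`, and the helper names (`count_drawings_per_page`, `heaviest_pages`, `report`) are made up for illustration.

```python
from typing import Iterable, List, Tuple


def count_drawings_per_page(pages: Iterable) -> List[Tuple[int, int]]:
    """Return (1-based page number, number of vector drawings) per page.

    `pages` is any iterable of objects with a `get_drawings()` method,
    for example an opened PyMuPDF document.
    """
    counts = []
    for number, page in enumerate(pages, start=1):
        counts.append((number, len(page.get_drawings())))
    return counts


def heaviest_pages(counts: List[Tuple[int, int]], top: int = 3) -> List[Tuple[int, int]]:
    """Return the `top` pages with the most drawings, largest first."""
    return sorted(counts, key=lambda item: item[1], reverse=True)[:top]


def report(pdf_path: str) -> None:
    """Print the pages with the most vector drawings (hypothetical helper)."""
    import pymupdf  # imported lazily so the helpers above work without it

    doc = pymupdf.open(pdf_path)
    try:
        for number, n_drawings in heaviest_pages(count_drawings_per_page(doc)):
            print(f"page {number}: {n_drawings} drawings")
    finally:
        doc.close()
```

Calling `report(...)` with the path to the PDF should list the pages most likely to be slow, if the assumption above holds.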
I use pymupdf4llm actively for my RAG-based product, and so far it has not caused any issues with PDFs. For one particular PDF, however, to_markdown gets stuck on one page and the process runs indefinitely. I share the PDF and my code below.
Multiple-Input Variational Auto-Encoder.pdf
My code:
import pymupdf4llm
markdown_pages = pymupdf4llm.to_markdown(r"docs\domain1\Multiple-Input Variational Auto-Encoder.pdf", margins=0)
The process gets stuck at page 12, and my pymupdf4llm version is 0.0.17.
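To confirm which page is slow, one option is to convert the pages one at a time and time each call. The following is a minimal sketch, not part of the original report: it relies on the `pages` argument of `to_markdown` (a list of 0-based page numbers), and `time_each_page` and `time_pdf` are hypothetical helper names. Note that the slow page will still take as long as before; the loop only makes it visible.

```python
import time
from typing import Callable, Iterable, List, Tuple


def time_each_page(
    convert: Callable[[int], str], page_numbers: Iterable[int]
) -> List[Tuple[int, float]]:
    """Call `convert(pno)` for each page number and record the seconds taken."""
    timings = []
    for pno in page_numbers:
        start = time.perf_counter()
        convert(pno)
        timings.append((pno, time.perf_counter() - start))
    return timings


def time_pdf(pdf_path: str, page_count: int) -> List[Tuple[int, float]]:
    """Time the Markdown conversion of every page of a PDF, one page per call."""
    import pymupdf4llm  # imported lazily so time_each_page works without it

    def convert(pno: int) -> str:
        # pages takes 0-based page numbers; margins=0 mirrors the call above
        return pymupdf4llm.to_markdown(pdf_path, pages=[pno], margins=0)

    return time_each_page(convert, range(page_count))
```

Printing the returned `(page number, seconds)` pairs as they are produced, or sorting them by duration afterwards, shows where the time goes.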