page extract with pdf4llm.to_markdown not extracting the first line of the page in specific scenarios #196

Answered by JorjMcKie
leelaraj72 asked this question in Q&A
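Try setting the margins parameter. The default is margins=(0, 50, 0, 50), which ignores a strip 50 points high at the top and at the bottom of each page, so a first line that falls inside the top strip is not extracted.
Using margins=0 makes the extraction look at the full page.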
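
A minimal sketch of the suggestion above, assuming the `pymupdf4llm` package is installed (`pdf4llm` is an alias for it). The helper names `effective_area` and `extract_full_page` are hypothetical and only for illustration. The margin order (left, top, right, bottom) is my reading of the answer's "top and bottom" description and may differ between library versions, so check the documentation of the version you have installed.

```python
def effective_area(page_width, page_height, margins):
    """Illustrative only: the rectangle (x0, y0, x1, y1) left after margins.

    `margins` is a single number applied to all four sides, or a 4-tuple
    assumed to be (left, top, right, bottom) in points. This is not the
    library's own code, just a picture of what the parameter excludes.
    """
    if isinstance(margins, (int, float)):
        left = top = right = bottom = margins
    else:
        left, top, right, bottom = margins
    return (left, top, page_width - right, page_height - bottom)


def extract_full_page(pdf_path, pages=None):
    """Convert a PDF to Markdown without skipping top/bottom strips."""
    import pymupdf4llm  # imported lazily so the helper above works without it

    # margins=0 asks the converter to consider the whole page area.
    return pymupdf4llm.to_markdown(pdf_path, pages=pages, margins=0)
```

For a US Letter page (612 x 792 points), `effective_area(612, 792, (0, 50, 0, 50))` leaves the vertical range 50 to 742, so text above y = 50 is ignored, while `effective_area(612, 792, 0)` covers the whole page. Calling `extract_full_page("input.pdf")` (a placeholder file name) would then include the first line that was previously missing.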

Answer selected by leelaraj72
Category
Q&A
3 participants