LlamaParse summarizes the information of a paragraph in the PDF file #559

Open
NegTech opened this issue Dec 19, 2024 · 3 comments
Labels
bug Something isn't working

Comments

NegTech commented Dec 19, 2024

Describe the bug
I'm not sure whether this is a bug, but I'm running into a problem parsing a PDF. I have some paragraphs that are 4 or 5 lines each, but when LlamaParse parses them, the text of each paragraph is reduced to 2 lines. The parsed text doesn't include the paragraphs' full information and concepts.
How can I change this? Is it possible to change?

Job ID
bbd6df67-b18d-45ec-a9b1-e71ac157f5f7

Client:

  • API
  • Notebook

Additional context
I'm using accurate mode and want to get Markdown output.

NegTech added the bug ("Something isn't working") label Dec 19, 2024
@BinaryBrain (Member)

Hi @NegTech,
While it seems we have an issue with one of the fonts, have you also tried running the job without any parsing instruction? That sometimes leads to better results.

BinaryBrain self-assigned this Dec 19, 2024
NegTech (Author) commented Dec 19, 2024

Hi @BinaryBrain,
I have. Unfortunately, it didn't give me any better results -_-
What do you mean by an issue with one of the fonts?

@BinaryBrain (Member)

One of the fonts in the file is weirdly encoded (this can happen with PDFs). We need to fix it.
