Feat: Support for PDF #8

Draft: wants to merge 1 commit into main

tushar-prox commented Nov 14, 2024

Convert PDF pages to JPEG instead of PNG

I was unable to test this locally, so the PR is in draft state.

Changes

  • Modified PDF conversion options to use JPEG format
  • Updated MIME types in data URLs to consistently use image/jpeg
  • Simplified image format handling in API calls
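
As a rough illustration of the first two changes, the sketch below shows conversion options that request JPEG output and a helper that builds the matching data URL. The option names mirror those documented for pdf2pic, but the `PdfConversionOptions` interface, the values, and the `toJpegDataUrl` helper are assumptions made for illustration, not the code in this PR.

```typescript
// Hypothetical shape for the conversion options. The field names follow
// pdf2pic's documented options; the values are illustrative assumptions.
interface PdfConversionOptions {
  format: "jpeg" | "png";
  density: number; // rendering resolution in DPI
  width: number; // output width in pixels
  height: number; // output height in pixels
}

const jpegOptions: PdfConversionOptions = {
  format: "jpeg",
  density: 150,
  width: 1240,
  height: 1754,
};

// Hypothetical helper: wrap a base64-encoded page in a data URL whose
// MIME type matches the format requested above.
function toJpegDataUrl(base64: string): string {
  return `data:image/jpeg;base64,${base64}`;
}
```

Keeping the format in one options object and the MIME type in one helper is one way to avoid the two drifting apart.
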
flowchart TD
    A[Start OCR] --> B{Check File Type}
    B -->|PDF| C{Is Remote?}
    B -->|Image| D{Is Remote?}
    
    C -->|Yes| E[Fetch PDF Buffer]
    C -->|No| F[Read Local PDF]
    
    E --> G[Convert PDF to JPEG]
    F --> G
    
    D -->|Yes| H[Use Direct URL]
    D -->|No| I[Encode to Base64]
    
    G --> J[Create Base64 Image URL]
    H --> K[Process with Together AI]
    I --> K
    J --> K
    
    K --> L[Return Markdown]

    style A fill:#f9f,stroke:#333
    style L fill:#9f9,stroke:#333
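
The branching in the flowchart can be sketched as a small planning function. This is a minimal sketch under stated assumptions: remote files are detected by an `http(s)://` prefix and PDFs by a `.pdf` extension, and the names `isRemote`, `fileKind`, and `planOcr` are hypothetical, not taken from the PR. The step labels are copied from the flowchart nodes.

```typescript
type FileKind = "pdf" | "image";

interface OcrPlan {
  kind: FileKind;
  remote: boolean;
  steps: string[];
}

// Assumption: a file is remote when its path is an http(s) URL.
function isRemote(filePath: string): boolean {
  return /^https?:\/\//i.test(filePath);
}

// Assumption: PDFs are recognised by extension, ignoring any
// query string or fragment on a URL.
function fileKind(filePath: string): FileKind {
  const pathOnly = filePath.replace(/[?#].*$/, "");
  return pathOnly.toLowerCase().endsWith(".pdf") ? "pdf" : "image";
}

// Lists the flowchart nodes a given input would pass through.
function planOcr(filePath: string): OcrPlan {
  const kind = fileKind(filePath);
  const remote = isRemote(filePath);
  const steps: string[] = [];
  if (kind === "pdf") {
    steps.push(remote ? "Fetch PDF Buffer" : "Read Local PDF");
    steps.push("Convert PDF to JPEG", "Create Base64 Image URL");
  } else {
    steps.push(remote ? "Use Direct URL" : "Encode to Base64");
  }
  steps.push("Process with Together AI", "Return Markdown");
  return { kind, remote, steps };
}
```

For example, `planOcr("./scan.png")` takes the local image branch, while `planOcr("https://example.com/doc.pdf")` takes the remote PDF branch.
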

Why

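  • JPEG's lossy compression usually produces smaller files than PNG for scanned or photographic document pages while keeping text readable
  • Smaller files mean smaller base64 payloads, which should speed up requests and reduce the memory needed to hold the encoded image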
  • More consistent handling of image formats throughout the pipeline
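
To make the size argument concrete, here is a small sketch of how image size translates into request payload size. Base64 emits 4 characters for every 3 input bytes, so any saving in the image file carries through to the data URL. The function names are hypothetical, and actual JPEG and PNG sizes depend on the page content.

```typescript
// Base64 emits 4 output characters for every 3 input bytes,
// padding the final group if it is incomplete.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Length in characters of a full JPEG data URL for an image of the given size.
function jpegDataUrlLength(byteLength: number): number {
  return "data:image/jpeg;base64,".length + base64Length(byteLength);
}
```

Under this formula, a 300,000-byte page image becomes 400,000 base64 characters before the data URL prefix is added.
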

Testing

  • Test PDF conversion with local files
  • Test PDF conversion with remote files
  • Verify image quality is sufficient for OCR processing
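
One check that could accompany the items above is confirming that the MIME type declared in a data URL matches the bytes actually produced by the conversion. The sketch below inspects the base64 prefix that corresponds to the JPEG and PNG file signatures. It is an illustrative helper with hypothetical names, not part of this PR, and it does not decode or validate the whole image.

```typescript
type SniffedFormat = "jpeg" | "png" | "unknown";

// JPEG files begin with bytes FF D8 FF, which base64-encode to "/9j/".
// PNG files begin with 89 50 4E 47 0D 0A 1A 0A, which encode to "iVBORw0KGg...".
function sniffImageFormat(base64: string): SniffedFormat {
  if (base64.startsWith("/9j/")) return "jpeg";
  if (base64.startsWith("iVBORw0KGg")) return "png";
  return "unknown";
}

// Returns true only when the data URL declares image/jpeg or image/png
// and the payload's signature agrees with that declaration.
function dataUrlMatchesPayload(dataUrl: string): boolean {
  const prefix = "data:image/";
  const marker = ";base64,";
  const at = dataUrl.indexOf(marker);
  if (!dataUrl.startsWith(prefix) || at === -1) return false;
  const declared = dataUrl.slice(prefix.length, at);
  const sniffed = sniffImageFormat(dataUrl.slice(at + marker.length));
  return sniffed !== "unknown" && sniffed === declared;
}
```

A check like this would flag the case where a page is still rendered as PNG but labelled `image/jpeg`.
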

Added support for PDF using pdf2pic
Added .env to .gitignore
Added error handling
Nutlope (Owner) commented Nov 16, 2024

Love this @tushar-prox! Lemme know when this is ready to review, was planning to work on this feature anyway
