Streamlit OCR Image Text Extractor

A web application for uploading an image that contains text and extracting that text with Optical Character Recognition (OCR). The application is built with Streamlit and uses Tesseract OCR for text extraction.

Features

Upload images in PNG, JPEG, or JPG formats.
Extract text from images using Tesseract OCR.
Simple and interactive web interface.
Dockerized for easy deployment.
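
The first two features can be sketched without the web layer. This is a minimal sketch, not the code in app.py: it assumes the `tesseract` binary is on your PATH and calls it through its command line (`tesseract <image> stdout`), whereas the app itself may use a Python wrapper. The helper names `is_supported_image`, `ocr_command`, and `extract_text` are hypothetical.

```python
import subprocess
from pathlib import Path

# Formats listed under Features.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension is one the app accepts."""
    return Path(filename).suffix.lower() in SUPPORTED_SUFFIXES


def ocr_command(image_path: str, lang: str = "eng") -> list[str]:
    """Build the Tesseract CLI call; 'stdout' makes it print the text."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def extract_text(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on one image and return the recognized text."""
    if not is_supported_image(image_path):
        raise ValueError(f"Unsupported image format: {image_path}")
    result = subprocess.run(
        ocr_command(image_path, lang),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Calling `extract_text("scan.png")` on a real image returns the recognized text as a string; the file name here is only a placeholder.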

Prerequisites

Python 3.9+
Tesseract OCR

Installing Tesseract OCR

Windows

Download the installer from the UB Mannheim Tesseract page.
Install the application and add the installation directory to your PATH environment variable.
Verify installation:
```
tesseract --version
```

macOS

Install via Homebrew:
```
brew install tesseract
```
Verify installation:
```
tesseract --version
```

Linux

Install via the package manager:

```
sudo apt update
sudo apt install tesseract-ocr
```

Verify installation:
```
tesseract --version
```
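
The `tesseract --version` check can also be run from Python, which confirms that the binary is visible to the same environment the app runs in. A minimal sketch using only the standard library; the function name `tesseract_version` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version() -> Optional[str]:
    """Return the first line of `tesseract --version`, or None if not found."""
    binary = shutil.which("tesseract")
    if binary is None:
        return None  # not installed, or its directory is not on PATH
    result = subprocess.run([binary, "--version"], capture_output=True, text=True)
    # Some builds print the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


print(tesseract_version() or "tesseract not found on PATH")
```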

Local Development

Clone the Repository

```
git clone https://github.com/jasminaaa20/streamlit-ocr.git
cd streamlit-ocr
```

Create a Virtual Environment

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
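
To confirm the virtual environment is active before installing dependencies, one option is to compare `sys.prefix` with `sys.base_prefix`. A small sketch; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```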

Install Python Dependencies

```
pip install -r requirements.txt
```

Run the Application

```
streamlit run app.py
```

Visit http://localhost:8501 in your browser to use the app.

Docker Deployment

Build the Docker Image

```
docker build -t streamlit-ocr .
```

Run the Docker Container

```
docker run -p 8501:8501 streamlit-ocr
```

Access the app at http://localhost:8501.
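
Once the container is running, a short script can confirm that the server responds. This sketch assumes a recent Streamlit version that exposes the `/_stcore/health` endpoint (older releases may use a different path); `app_is_up` is a hypothetical helper name.

```python
import urllib.request


def app_is_up(base_url: str = "http://localhost:8501", timeout: float = 3.0) -> bool:
    """Return True if the Streamlit server answers its health endpoint."""
    # Assumption: the server exposes /_stcore/health.
    url = base_url.rstrip("/") + "/_stcore/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

Calling `app_is_up()` with no arguments checks the default local address used in this section.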

Deploy to Google Cloud Run

Prerequisites to Deploy

Install the Google Cloud SDK.
Authenticate with your Google Cloud account:
```
gcloud auth login
```

Enable the Cloud Run API:

```
gcloud services enable run.googleapis.com
```

Steps to Deploy

Build and Push the Image to Google Container Registry

```
gcloud builds submit --tag gcr.io/<PROJECT-ID>/streamlit-ocr
```

Deploy to Cloud Run

```
gcloud run deploy streamlit-ocr \
    --image gcr.io/<PROJECT-ID>/streamlit-ocr \
    --platform managed \
    --region <REGION> \
    --allow-unauthenticated
```

Replace <PROJECT-ID> with your Google Cloud project ID and <REGION> with your desired region.
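
If you script the deployment, the placeholder substitution can be done in one place. The sketch below only assembles the same `gcloud run deploy` arguments shown above; the function name `deploy_command` and the example values `my-project` and `us-central1` are hypothetical.

```python
import shlex


def deploy_command(project_id: str, region: str, service: str = "streamlit-ocr") -> list[str]:
    """Fill the <PROJECT-ID> and <REGION> placeholders in the deploy command."""
    if not project_id or not region:
        raise ValueError("project_id and region must both be set")
    return [
        "gcloud", "run", "deploy", service,
        "--image", f"gcr.io/{project_id}/{service}",
        "--platform", "managed",
        "--region", region,
        "--allow-unauthenticated",
    ]


# Hypothetical values, for illustration only.
print(shlex.join(deploy_command("my-project", "us-central1")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.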

Access the Application

After deployment, you’ll receive a URL to access your app.

Application Structure

```
.
├── Dockerfile           # Docker configuration file
├── requirements.txt     # Python dependencies
├── app.py               # Streamlit application
└── README.md            # Documentation
```
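
Before building the image, it can help to check that a checkout contains the files listed above. A minimal sketch; `EXPECTED_FILES` mirrors the tree in this section and `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

# Top-level files from the Application Structure tree.
EXPECTED_FILES = ("Dockerfile", "requirements.txt", "app.py", "README.md")


def missing_files(root: str = ".") -> list[str]:
    """List the expected top-level files that are absent under root."""
    base = Path(root)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]
```

An empty list from `missing_files()` means the layout matches the tree.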

Technologies Used

Streamlit: For building the web interface.
Tesseract OCR: For text extraction from images.
Docker: For containerization.
Google Cloud Run: For deployment.

License

Author

Akmal Ali Jasmin

LinkedIn post

Feel free to contribute or raise issues in the repository. Enjoy extracting text from images!
