Blind-Browser

This project aims to help blind people browse web pages using their hearing by segmenting web pages, converting texts to audio, and describing images. It also helps ordinary people quickly browse dense content with the least possible effort by summarizing texts in segmented web pages.

Description

Blind or visually impaired users can rely on audio to get the content of pages, and regular users can rely on the summary to display the main titles in each piece to save time and effort due to:

Using several algorithms to segment web pages with their different designs (traditional methods, methods based on deep learning techniques and image processing).
Extracting texts from each piece separately and summarizing them.
Describing image pieces using deep learning models.
Converting texts, summarizing them and descriptions into audio for blind users.

Archeticture

A set of proposed algorithms for segmenting web pages(Clustering, SAM(Segment Anything Model), Mathematical Morphology, Edge Detection) , evaluating those algorithms using DSRandom, then choosing the best model from the LLM (Large-Language Models) using ROUGE (Recall-Oriented Understudy for Gisting Evaluation) to summarize text that extracted from segments,then using a VLM (Vision-Language Models ) to describe the image segments, then wrapping the previous algorithms in a web application based on the principle of structure MTV (Model-Template-View).

Tools

Canny
Moments
DBSCAN
KMeans
SAM
Mathematical Morphology
Spacy
sshleifer/distilbart-CNN-12-6
salesforce/blip-image-captioning-base
gTTS (Google Text-to-Speech)
Python
Selenium
Beautiful Soup
TfidfVectorizer
Pandas
Numpy (Numeric Python)
Requests
Transformers
Gaussian Blur
Silhouette Analysis
Hugging Face
Django
JavaScript
jQuery
HTML
CSS

Requirements

download "sam_vit_h_4b8939.pth" from: https://huggingface.co/spaces/abhishek/StableSAM/blob/main/sam_vit_h_4b8939.pth and put it in code folder
download "segment-anything" folder from: https://github.com/facebookresearch/segment-anything and put it in code/myapp/scripts/deeplearning
download "sam2" and "sam2_configs" folders from: https://github.com/facebookresearch/segment-anything-2 and put it in code/myapp/scripts/SAM2/segmentation
download "sam2_hiera_large.pt" from: https://huggingface.co/spaces/SkalskiP/florence-sam/blob/main/checkpoints/sam2_hiera_large.pt and put it in code/myapp/scripts/SAM2/segmentation
download "chrome-win" from: https://download-chromium.appspot.com/ and put it in code
API_KEY in each ScreenshotFetcher class https://api.apiflash.com/

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
Demo		Demo
blindbrowser		blindbrowser
myapp		myapp
LubnaRepo.pdf		LubnaRepo.pdf
README.md		README.md
manage.py		manage.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Blind-Browser

Description

Archeticture

Tools

Requirements

About

Releases

Packages

Languages

Lubnaalhalabi/Blind-Browser

Folders and files

Latest commit

History

Repository files navigation

Blind-Browser

Description

Archeticture

Tools

Requirements

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages