title: Text Extractor
emoji: π
colorFrom: gray
colorTo: indigo
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
license: mit
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference.
OCR Text Extractor + Summarizer
An AI-powered tool that extracts text from images with Tesseract OCR and then summarizes it with a transformer model. Upload any image (screenshot, photo, scanned document, or notes) and get the cleaned extracted text plus an AI-generated summary.
Features
Upload an image containing text
Extracts text using Tesseract OCR
Summarizes the extracted text using Hugging Face Transformers
Fast, simple Gradio UI
Works on CPU; no GPU required
How it Works
Image is processed with Tesseract OCR
Extracted text is cleaned
Text is fed into a pretrained summarization model
The extracted text and its summary are displayed in the UI
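The steps above can be sketched in Python as follows. This is a minimal illustration, not the Space's actual app.py: the function names `clean_text` and `extract_and_summarize`, the cleaning rules, the length limits, and the model `sshleifer/distilbart-cnn-12-6` are all assumptions made for the example. It also assumes a transformers release that provides the `summarization` pipeline task.

```python
import re
from typing import Tuple


def clean_text(raw: str) -> str:
    """Normalize OCR output (assumed cleaning rules, for illustration only)."""
    # Rejoin words split by a hyphen at a line break: "exam-\nple" -> "example".
    # Heuristic: this also merges genuine compounds such as "well-\nknown".
    text = re.sub(r"-\n(?=\w)", "", raw)
    # Collapse newlines, tabs, and runs of spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def extract_and_summarize(
    image_path: str,
    model_name: str = "sshleifer/distilbart-cnn-12-6",  # assumed model choice
) -> Tuple[str, str]:
    """Run OCR on an image file, clean the text, and summarize it."""
    # Imported lazily so clean_text stays usable without the heavy dependencies.
    from PIL import Image
    import pytesseract
    from transformers import pipeline

    # Step 1: OCR with Tesseract.
    raw = pytesseract.image_to_string(Image.open(image_path))
    # Step 2: clean the extracted text.
    text = clean_text(raw)
    if not text:
        return "", ""
    # Step 3: summarize; truncation guards against inputs longer than the model accepts.
    summarizer = pipeline("summarization", model=model_name)
    result = summarizer(
        text, truncation=True, max_length=130, min_length=30, do_sample=False
    )
    # Step 4: return both values so the UI can display them.
    return text, result[0]["summary_text"]
```

Loading the model inside the function keeps the sketch short; a real app would more likely create the summarizer once at startup, since loading it on every request is slow on CPU.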
Project Structure
├── app.py
├── requirements.txt
├── packages.txt
└── README.md
Dependencies
Python packages (requirements.txt)
gradio
pillow
pytesseract
transformers
torch
tesseract
System packages (packages.txt)
tesseract-ocr
tesseract-ocr-eng
The system packages install the Tesseract binary and its English language data, which pytesseract needs in order to run on Hugging Face Spaces.
Running Locally
pip install -r requirements.txt
python app.py
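Note that pytesseract is only a wrapper: the Tesseract binary itself is not installed by pip and must be available on PATH when running locally. As a hedged sketch, a small pre-flight check could look like this (the helper name `tesseract_available` is hypothetical and not part of this project):

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract found; OCR should work.")
    else:
        print("Tesseract not found; install it with your OS package manager.")
```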
Demo
Just upload an image, click Submit, and you're done!
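The upload-and-submit flow can be wired up with a Gradio `Interface`, which provides the Submit button by default. The sketch below is an assumption about how such a UI might be built, not the Space's actual code: `make_handler` and `build_demo` are hypothetical names, and the OCR and summary functions are passed in as plain callables so the handler logic can be tried without Tesseract or a model.

```python
from typing import Callable, Tuple


def make_handler(
    extract: Callable[[object], str],
    summarize: Callable[[str], str],
) -> Callable[[object], Tuple[str, str]]:
    """Wrap hypothetical OCR and summary callables into a UI handler."""

    def handler(image) -> Tuple[str, str]:
        if image is None:  # Submit was clicked without an upload
            return "Please upload an image first.", ""
        text = extract(image).strip()
        if not text:
            return "No text detected.", ""
        return text, summarize(text)

    return handler


def build_demo(handler):
    """Build the Gradio UI around a handler returning (text, summary)."""
    import gradio as gr  # imported lazily; requires the gradio package

    return gr.Interface(
        fn=handler,
        inputs=gr.Image(type="pil", label="Image"),
        outputs=[
            gr.Textbox(label="Extracted text"),
            gr.Textbox(label="Summary"),
        ],
        title="OCR Text Extractor + Summarizer",
    )
```

With real callables in place, `build_demo(make_handler(ocr_fn, summarize_fn)).launch()` would start the app, where `ocr_fn` and `summarize_fn` stand in for whatever OCR and summarization functions the app defines.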
Acknowledgements
Tesseract OCR
HuggingFace Transformers
Gradio for UI
Try the live Space