scanned-image-pdfs

Here are 6 public repositories matching this topic...

cseas / ocr-table

Extract tables from scanned image PDFs using Optical Character Recognition.

python shell ocr tesseract optical-character-recognition pdfminer extract-tables scanned-image-pdfs ocr-table

Updated Jun 9, 2020
Python

karolzak / boxdetect

BoxDetect is a Python package based on OpenCV that detects rectangular shapes, such as character boxes or checkboxes, on scanned forms.

opencv computer-vision forms checkbox documents checkboxes scanned-documents boxes handwritten-documents cv2 opencv-python bounding-boxes box-detection scanned-images rectangle-detection handwritten-character-recognition handwritten-characters scanned-image-pdfs handwritten-forms

Updated Jan 18, 2023
Python

timberger / Searchable-Image-PDF-Creat-O-Mat

This batch script creates a searchable PDF from a PDF containing one or more scanned pages stored as images.

pdf ghostscript imagemagick converter ocr drag drop tesseract scan batch scanned-documents batch-script scanned-pages imagemagick-wrapper searchable-pdfs scanned-image-pdfs tesseract-wrapper ghostscript-wrapper searchable-pdf

Updated Oct 22, 2022
Batchfile

rbrito / pkg-pdfbeads

Debian packaging of pdfbeads

pdf pdf-converter scanned-documents pdf-generation scanning scanned-image-pdfs

Updated May 11, 2020
Ruby

sxaxmz / handle_scanned_pdf

A wrapper around Python OCR tools such as pytesseract and easyocr that recognizes and extracts text embedded in images. It also converts scanned PDFs to text-searchable PDFs.

tesseract-ocr pytesseract ocr-python scanned-image-pdfs searchable-pdf easyocr scanned-pdf-documents extract-text-from-image extract-text-from-pdf

Updated Jul 6, 2024
Python

boomalope / misc

Growing collection of scripts that manipulate text data.

ocr twitter jupyter-notebook memory-management ngrams parallel-processing pdftotext manual-annotations textual-analysis pdftoimage ocr-python tagging-tool scanned-image-pdfs preprocessing-data

Updated Jan 3, 2022
Python
