(Over)fit OCR to your dataset with genetic algorithms. (outdated, ping for update)

HairyFotr/OCRTrain

-------------
 OCR Trainer
-------------

1. Put your images into img/
2. Put your transcriptions into a text file, one image per line, as "filename transcription"
3. Run runTrain <yourTextFile>
4. Wait and wait and wait :)
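
A rough sketch of how the transcription file from step 2 could be read, in case you want to sanity-check it before a long training run. This is not code from this repo: the function names are made up for illustration, and it assumes one entry per line, a filename without spaces, and everything after the first run of whitespace being the transcription.

```python
from pathlib import Path


def parse_transcriptions(text):
    """Parse lines of the form 'filename transcription' into a dict.

    Assumption (not confirmed by the README): filenames contain no spaces,
    so the first run of whitespace separates filename from transcription.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)
        filename = parts[0]
        # A line with only a filename maps to an empty transcription.
        transcription = parts[1] if len(parts) > 1 else ""
        entries[filename] = transcription
    return entries


def missing_images(entries, img_dir="img"):
    """Return the listed filenames that have no matching file in img_dir."""
    return sorted(
        name for name in entries if not (Path(img_dir) / name).is_file()
    )
```

Running missing_images(parse_transcriptions(Path("yourTextFile").read_text())) before step 3 would list any transcribed images that never made it into img/.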

Things you need:
  sudo apt-get install imagemagick tesseract-ocr gocr cuneiform ocrad
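
If you want to check that the tools are actually on your PATH after installing, something like the sketch below should do. The executable names are an assumption based on what these packages usually install (for example, the imagemagick package typically provides `convert`); adjust the list if your system differs.

```python
import shutil

# Assumption: executable names commonly provided by the packages listed above.
REQUIRED_TOOLS = ["convert", "tesseract", "gocr", "cuneiform", "ocrad"]


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All tools found.")
```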

Things you should probably set (a.k.a. things I should abstract away into files):
  allowedCharacters / stringFilter
  the string scoring algorithm appropriate for your data
  the sequence of param-changing algorithms
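
To give an idea of what the first two settings might look like, here is a minimal sketch of a character filter plus a string score. It is an illustration, not the trainer's actual code: the function names are hypothetical, and Levenshtein edit distance is just one plausible choice of scoring algorithm.

```python
def filter_string(text, allowed_characters):
    """Drop every character that is not in allowed_characters."""
    return "".join(ch for ch in text if ch in allowed_characters)


def edit_distance(a, b):
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def score(ocr_output, expected, allowed_characters):
    """Score in [0, 1]: 1.0 is an exact match after filtering."""
    cleaned = filter_string(ocr_output, allowed_characters)
    longest = max(len(cleaned), len(expected))
    if longest == 0:
        return 1.0  # both strings empty counts as a perfect match
    return 1.0 - edit_distance(cleaned, expected) / longest
```

Filtering before scoring means stray punctuation from the OCR engine does not count against a parameter set when you only care about, say, digits.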
