A project to index a manga and make the entirety of its text searchable.
Sauce: Mayonaka Heart Tune
This project is powered by cubari.moe for the images. The OCR engine can be either Surya OCR or EasyOCR (default: surya). More options will be added in the future.
Example inputs and outputs are included to show how the project is meant to be used (e.g. using easyocr).
You can run
python comic_ocr.py -i <manga-link>
Or you can skip the parameter and the script will prompt you for the link.
Run demo.py to draw bounding boxes and annotate a sample image.
Running the OCR on local images is possible, though arguably not in the most convenient way. You'd have to prepend file:/// to your file paths and then run:
python comic_ocr.py -i file:///<json_path> --no_cubari
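If typing the file:/// prefix by hand is error-prone (relative paths, spaces, Windows drive letters), a small helper can build the URI for you. This is only a sketch using Python's standard library: the function name `to_file_uri` and the example path are made up for illustration and are not part of this project.

```python
from pathlib import Path


def to_file_uri(path: str) -> str:
    """Return an absolute file:/// URI for a local path (hypothetical helper)."""
    # resolve() makes the path absolute; as_uri() only accepts absolute paths
    # and percent-encodes characters such as spaces.
    return Path(path).resolve().as_uri()


# Hypothetical example path; replace it with the path to your own JSON file.
uri = to_file_uri("chapters/local_series.json")
print(f"python comic_ocr.py -i {uri} --no_cubari")
```

The printed line is the command from above with the prefix already filled in, ready to copy into a terminal.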
Gists hosted on GitHub or elsewhere can be used in a similar manner (using the --no_cubari flag).
To switch between models, use the -m flag. Currently supported options are surya and easyocr.
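To script several runs (for example, comparing both models on the same link), the flags described above can be assembled programmatically. The following is a rough sketch, not code from this repository: `build_command` and `SUPPORTED_MODELS` are hypothetical names, and it assumes `python` is on your PATH and that you run it from the project directory.

```python
SUPPORTED_MODELS = ("surya", "easyocr")  # the two options listed in this README


def build_command(source: str, model: str = "surya", no_cubari: bool = False) -> list[str]:
    """Build the argument list for comic_ocr.py (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    # Only the flags documented in this README are used: -i, -m, --no_cubari.
    cmd = ["python", "comic_ocr.py", "-i", source, "-m", model]
    if no_cubari:
        cmd.append("--no_cubari")
    return cmd


# Print one command per model for the same (placeholder) link.
for name in SUPPORTED_MODELS:
    print(" ".join(build_command("<manga-link>", model=name)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to actually start the OCR run.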
This project is licensed under the MIT License.
[!NOTE] : Not affiliated with Comic Sans MS