Soft-PHOC is an intermediate representation of images based on character probability maps.
This work has two implementations, one based on PyTorch and one on TensorFlow.
The Soft-PHOC annotation. For instance, if the transcription is “PINTU”, we show how the annotation of class “P” is defined for that transcription, based on the value at each level of the Soft-PHOC descriptor.
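The idea in the caption above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the repository's annotation code: the function name `soft_phoc_profile`, the choice of levels 1 to 5, the equal-width character layout, the usual PHOC rule that a character belongs to a region when at least half of it lies inside, and the equal-weight averaging over levels are all assumptions made for the example.

```python
import numpy as np


def soft_phoc_profile(word, char, levels=(1, 2, 3, 4, 5), width=100):
    """Illustrative 1-D Soft-PHOC-style profile of `char` across the word width.

    Assumptions (not taken from the repository): every character occupies an
    equal share of the word width, a character belongs to a region when at
    least half of it lies inside that region, and levels are averaged with
    equal weight.
    """
    n = len(word)
    xs = (np.arange(width) + 0.5) / width  # sample centres in (0, 1)
    profile = np.zeros(width)
    for level in levels:
        for region in range(level):
            r_lo, r_hi = region / level, (region + 1) / level
            for k, c in enumerate(word):
                if c != char:
                    continue
                c_lo, c_hi = k / n, (k + 1) / n
                overlap = min(r_hi, c_hi) - max(r_lo, c_lo)
                if overlap >= 0.5 * (c_hi - c_lo):
                    # Mark the whole region once for this level.
                    profile[(xs >= r_lo) & (xs < r_hi)] += 1
                    break
    return profile / len(levels)


if __name__ == "__main__":
    p = soft_phoc_profile("PINTU", "P")
    print(p[0], p[-1])
```

Under these assumptions, the profile for class “P” in “PINTU” is highest over the left end of the word and falls off to the right, since finer levels stop covering those positions.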
A Deep Convolutional Neural Network estimating Soft-PHOC descriptors.
The PyTorch implementation of Soft-PHOC training.
Find the environment at: environment.yml
conda install python=3.6 ipython pytorch=0.4 torchvision opencv=3.4.4 tensorboardx mkl=2019 tensorboard tensorflow tqdm scikit-image
- Required packages:
- PyTorch 0.4
- OpenCV 3.4.4
- mkl 2019
- tqdm
- scikit-image
- tensorboardX
- For training on ICDAR:
bash train_icdar.sh
- For training on SynthText:
bash train_synthText.sh
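Before launching training, it can help to confirm that the required packages listed above are importable in the active environment. The following is a minimal sketch, not part of the repository: the helper `check_packages` and the mapping from package names to import names are assumptions for illustration, and mkl is left out because it is a library dependency rather than something imported directly.

```python
import importlib

# Import names are assumptions mapped from the package list above.
REQUIRED = {
    "PyTorch": "torch",
    "OpenCV": "cv2",
    "tqdm": "tqdm",
    "scikit-image": "skimage",
    "tensorboardX": "tensorboardX",
}


def check_packages(modules):
    """Return {label: version string, or None if the module cannot be imported}."""
    found = {}
    for label, module_name in modules.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            found[label] = None
            continue
        # Not every module exposes __version__, so fall back to "unknown".
        found[label] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for label, version in check_packages(REQUIRED).items():
        print(f"{label}: {version if version is not None else 'MISSING'}")
```

The same helper can be pointed at the TensorFlow setup by passing a different mapping, for example one that includes `"TensorFlow": "tensorflow"`.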
The TensorFlow implementation of Soft-PHOC.
- Required packages:
- TensorFlow 1.10
- OpenCV 3.4.4
- mkl 2019
- tqdm
- scikit-image
- tensorboardX
- For training:
python fcn_32_train_generator_validation_summary.py
- The word spotting code, used to extract the query word, is in word_spotting.
- The code for visualizing the character heatmaps is in visualize_hm.
Please cite this work in your publications if it helps your research:
@article{Bazazian18-softPHOC,
author = {D.~Bazazian and D.~Karatzas and A.~Bagdanov},
title = {Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images},
journal = {EPIC workshop at European Conference on Computer Vision Workshop},
year = {2018},
ee = {arxiv.org/pdf/1809.00854.pdf}
}