This is the official code for the CVPR 2021 paper "Scene Text Telescope: Text-Focused Scene Image Super-Resolution". [link]
Set up an environment with Python 3.6 and install the required libraries with pip:
pip install -r requirement.txt
Here are some outputs produced with the TBSRN backbone and the text-focused modules enabled.
Download all resources from BaiduYunDisk (password: stt6) or Dropbox:
- TextZoom dataset
- Pretrained weights of CRNN
- Pretrained weights of Transformer-based recognizer
All resources should be placed under ./dataset/mydata, for example:
./dataset/mydata/train1
./dataset/mydata/train2
./dataset/mydata/pretrain_transformer.pth
...
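Before training, it can be useful to verify that the expected resources are in place. The following is a small, hypothetical sanity-check script (not part of the repo); the entries in REQUIRED are taken from the layout above, and you should extend the list for any files not shown here.

```python
import os

# Resources expected under ./dataset/mydata, based on the layout above
# (assumption: extend this list for files elided by "..." in the README).
REQUIRED = [
    "train1",
    "train2",
    "pretrain_transformer.pth",
]

def missing_resources(root="./dataset/mydata"):
    """Return the REQUIRED entries not found under `root`."""
    return [name for name in REQUIRED
            if not os.path.exists(os.path.join(root, name))]

if __name__ == "__main__":
    missing = missing_resources()
    if missing:
        print("Missing resources:", ", ".join(missing))
    else:
        print("All resources in place.")
```

Run it from the repo root; an empty result means the listed resources were found.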
Please remember to set the experiment name. The two text-focused modules are activated whenever --text_focus is used.
Train:
CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus
Test:
CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus --resume YOUR_MODEL --test --test_data_dir ./dataset/mydata/test
Demo:
CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus --demo --demo_dir ./demo
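The flags in the commands above can be sketched as an argparse configuration. This is a minimal illustration inferred from the commands, not the repo's actual main.py, which may define more options or different defaults.

```python
import argparse

# Sketch of the CLI flags used in the commands above (assumption: the real
# main.py may differ in defaults and additional options).
def build_parser():
    p = argparse.ArgumentParser(description="Scene Text Telescope runner (sketch)")
    p.add_argument("--batch_size", type=int, default=16)
    p.add_argument("--STN", action="store_true",
                   help="enable the spatial transformer network")
    p.add_argument("--exp_name", type=str, required=True,
                   help="experiment name for logs/checkpoints")
    p.add_argument("--text_focus", action="store_true",
                   help="activate the two text-focused modules")
    p.add_argument("--resume", type=str, default=None,
                   help="checkpoint to load when testing")
    p.add_argument("--test", action="store_true")
    p.add_argument("--test_data_dir", type=str, default="./dataset/mydata/test")
    p.add_argument("--demo", action="store_true")
    p.add_argument("--demo_dir", type=str, default="./demo")
    return p

# Example: parse the training command's flags ("sr_run" is a placeholder name).
args = build_parser().parse_args(
    ["--batch_size", "16", "--STN", "--exp_name", "sr_run", "--text_focus"])
```

Omitting both --test and --demo corresponds to the training command.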
We inherited most of the framework from TextZoom and use the pretrained model from CRNN. Thanks for their contributions!
@inproceedings{chen2021scene,
title={Scene Text Telescope: Text-Focused Scene Image Super-Resolution},
author={Chen, Jingye and Li, Bin and Xue, Xiangyang},
booktitle={CVPR},
pages={12026--12035},
year={2021}
}