The official PyTorch implementation for the following paper:
SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images,
Risa Shinoda, Kuniaki Saito, Shohei Tanaka, Tosho Hirasawa, Yoshitaka Ushiku
The AAAI-25 Workshop on Document Understanding and Intelligence
TL;DR: We introduce the SBSFigures generation pipeline, which creates figures with diverse topics, varied appearances, and precise QA pairs from a single bash script.
You can download our SBS Figures dataset (1M figures, 4.2M QA pairs) from Hugging Face: Hugging Face Dataset
You can also create SBSFigures using our generation pipeline.
The generation pipeline consists of the following Python scripts.
- data_topic.py: Create figure topics
- json_make.py: Create JSON files representing the data points
- add_color.py: Add color information to the JSON files
- create_chart.py: Create chart PNG files
- qa.py: Create QA pairs from the data points
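The stage-by-stage idea can be illustrated with a minimal sketch. This is not the repository's code: the JSON schema, the function names, and the fixed values are all hypothetical, the GPT-based generation is replaced by hard-coded stand-ins, and the rendering stage (create_chart.py) is skipped. It only shows why QA pairs built from the data points stay consistent with the figure.

```python
import json

# Hypothetical schema: the real JSON layout produced by json_make.py may differ.
def make_data_point(topic):
    """Stand-in for the data point stage: a fixed bar-chart data point."""
    return {
        "type": "bar",
        "title": topic,
        "x_label": "Year",
        "y_label": "Units sold",
        "series": [
            {"name": "2021", "value": 120},
            {"name": "2022", "value": 150},
            {"name": "2023", "value": 90},
        ],
    }

def add_color(data_point, palette):
    """Stand-in for the color stage: one color per entry, cycling the palette."""
    colored = json.loads(json.dumps(data_point))  # deep copy via JSON round trip
    for i, entry in enumerate(colored["series"]):
        entry["color"] = palette[i % len(palette)]
    return colored

def make_qa_pairs(data_point):
    """Stand-in for the QA stage: answers come from the values, not the image."""
    series = data_point["series"]
    highest = max(series, key=lambda e: e["value"])
    total = sum(e["value"] for e in series)
    return [
        {"question": "Which year has the highest value?", "answer": highest["name"]},
        {"question": "What is the sum of all values?", "answer": str(total)},
    ]

point = add_color(make_data_point("Bicycle sales"), ["#1f77b4", "#ff7f0e"])
qa_pairs = make_qa_pairs(point)
print(json.dumps(qa_pairs, indent=2))
```

Because each stage only reads and writes JSON, a stage can be rerun or swapped without touching the others.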
You can create SBSFigures with GPT using the following commands:
cd data_gen/gpt
bash crete_sbsfigures.sh
You need to edit config.yaml and add your OpenAI API key. Be careful: this GPT-based generation costs money, so start with a small number of figures. (By default, we set 15 figure generation attempts per figure type.)
We release four models through Hugging Face.
| Task | Model | Checkpoint Path |
|---|---|---|
| Pretrained | Donut | omron-sinicx/sbsfigures-pretrain-donut |
| Fine-tuned (ChartQA) | Donut | omron-sinicx/sbsfigures-chartqa-donut |
| Pretrained | Pix2Struct | omron-sinicx/sbsfigures-pretrain-pix2struct |
| Fine-tuned (ChartQA) | Pix2Struct | omron-sinicx/sbsfigures-chartqa-pix2struct |
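A minimal sketch of loading one of these checkpoints follows. It assumes the checkpoints are stored in a format that the standard Hugging Face transformers classes for Donut and Pix2Struct can read, which this README does not confirm. The helper name `load_checkpoint` and the task keys are hypothetical; only the checkpoint paths come from the table above.

```python
# Checkpoint paths copied from the table above; the dictionary keys are our own labels.
CHECKPOINTS = {
    ("pretrain", "donut"): "omron-sinicx/sbsfigures-pretrain-donut",
    ("chartqa", "donut"): "omron-sinicx/sbsfigures-chartqa-donut",
    ("pretrain", "pix2struct"): "omron-sinicx/sbsfigures-pretrain-pix2struct",
    ("chartqa", "pix2struct"): "omron-sinicx/sbsfigures-chartqa-pix2struct",
}

def load_checkpoint(task, model):
    """Return (processor, model) for one released checkpoint.

    Assumption: the checkpoints load with the stock transformers classes.
    """
    # Imported lazily so the lookup table works without transformers installed.
    from transformers import (
        DonutProcessor,
        Pix2StructForConditionalGeneration,
        Pix2StructProcessor,
        VisionEncoderDecoderModel,
    )

    path = CHECKPOINTS[(task, model)]
    if model == "donut":
        return (
            DonutProcessor.from_pretrained(path),
            VisionEncoderDecoderModel.from_pretrained(path),
        )
    return (
        Pix2StructProcessor.from_pretrained(path),
        Pix2StructForConditionalGeneration.from_pretrained(path),
    )
```

Calling `load_checkpoint("chartqa", "donut")` downloads the weights on first use. The prompt format expected at inference time is not described here, so check the test scripts in this repository before running generation.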
docker build:
docker build -t sbsfigures:latest -f SBSFigures/Dockerfile SBSFigures
docker run:
docker run -it --rm -v SBSFigures:/app sbsfigures:latest /bin/bash
Pre-training with Donut:
cd donut
bash pre-train_sbsfigures.sh
Pre-training with Pix2Struct:
cd pix2struct
bash pre-train_sbsfigures.sh
Fine-tuning on ChartQA with Donut:
cd donut
bash finetune_chartqa.sh
Fine-tuning on ChartQA with Pix2Struct:
cd pix2struct
bash finetune_chartqa.sh
For fine-tuning, we borrow some code from UniChart.
Evaluation on ChartQA with Donut:
cd donut
bash test_chartqa.sh
Evaluation on ChartQA with Pix2Struct:
cd pix2struct
bash test_chartqa.sh
This creates two text files that compare the ground truth with the output results and calculates the 5% relaxed accuracy.
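For readers unfamiliar with the metric, here is a minimal sketch of 5% relaxed accuracy as it is commonly defined for ChartQA: a numeric prediction counts as correct when its relative error against the ground truth is at most 5%, and a non-numeric prediction must match exactly. The function names and the parsing rules (percent signs, case folding) are assumptions; the test scripts in this repository may handle edge cases differently.

```python
def _to_float(text):
    """Parse a number, tolerating a trailing percent sign; return None otherwise."""
    try:
        return float(text.strip().rstrip("%"))
    except ValueError:
        return None

def relaxed_match(prediction, target, tolerance=0.05):
    """True if the prediction matches the target under the relaxed rule."""
    pred_num, target_num = _to_float(prediction), _to_float(target)
    if pred_num is not None and target_num is not None:
        if target_num == 0:
            # Relative error is undefined at zero, so require an exact match.
            return pred_num == 0
        return abs(pred_num - target_num) / abs(target_num) <= tolerance
    # Non-numeric answers: case-insensitive exact match (an assumption).
    return prediction.strip().lower() == target.strip().lower()

def relaxed_accuracy(predictions, targets, tolerance=0.05):
    """Fraction of predictions that match their targets under the relaxed rule."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(relaxed_match(p, t, tolerance) for p, t in zip(predictions, targets))
    return hits / len(targets)
```

For example, a prediction of 104 against a ground truth of 100 is accepted, while 106 is rejected.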
- Customize fonts: Edit data_gen/gpt/font.txt to add or remove fonts based on your environment for chart creation.
- Create domain-specific figures: Modify the prompt in data_gen/gpt/data_topic.py to generate figures tailored to a specific domain.
- Add a new figure type: To add a figure type (e.g., a leather chart), define the code_format in data_gen/code_format, specify its JSON style, and add examples to data_gen/example/data_point/(your new figure type).
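When customizing fonts, it can help to check which requested fonts actually exist on your machine before generating charts. The sketch below is a hypothetical helper, not part of the repository. It assumes font.txt holds one font name per line and that charts are rendered with matplotlib; neither is stated in this README.

```python
def read_font_list(text):
    """Parse font names, assuming one name per line; blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def split_by_availability(requested, installed):
    """Split requested fonts into (available, missing), ignoring case."""
    installed_lower = {name.lower() for name in installed}
    available = [f for f in requested if f.lower() in installed_lower]
    missing = [f for f in requested if f.lower() not in installed_lower]
    return available, missing

def installed_matplotlib_fonts():
    """Font names matplotlib can find (imported lazily; requires matplotlib)."""
    from matplotlib import font_manager

    return sorted({entry.name for entry in font_manager.fontManager.ttflist})

# Example with an inline font list instead of reading font.txt from disk.
requested = read_font_list("DejaVu Sans\n\nComic Sans MS\n")
available, missing = split_by_availability(requested, ["DejaVu Sans", "DejaVu Serif"])
print("available:", available)
print("missing:", missing)
```

In practice you would pass the contents of data_gen/gpt/font.txt to `read_font_list` and the result of `installed_matplotlib_fonts()` as the installed list, then remove the missing names from the file.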
If you find our work useful for your research, please consider citing our paper:
@article{shinoda2024sbsfigurespretrainingfigure,
title={SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images},
author={Risa Shinoda and Kuniaki Saito and Shohei Tanaka and Tosho Hirasawa and Yoshitaka Ushiku},
year={2024},
journal={arXiv preprint arXiv:2412.17606},
url={https://arxiv.org/abs/2412.17606}
}