GitHub - doc-doc/HQGA: Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)

HQGA: Video as Conditional Graph Hierarchy for Multi-Granular Question Answering

Todo

Release features of NExT-QA (BERT features are from NExT-QA) [2021/12/23].

Environment

Anaconda 4.8.4, Python 3.6.8, PyTorch 1.6 and CUDA 10.2. For other libraries, please refer to requirements.txt.

Install

Please create an environment for this project using Anaconda (install Anaconda first if you do not have it):

>conda create -n videoqa python==3.6.8
>conda activate videoqa
>git clone https://github.com/doc-doc/HQGA.git
>cd HQGA
>pip install -r requirements.txt
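
After installation, it can help to confirm that the active environment roughly matches the versions listed in the Environment section. The following is a minimal sketch, not part of the repository: the names EXPECTED, matches and collect_versions are hypothetical, and the script only reports differences, since other versions may also work.

```python
import importlib
import platform

# Versions listed in the Environment section of this README.
EXPECTED = {"python": "3.6.8", "torch": "1.6", "cuda": "10.2"}


def matches(found, expected):
    """True if `found` equals `expected` or extends it (e.g. 1.6.0 vs 1.6)."""
    return found is not None and (
        found == expected or found.startswith(expected + ".")
    )


def collect_versions():
    """Gather installed versions; an entry is None when unavailable."""
    found = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return found  # PyTorch is not installed in this environment
    found["torch"] = torch.__version__
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    return found


if __name__ == "__main__":
    for name, version in collect_versions().items():
        status = "ok" if matches(version, EXPECTED[name]) else "differs"
        print(f"{name}: found {version}, expected {EXPECTED[name]} ({status})")
```

Run it inside the activated videoqa environment. A "differs" line is only a hint to double-check the setup, not necessarily an error.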

Data Preparation

We use MSVD-QA as an example to help you get familiar with the code. Please download the pre-computed features and trained models here.

After downloading the data, please create a folder ['data/'] in the same directory as ['HQGA/'], then unzip the video and QA features into it. You will then have the directories ['data/msvd/' and 'HQGA/'] in your workspace. Finally, move the model file [.ckpt] into ['HQGA/models/msvd/'].
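
The layout described above can be checked before running anything. This is a small sketch under the assumption that the workspace contains exactly the directories named in this section. The function name check_layout is hypothetical and is not part of the repository.

```python
from pathlib import Path


def check_layout(workspace):
    """Return a list of problems with the MSVD-QA layout described above."""
    root = Path(workspace)
    problems = []
    # Directories named in the Data Preparation section.
    for rel in ("data/msvd", "HQGA/models/msvd"):
        if not (root / rel).is_dir():
            problems.append(f"missing directory: {rel}")
    # The trained model is expected as a .ckpt file under HQGA/models/msvd/.
    model_dir = root / "HQGA" / "models" / "msvd"
    if model_dir.is_dir() and not any(model_dir.glob("*.ckpt")):
        problems.append("no .ckpt file in HQGA/models/msvd")
    return problems


if __name__ == "__main__":
    # Assumes the script is run from the workspace that holds data/ and HQGA/.
    issues = check_layout(".")
    print("layout looks fine" if not issues else "\n".join(issues))
```

An empty list means the folders and a checkpoint file were found. It does not verify the contents of the feature files.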

Usage

Once the data is ready, you can run the code. First, to test the environment and code, we provide the predictions and trained model of HQGA on MSVD-QA. You can reproduce the results reported in the paper by running:

>python eval_oe.py

The command above will load the prediction file under ['results/msvd/'] and evaluate it. You can also obtain the prediction by running:

>./main.sh 0 test #Test the model with GPU id 0

The command above will load the model under ['models/msvd/'] and generate the prediction file. If you want to train the model (please follow our paper for details), run:

>./main.sh 0 train # Train the model with GPU id 0

It will train the model and save it to ['models/msvd/'].
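
For intuition about the evaluation step, open-ended QA is commonly scored by exact-match accuracy. The sketch below illustrates that idea only and is not the code in eval_oe.py. The prediction file format (a JSON object mapping each question id to an entry with "prediction" and "answer" fields) and the function names accuracy and evaluate_file are assumptions.

```python
import json


def accuracy(results):
    """Percentage of entries whose prediction exactly matches the answer.

    `results` is assumed to map a question id to a dict with
    "prediction" and "answer" keys (hypothetical format).
    """
    if not results:
        return 0.0
    correct = sum(1 for r in results.values() if r["prediction"] == r["answer"])
    return 100.0 * correct / len(results)


def evaluate_file(path):
    """Load a JSON prediction file in the assumed format and score it."""
    with open(path) as f:
        return accuracy(json.load(f))


if __name__ == "__main__":
    demo = {
        "q1": {"prediction": "dog", "answer": "dog"},
        "q2": {"prediction": "cat", "answer": "dog"},
    }
    print(f"accuracy: {accuracy(demo):.2f}")
```

If the actual prediction file uses a different structure, only the loading and field access need to change. The scoring itself stays a simple ratio of correct answers.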

Result

Models	NExT-Val	NExT-Test	TGIF-Action	TGIF-Transition	TGIF-FrameQA	MSRVTT-QA	MSVD-QA
HQGA	51.42	51.75	76.9	85.6	61.3	38.6	41.2

Visualization

Example from NExT-QA dataset.

Citation

@inproceedings{xiao2021video,
      title={Video as Conditional Graph Hierarchy for Multi-Granular Question Answering}, 
      author={Junbin Xiao and Angela Yao and Zhiyuan Liu and Yicong Li and Wei Ji and Tat-Seng Chua},
      booktitle={Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI)},
      year={2022},
      pages={2804-2812}
}

Acknowledgement

Our feature extraction for objects, and for frame appearance and motion, is from BUTD and HCRN respectively. Many thanks to the authors for their great work and code!

