Bootstrapping Referring Multi-Object Tracking

Yani Zhang, Dongming Wu, Wencheng Han, Xingping Dong

Abstract. Referring multi-object tracking (RMOT) aims to detect and track multiple objects specified by a natural language expression. Existing RMOT benchmarks are usually built through manual annotation combined with static rules, which limits their diversity and their scope of application. In this work, our key idea is to bootstrap referring multi-object tracking by introducing as many discriminative language words as possible. Specifically, we first extend Refer-KITTI into a large-scale dataset named Refer-KITTI-V2. It starts with 2,719 manual annotations, which address class imbalance and introduce more keywords so that the data is closer to real-world scenarios than Refer-KITTI. These annotations are then expanded to a total of 9,758 by prompting large language models, yielding 617 different words and surpassing previous RMOT benchmarks. In addition, the end-to-end RMOT framework is bootstrapped with a simple yet elegant temporal advancement strategy, which achieves better performance than previous approaches.

Updates

[2024.06.19] Refer-KITTI-V2 is released. Thanks for your patience :)
[2024.06.12] Code is released. Paper is released on arXiv.

Getting Started

Installation

The environment setup follows MOTR, including the conda environment, PyTorch version, and other requirements.

Dataset

Please refer to the guide for downloading and organizing the dataset.

Training

Download the COCO-pretrained weights from Deformable DETR (the "+ iterative bounding box refinement" model).

Then train TempRMOT on Refer-KITTI-V2 as follows:

sh configs/temp_rmot_train.sh

To train TempRMOT on Refer-KITTI instead, run:

sh configs/temp_rmot_train_rk.sh

Note:

If RoBERTa does not work properly, download the RoBERTa weights from Hugging Face and load them from a local path.

Testing

For testing on Refer-KITTI-V2, you can run:

sh configs/temp_rmot_test.sh

For testing on Refer-KITTI, you can run:

sh configs/temp_rmot_test_rk.sh

You can get the main results by running the evaluation script:

cd TrackEval/script
sh evaluate_rmot.sh

Main Results

Refer-KITTI

Method	Dataset	HOTA	DetA	AssA	DetRe	DetPr	AssRe	AssPr	LocA	URL
TempRMOT	Refer-KITTI	52.21	40.95	66.75	55.65	59.25	71.82	87.76	90.40	model

Refer-KITTI-V2

Method	Dataset	HOTA	DetA	AssA	DetRe	DetPr	AssRe	AssPr	LocA	URL
TempRMOT	Refer-KITTI-V2	35.04	22.97	53.58	34.23	40.41	59.50	81.29	90.07	model
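
As a rough sanity check on the two tables above: in the HOTA metric, the score at a single localization threshold is the geometric mean of DetA and AssA at that threshold, and the reported HOTA, DetA, and AssA are each averaged over thresholds. The geometric mean of the averaged DetA and AssA therefore only approximates the reported HOTA, but it should land close to it. The sketch below illustrates this with the numbers from the tables; `approx_hota` is a hypothetical helper written for this example and is not part of TrackEval or this repository.

```python
import math


def approx_hota(det_a: float, ass_a: float) -> float:
    """Geometric mean of DetA and AssA (both in percent).

    This only approximates the reported HOTA, which averages the
    per-threshold geometric means rather than taking the geometric
    mean of the averaged scores.
    """
    return math.sqrt(det_a * ass_a)


# (DetA, AssA, reported HOTA) copied from the tables above.
reported = {
    "Refer-KITTI": (40.95, 66.75, 52.21),
    "Refer-KITTI-V2": (22.97, 53.58, 35.04),
}

for name, (det_a, ass_a, hota) in reported.items():
    estimate = approx_hota(det_a, ass_a)
    print(f"{name}: reported HOTA {hota:.2f}, sqrt(DetA * AssA) {estimate:.2f}")
```

If the estimate and the reported value differ by much more than a few tenths of a point after you rerun the evaluation, it may be worth checking that the columns were read in the right order.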

Citation

If you find TempRMOT or Refer-KITTI-V2 useful in your research, please consider citing:

@article{zhang2024bootstrapping,
  title={Bootstrapping Referring Multi-Object Tracking},
  author={Zhang, Yani and Wu, Dongming and Han, Wencheng and Dong, Xingping},
  journal={arXiv preprint arXiv:2406.05039},
  year={2024}
}

Acknowledgements

We thank the authors of the open-source projects that this work builds on.
