This is the official repository for the ICLR 2024 paper "Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition".
This repo follows the Visual Geo-localization Benchmark; you can refer to VPR-datasets-downloader to prepare the datasets.
The dataset should be organized in a directory tree as follows:

```
├── datasets_vg
│   └── datasets
│       └── pitts30k
│           └── images
│               ├── train
│               │   ├── database
│               │   └── queries
│               ├── val
│               │   ├── database
│               │   └── queries
│               └── test
│                   ├── database
│                   └── queries
```
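Before launching training, it can help to confirm the tree matches the expected layout. Below is a minimal, hypothetical helper (not part of this repo's scripts; the training code performs its own checks) that reports any missing split directories under `<datasets_folder>/<dataset_name>/images`:

```python
import os

# Expected sub-directories under <datasets_folder>/<dataset_name>/images,
# following the tree shown above.
EXPECTED = [
    os.path.join(split, sub)
    for split in ("train", "val", "test")
    for sub in ("database", "queries")
]

def check_dataset_tree(images_root):
    """Return the list of expected sub-directories missing under images_root."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(images_root, d))]
```

An empty return value means the six `database`/`queries` folders are all in place.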
Before training, you should download the pre-trained foundation model DINOv2 (ViT-L/14) here.
Finetuning on MSLS

```shell
python3 train.py --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=msls --queries_per_epoch=30000 --foundation_model_path /path/to/pre-trained/dinov2_vitl14_pretrain.pth
```
Further finetuning on Pitts30k

```shell
python3 train.py --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=pitts30k --queries_per_epoch=5000 --resume /path/to/finetuned/msls/model/SelaVPR_msls.pth
```

To evaluate the finetuned model on Pitts30k, run:

```shell
python3 eval.py --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=pitts30k --resume /path/to/finetuned/pitts30k/model/SelaVPR_pitts30k.pth
```
The model finetuned on MSLS (for diverse scenes).
| DOWNLOAD | MSLS-val | | | Nordland-test | | | St. Lucia | | |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| | R@1 | R@5 | R@10 | R@1 | R@5 | R@10 | R@1 | R@5 | R@10 |
| LINK | 90.8 | 96.4 | 97.2 | 85.2 | 95.5 | 98.5 | 99.8 | 100.0 | 100.0 |
The model further finetuned on Pitts30k (only for urban scenes).
| DOWNLOAD | Tokyo24/7 | | | Pitts30k | | | Pitts250k | | |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| | R@1 | R@5 | R@10 | R@1 | R@5 | R@10 | R@1 | R@5 | R@10 |
| LINK | 94.0 | 96.8 | 97.5 | 92.8 | 96.8 | 97.7 | 95.7 | 98.8 | 99.2 |
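The R@1/R@5/R@10 numbers above are standard Recall@N: a query counts as correct if at least one of its top-N retrieved database images is a ground-truth positive. A minimal sketch of the metric (a hypothetical helper for illustration, not this repo's evaluation code):

```python
def recall_at_n(ranked_db_indices, positives_per_query, n_values=(1, 5, 10)):
    """Percentage of queries whose top-N retrieved images contain a true positive.

    ranked_db_indices: one list of database indices per query, best match first.
    positives_per_query: one set of ground-truth positive indices per query.
    """
    recalls = {}
    for n in n_values:
        hits = sum(
            1
            for ranking, positives in zip(ranked_db_indices, positives_per_query)
            if any(idx in positives for idx in ranking[:n])
        )
        recalls[n] = 100.0 * hits / len(ranked_db_indices)
    return recalls
```

For example, with two queries where only the first finds its positive at rank 2, Recall@1 is 0.0 and Recall@2 is 50.0.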
Parts of this repo are inspired by the following repositories:
Visual Geo-localization Benchmark
If you find this repo useful for your research, please consider citing the paper:
```
@inproceedings{selavpr,
  title={Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition},
  author={Lu, Feng and Zhang, Lijun and Lan, Xiangyuan and Dong, Shuting and Wang, Yaowei and Yuan, Chun},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024}
}
```