Official PyTorch implementation of our CVPR 2023 highlight paper "Divide and Adapt: Active Domain Adaptation via Customized Learning". More details of this work can be found in our paper: [Arxiv] or [OpenAccess].
Our code is based on the CLUE implementation.
Active domain adaptation (ADA) aims to improve the model adaptation performance by incorporating active learning (AL) techniques to label a maximally-informative subset of target samples. Conventional AL methods do not consider the existence of domain shift and hence fail to identify the truly valuable samples in the context of domain adaptation. To accommodate active learning and domain adaptation, two naturally different tasks, in a collaborative framework, we advocate that a customized learning strategy for the target data is the key to the success of ADA solutions. We present Divide-and-Adapt (DiaNA), a new ADA framework that partitions the target instances into four categories with stratified transferable properties. With a novel data subdivision protocol based on uncertainty and domainness, DiaNA can accurately recognize the most gainful samples. While sending the informative instances for annotation, DiaNA employs tailored learning strategies for the remaining categories. Furthermore, we propose an informativeness score that unifies the data partitioning criteria. This enables the use of a Gaussian mixture model (GMM) to automatically sample unlabeled data into the proposed four categories. Thanks to the "divide-and-adapt" spirit, DiaNA can handle data with large variations of domain gap. In addition, we show that DiaNA can generalize to different domain adaptation settings, such as unsupervised domain adaptation (UDA), semi-supervised domain adaptation (SSDA), source-free domain adaptation (SFDA), etc.
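To make the subdivision protocol concrete, below is a minimal, hypothetical sketch (not the training code from this repo) of how a 4-component Gaussian mixture model could partition unlabeled target samples by a scalar informativeness score. The function name and the use of scikit-learn are assumptions for illustration; in the actual pipeline the score combines uncertainty and domainness, and each resulting category receives its own learning strategy.

```python
# Hypothetical sketch of GMM-based data subdivision by informativeness score.
# Assumes `scores` is a 1-D array holding one scalar score per unlabeled target sample.
import numpy as np
from sklearn.mixture import GaussianMixture

def partition_by_informativeness(scores: np.ndarray, n_categories: int = 4) -> np.ndarray:
    """Fit a GMM on per-sample scores and return a category index per sample,
    re-ordered so that category 0 has the lowest mean score and the last
    category the highest."""
    gmm = GaussianMixture(n_components=n_categories, random_state=0)
    labels = gmm.fit_predict(scores.reshape(-1, 1))
    # Re-map raw component ids so categories are sorted by component mean.
    order = np.argsort(gmm.means_.ravel())
    remap = np.empty_like(order)
    remap[order] = np.arange(n_categories)
    return remap[labels]

if __name__ == "__main__":
    # Toy usage: partition 1000 synthetic scores into four groups.
    rng = np.random.default_rng(0)
    fake_scores = rng.normal(size=1000)
    categories = partition_by_informativeness(fake_scores)
    print(np.bincount(categories))
```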
- Create an anaconda environment with Python 3.6 and activate it:

```bash
conda create -n diana python=3.6.8
conda activate diana
```

- Install dependencies:

```bash
pip install -r requirements.txt
```
For DomainNet, follow these steps:
- Download the original dataset for the domains of interest from this link, e.g., Clipart and Sketch.
- Run:

```bash
python preprocess_domainnet.py --input_dir data/domainNet/ \
    --domains 'clipart,sketch' \
    --output_dir 'data/post_domainNet/'
```
At round 0, active adaptation begins from a model trained on the source domain, or from a model first trained on the source and then adapted to the target via unsupervised domain adaptation. Skip this step if you want to train from scratch. Otherwise, download the checkpoints pretrained on each source domain at this link and unzip them to the checkpoints/source/domainnet folder.
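As an optional sanity check (not part of the repo's scripts), you can verify that a downloaded checkpoint loads with PyTorch; the filename below is hypothetical and depends on the contents of the unzipped archive.

```python
# Hypothetical sanity check for a downloaded source checkpoint.
# The exact filename is illustrative only and depends on the unzipped archive.
import torch

ckpt_path = "checkpoints/source/domainnet/clipart_source.pth"  # hypothetical name
state = torch.load(ckpt_path, map_location="cpu")
# Print the top-level keys (or object type) to confirm the checkpoint structure.
print(list(state.keys())[:10] if isinstance(state, dict) else type(state))
```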
We include hyperparameter configurations to reproduce the paper's numbers on DomainNet inside the config folder. For instance, to reproduce DomainNet (Clipart->Sketch) results with DiaNA, run:
```bash
python train.py --cfg_file config/domainnet/clipart2sketch.yml -a GMM -d self_ft
```
To reproduce results with CLUE+MME, run:
```bash
python train.py --cfg_file config/domainnet/clipart2sketch.yml -a CLUE -d mme
```
For other baseline methods, run:
```bash
python train.py --cfg_file config/domainnet/clipart2sketch.yml -a uniform
```
All currently supported baselines:
If you use this repository in your research, please cite the following paper and star this project. Thank you!
```bibtex
@inproceedings{huang2023divide,
  title={Divide and adapt: Active domain adaptation via customized learning},
  author={Huang, Duojun and Li, Jichang and Chen, Weikai and Huang, Junshi and Chai, Zhenhua and Li, Guanbin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7651--7660},
  year={2023}
}
```