Test-time task adaptation in few-shot learning aims to adapt a pre-trained task-agnostic model to capture the task-specific knowledge of the test task, relying only on a few labeled support samples. Previous attempts generally focus on developing advanced algorithms to achieve this goal, while neglecting the inherent problems of the given support samples. In fact, with only a handful of samples available, the adverse effect of either image noise (a.k.a. X-noise) or label noise (a.k.a. Y-noise) in the support samples can be severely amplified. To tackle this problem, we propose DEnoised Task Adaptation (DETA), a unified image- and label-denoising framework orthogonal to existing task adaptation approaches. Without extra supervision, DETA filters out task-irrelevant (i.e. noisy) global and local representations by taking advantage of both the global visual information and the local region details of support samples. On the challenging Meta-Dataset, DETA consistently improves the performance of a broad spectrum of baseline methods applied to various pre-trained models. Notably, by tackling the overlooked image noise in Meta-Dataset, DETA establishes new state-of-the-art results.
An overview of the proposed DETA (in a 2-way 3-shot example). During each iteration of task adaptation, the images, together with a collection of cropped local regions of the support samples, are first fed into a pre-trained model to extract image and region representations. Next, a Contrastive Relevance Aggregation (CoRA) module takes the region representations as input to determine the weight of each region, from which the image weights are computed by a momentum accumulator. Finally, a local compactness loss and a global dispersion loss are devised in a weighted embedding space for noise-robust representation learning. At inference, we retain only the adapted model to produce image representations of the support samples, on which we build a classifier guided by the refined image weights from the accumulator.
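The weighting flow in the overview can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the relevance score (mean same-class similarity minus mean other-class similarity), the sigmoid squashing, the momentum value of 0.9, and every function and variable name below are assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def region_weights(regions, labels):
    """Hypothetical relevance score per region (an assumption, not CoRA itself):
    mean similarity to same-class regions minus mean similarity to
    other-class regions, squashed into (0, 1) with a sigmoid."""
    weights = []
    for i, (r, y) in enumerate(zip(regions, labels)):
        same = [cosine(r, q) for j, (q, z) in enumerate(zip(regions, labels))
                if j != i and z == y]
        diff = [cosine(r, q) for q, z in zip(regions, labels) if z != y]
        intra = sum(same) / len(same) if same else 0.0
        inter = sum(diff) / len(diff) if diff else 0.0
        weights.append(1.0 / (1.0 + math.exp(-(intra - inter))))
    return weights

def update_image_weights(image_weights, region_w, region_to_image, momentum=0.9):
    """Momentum (exponential moving average) update of per-image weights,
    using the mean weight of the regions cropped from each image."""
    sums, counts = {}, {}
    for w, img in zip(region_w, region_to_image):
        sums[img] = sums.get(img, 0.0) + w
        counts[img] = counts.get(img, 0) + 1
    updated = list(image_weights)
    for img, total in sums.items():
        mean_w = total / counts[img]
        updated[img] = momentum * image_weights[img] + (1.0 - momentum) * mean_w
    return updated

# Toy 2-D region features, two regions per image.
regions = [
    [1.0, 0.0], [0.9, 0.1],  # image 0, class 0
    [1.0, 0.1], [0.8, 0.0],  # image 1, class 0
    [0.0, 1.0], [0.1, 0.9],  # image 2, labelled class 0 but resembles class 1
    [0.0, 1.0], [0.1, 1.0],  # image 3, class 1
]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
region_to_image = [0, 0, 1, 1, 2, 2, 3, 3]

w = region_weights(regions, labels)
img_w = update_image_weights([1.0, 1.0, 1.0, 1.0], w, region_to_image)
```

In this toy task, the regions of the mislabelled image 2 resemble the other class more than their own, so their scores are lower and the accumulated weight of image 2 decays faster than that of the clean images.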
- We propose DETA, the first unified image- and label-denoising framework for FSL.
- DETA can be flexibly plugged into different adapter-based and finetuning-based task adaptation paradigms.
- Extensive experiments on Meta-Dataset demonstrate the effectiveness and flexibility of DETA.
- Image-denoising on vanilla Meta-Dataset
- Label-denoising on label-corrupted Meta-Dataset
- State-of-the-art comparison
- Visualization of the cropped regions and calculated weights by CoRA.
- CAM visualization.
- Python 3.6 or greater
- PyTorch 1.0 or greater
- TensorFlow 1.14 or greater
- Clone or download this repository.
- Follow the "User instructions" in the Meta-Dataset repository for "Installation" and "Downloading and converting datasets".
- Edit ./meta-dataset/data/reader.py in the Meta-Dataset repository to change
  dataset = dataset.batch(batch_size, drop_remainder=False)
  to
  dataset = dataset.batch(batch_size, drop_remainder=True)
  (The code can run with drop_remainder=False, but in our work we drop the remainder so that very small batches are not used for some domains. We recommend dropping the remainder when reproducing our results.)
- Before doing anything, first run the following commands.
ulimit -n 50000
export META_DATASET_ROOT=<root directory of the cloned or downloaded Meta-Dataset repository>
export RECORDS=<the directory where tf-records of MetaDataset are stored>
- Enter the root directory of this project, i.e. the directory where this project was cloned or downloaded.
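The effect of the drop_remainder change in the setup steps above can be seen in a pure-Python sketch. The `batch` helper below is hypothetical and only mimics, for a plain list, what `tf.data.Dataset.batch` does with its `drop_remainder` argument. It is not code from the Meta-Dataset reader.

```python
def batch(items, batch_size, drop_remainder=False):
    """Group items into consecutive batches of batch_size.

    With drop_remainder=True, a final batch smaller than batch_size is
    discarded, so every returned batch has exactly batch_size items.
    """
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches

samples = list(range(10))
kept = batch(samples, batch_size=4, drop_remainder=False)
dropped = batch(samples, batch_size=4, drop_remainder=True)
```

With ten samples and a batch size of four, `kept` ends with a two-element batch, while `dropped` contains only the two full batches. This is the "very small batch" case that the recommended setting avoids.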
Specify a pretrained model to be adapted, and execute the following command.
- Baseline
python main.py --pretrained_model=MOCO --maxIt=40 --ratio=0. --test.type=10shot
- Ours
python main.py --pretrained_model=MOCO --maxIt=40 --ratio=0. --test.type=10shot --ours --n_regions=2
Note: set --ratio=0. for image-denoising, and set 0. < ratio < 1.0 for label-denoising.
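As a rough illustration of the label-denoising setting, the sketch below corrupts a fraction of support labels. It assumes that ratio denotes the fraction of support labels replaced by a different class. The note above does not spell this out, and the repository's actual corruption procedure may differ. The function name `corrupt_labels` and its arguments are hypothetical.

```python
import random

def corrupt_labels(labels, ratio, num_classes, seed=0):
    """Return a copy of labels with round(ratio * len(labels)) entries
    flipped to a different, randomly chosen class (assumed noise model)."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must satisfy 0.0 <= ratio < 1.0")
    if ratio > 0.0 and num_classes < 2:
        raise ValueError("need at least two classes to flip a label")
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = int(round(ratio * len(labels)))
    for idx in rng.sample(range(len(labels)), n_flip):
        # Pick any class other than the original one.
        choices = [c for c in range(num_classes) if c != labels[idx]]
        noisy[idx] = rng.choice(choices)
    return noisy

clean = [0] * 5 + [1] * 5
noisy = corrupt_labels(clean, ratio=0.3, num_classes=2)
n_changed = sum(a != b for a, b in zip(clean, noisy))
unchanged = corrupt_labels(clean, ratio=0.0, num_classes=2)
```

With ratio=0. the labels are returned untouched, which corresponds to the image-denoising setting where only image noise is present.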
[1] Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle; Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples; ICLR 2020.
[2] Wei-Hong Li, Xialei Liu, Hakan Bilen; Cross-domain Few-shot Learning with Task-specific Adapters; CVPR 2022.
[3] Chengming Xu, Siqian Yang, Yabiao Wang, Zhanxiong Wang, Yanwei Fu, Xiangyang Xue; Exploring Efficient Few-shot Adaptation for Vision Transformers; Transactions on Machine Learning Research 2022.
[4] Kevin J Liang, Samrudhdhi B Rangrej, Vladan Petrovic, Tal Hassner; Few-shot Learning with Noisy Labels; CVPR 2022.
[5] Pengguang Chen, Shu Liu, Jiaya Jia; Jigsaw Clustering for Unsupervised Visual Representation Learning; CVPR 2020.
We thank the authors of Meta-Dataset, URL/TSA, eTT, and JigsawClustering for their source code.
If you have any questions, you can
- contact me at: jizhang.jim@gmail.com
- open an issue at: https://github.com/nobody-1617/DETA.
To cite our paper, please use the following BibTeX:
@inproceedings{guo2021general,
title={DETA: Denoised Task Adaptation for Few-Shot Learning},
author={Zhang, Ji and Gao, Lianli and Luo, Xu and Shen, Heng Tao and Song, Jingkuan},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
year={2023}}