We provide several helpful scripts to prepare data, to convert VISSL models to detectron2 compatible models or to convert caffe2 models to VISSL compatible models.
VISSL supports benchmarks inspired by the VTAB and CLIP papers, for which the datasets do not directly exist but are transformations of existing datasets.
To run these benchmarks, the following data preparation scripts are mandatory:
- create_clevr_count_data_files.py: to create a disk_filelist dataset from CLEVR where the goal is to count the number of objects in the scene
- create_clevr_dist_data_files.py: to create a disk_filelist dataset from CLEVR where the goal is to estimate the distance to the closest object in the scene
- create_dsprites_location_data_files.py: to create a disk_folder dataset from dSprites where the goal is to estimate the x coordinate of the sprite in the scene
- create_dsprites_orientation_data_files.py: to create a disk_folder dataset from dSprites where the goal is to estimate the orientation of the sprite in the scene
- create_euro_sat_data_files.py: to transform the EUROSAT dataset to the disk_folder format
- create_food101_data_files.py: to transform the FOOD101 dataset to the disk_folder format
- create_kitti_dist_data_files.py: to create a disk_folder dataset from KITTI where the goal is to estimate the distance of the closest car, van or truck
- create_patch_camelyon_data_files.py: to transform the PatchCamelyon dataset to the disk_folder format
- create_small_norb_azimuth_data_files.py: to create a disk_folder dataset from Small NORB where the goal is to find the azimuth of the photographed object
- create_small_norb_elevation_data_files.py: to create a disk_folder dataset from Small NORB where the goal is to predict the elevation in the image
- create_ucf101_data_files.py: to create a disk_folder image action recognition dataset from the video action recognition dataset UCF101 by extracting the middle frame
All of these scripts follow the same easy-to-use interface:
python create_[***]_data_files.py -i /path/to/input_dataset -o /path/to/transformed/dataset -d
- -i: gives the path to the dataset in its official format
- -o: gives the path to the output transformed dataset (the one to feed to VISSL)
- -d: (optional) automatically downloads the dataset in the input path
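The shared interface can be illustrated with a small argparse sketch. This is not the code of the VISSL scripts: the long option names (--input, --output, --download) and the help strings are assumptions made for illustration, and only the short flags -i, -o and -d come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the -i / -o / -d interface described above.
    # The long option names are assumptions, not the scripts' real options.
    parser = argparse.ArgumentParser(description="Sketch of a data preparation CLI")
    parser.add_argument("-i", "--input", required=True,
                        help="Path to the dataset in its official format")
    parser.add_argument("-o", "--output", required=True,
                        help="Path where the transformed dataset is written")
    parser.add_argument("-d", "--download", action="store_true",
                        help="Download the dataset into the input path first")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the demo self-contained.
demo_args = build_parser().parse_args(
    ["-i", "/path/to/input_dataset", "-o", "/path/to/transformed/dataset", "-d"]
)
print(demo_args.input, demo_args.output, demo_args.download)
```

Because -d is a store_true flag in this sketch, omitting it leaves the download step off, which matches its optional role.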
Scripts producing a disk_filelist format will create the following structure:
output_folder/
train_images.npy # Paths to the train images
train_labels.npy # Labels for each of the train images
val_images.npy # Paths to the val images
val_labels.npy # Labels for each of the val images
These files should be referenced in the dataset_catalog.json like so:
"dataset_filelist": {
"train": ["/path/to/train_images.npy", "/path/to/train_labels.npy"],
"val": ["/path/to/val_images.npy", "/path/to/val_labels.npy"]
},
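As a rough sketch, an entry of this shape can be generated from the output folder rather than typed by hand. The helper name filelist_catalog_entry is hypothetical, and the sketch only assumes the four file names listed above.

```python
import json
from pathlib import PurePosixPath


def filelist_catalog_entry(name: str, output_folder: str) -> dict:
    """Build a dataset_catalog.json entry for a disk_filelist output folder.

    Hypothetical helper: it only assumes the <split>_images.npy and
    <split>_labels.npy file names described above.
    """
    root = PurePosixPath(output_folder)
    return {
        name: {
            split: [str(root / f"{split}_images.npy"), str(root / f"{split}_labels.npy")]
            for split in ("train", "val")
        }
    }


entry = filelist_catalog_entry("dataset_filelist", "/path/to")
print(json.dumps(entry, indent=4))
```

The printed JSON can then be pasted into dataset_catalog.json next to the existing entries.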
Scripts producing a disk_folder format will create the following structure:
train/
label1/
image_1.jpeg
image_2.jpeg
...
label2/
image_x.jpeg
image_y.jpeg
...
...
val/
label1/
image_1.jpeg
image_2.jpeg
...
label2/
image_x.jpeg
image_y.jpeg
...
...
These files should be referenced in the dataset_catalog.json like so:
"dataset_folder": {
"train": ["/path/to/dataset/train", "<ignored>"],
"val": ["/path/to/dataset/val", "<ignored>"]
},
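A quick way to sanity-check a disk_folder output is to count the files under each label folder of a split. The sketch below is not part of VISSL: the function name count_images_per_label is hypothetical, and the demo builds a throwaway layout with made-up label names.

```python
import tempfile
from pathlib import Path


def count_images_per_label(split_dir: str) -> dict:
    """Return {label: number of files} for one split of a disk_folder dataset."""
    counts = {}
    for label_dir in sorted(Path(split_dir).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = sum(1 for f in label_dir.iterdir() if f.is_file())
    return counts


# Demo on a temporary layout (label names and image counts are made up).
with tempfile.TemporaryDirectory() as tmp:
    for label, n_images in (("label1", 2), ("label2", 3)):
        label_dir = Path(tmp) / "train" / label
        label_dir.mkdir(parents=True)
        for i in range(n_images):
            (label_dir / f"image_{i}.jpeg").touch()
    demo_counts = count_images_per_label(str(Path(tmp) / "train"))

print(demo_counts)
```

Running the same function on the train and val folders of a prepared dataset gives a per-class overview before the paths are added to dataset_catalog.json.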
The following sections will describe each of these data preparation scripts in detail.
Run the create_clevr_count_data_files.py script with the -d option as follows:
python extra_scripts/create_clevr_count_data_files.py \
-i /path/to/clevr/ \
-o /output_path/to/clevr_count \
-d
The folder /output_path/to/clevr_count now contains the CLEVR/Count disk_filelist dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"clevr_count_filelist": {
"train": ["/output_path/to/clevr_count/train_images.npy", "/output_path/to/clevr_count/train_labels.npy"],
"val": ["/output_path/to/clevr_count/val_images.npy", "/output_path/to/clevr_count/val_labels.npy"]
},
Alternatively, download the full dataset manually by visiting the CLEVR website and clicking on Download CLEVR v1.0 (18 GB). Expand the archive.
The resulting folder should have the following structure:
/path/to/clevr/
CLEVR_v1.0/
COPYRIGHT.txt
LICENSE.txt
README.txt
images/
train/
... 75000 images ...
val/
... 15000 images ...
test/
... 15000 images ...
questions/
CLEVR_test_questions.json
CLEVR_train_questions.json
CLEVR_val_questions.json
scenes/
CLEVR_train_scenes.json
CLEVR_val_scenes.json
Run the script, where /path/to/clevr/ is the path of the folder containing the CLEVR_v1.0 folder:
python extra_scripts/create_clevr_count_data_files.py \
-i /path/to/clevr/ \
-o /output_path/to/clevr_count
The folder /output_path/to/clevr_count now contains the CLEVR/Count dataset.
Follow the exact same steps as for the preparation of the CLEVR/Count dataset described above, but use create_clevr_dist_data_files.py instead of create_clevr_count_data_files.py.
Once the dataset is prepared and available at /path/to/clevr_dist, the last step is to set this path in dataset_catalog.json and you are good to go:
"clevr_dist_filelist": {
"train": ["/path/to/clevr_dist/train_images.npy", "/path/to/clevr_dist/train_labels.npy"],
"val": ["/path/to/clevr_dist/val_images.npy", "/path/to/clevr_dist/val_labels.npy"]
},
Run the create_dsprites_location_data_files.py script with the -d option as follows:
python extra_scripts/create_dsprites_location_data_files.py \
-i /path/to/dsprites/ \
-o /output_path/to/dsprites_loc \
-d
The folder /output_path/to/dsprites_loc now contains the dSprites/location disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"dsprites_loc_folder": {
"train": ["/output_path/to/dsprites_loc/train", "<ignored>"],
"val": ["/output_path/to/dsprites_loc/val", "<ignored>"]
},
Run the create_dsprites_orientation_data_files.py script with the -d option as follows:
python extra_scripts/create_dsprites_orientation_data_files.py \
-i /path/to/dsprites/ \
-o /output_path/to/dsprites_orient \
-d
The folder /output_path/to/dsprites_orient now contains the dSprites/orientation disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"dsprites_orient_folder": {
"train": ["/output_path/to/dsprites_orient/train", "<ignored>"],
"val": ["/output_path/to/dsprites_orient/val", "<ignored>"]
},
Run the create_euro_sat_data_files.py script with the -d option as follows:
python extra_scripts/create_euro_sat_data_files.py \
-i /path/to/euro_sat/ \
-o /output_path/to/euro_sat \
-d
The folder /output_path/to/euro_sat now contains the EuroSAT disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"euro_sat_folder": {
"train": ["/output_path/to/euro_sat/train", "<ignored>"],
"val": ["/output_path/to/euro_sat/val", "<ignored>"]
},
Run the create_food101_data_files.py script with the -d option as follows:
python extra_scripts/create_food101_data_files.py \
-i /path/to/food101/ \
-o /output_path/to/food101 \
-d
The folder /output_path/to/food101 now contains the Food101 disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"food101_folder": {
"train": ["/output_path/to/food101/train", "<ignored>"],
"val": ["/output_path/to/food101/val", "<ignored>"]
},
Run the create_kitti_dist_data_files.py script with the -d option as follows:
python extra_scripts/create_kitti_dist_data_files.py \
-i /path/to/kitti/ \
-o /output_path/to/kitti_distance \
-d
The folder /output_path/to/kitti_distance now contains the KITTI/distance disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"kitti_dist_folder": {
"train": ["/output_path/to/kitti_distance/train", "<ignored>"],
"val": ["/output_path/to/kitti_distance/val", "<ignored>"]
},
Run the create_patch_camelyon_data_files.py script with the -d option as follows:
python extra_scripts/create_patch_camelyon_data_files.py \
-i /path/to/pcam/ \
-o /output_path/to/pcam \
-d
The folder /output_path/to/pcam now contains the Patch Camelyon disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"pcam_folder": {
"train": ["/output_path/to/pcam/train", "<ignored>"],
"val": ["/output_path/to/pcam/val", "<ignored>"]
},
Run the create_small_norb_azimuth_data_files.py script with the -d option as follows:
python extra_scripts/create_small_norb_azimuth_data_files.py \
-i /path/to/snorb/ \
-o /output_path/to/snorb_azimuth \
-d
The folder /output_path/to/snorb_azimuth now contains the SmallNORB/azimuth disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"small_norb_azimuth_folder": {
"train": ["/output_path/to/snorb_azimuth/train", "<ignored>"],
"val": ["/output_path/to/snorb_azimuth/val", "<ignored>"]
},
Run the create_small_norb_elevation_data_files.py script with the -d option as follows:
python extra_scripts/create_small_norb_elevation_data_files.py \
-i /path/to/snorb/ \
-o /output_path/to/snorb_elevation \
-d
The folder /output_path/to/snorb_elevation now contains the SmallNORB/elevation disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"small_norb_elevation_folder": {
"train": ["/output_path/to/snorb_elevation/train", "<ignored>"],
"val": ["/output_path/to/snorb_elevation/val", "<ignored>"]
},
Run the create_stanford_cars_data_files.py script with the -d option as follows:
python extra_scripts/create_stanford_cars_data_files.py \
-i /path/to/cars/ \
-o /output_path/to/cars \
-d
The folder /output_path/to/cars now contains the Stanford Cars disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"stanford_cars_folder": {
"train": ["/output_path/to/cars/train", "<ignored>"],
"val": ["/output_path/to/cars/val", "<ignored>"]
},
Run the create_ucf101_data_files.py script with the -d option as follows:
python extra_scripts/create_ucf101_data_files.py \
-i /path/to/ucf101/ \
-o /output_path/to/ucf101 \
-d
The folder /output_path/to/ucf101 now contains the UCF101 image action recognition disk_folder dataset.
The last step is to set this path in dataset_catalog.json and you are good to go:
"ucf101_folder": {
"train": ["/output_path/to/ucf101/train", "<ignored>"],
"val": ["/output_path/to/ucf101/val", "<ignored>"]
},
Alternatively, download the full dataset manually by visiting the UCF101 website:
- Click the link in "The UCF101 data set can be downloaded by clicking here" to retrieve the data (all the videos).
- Click the link in "The Train/Test Splits for Action Recognition on UCF101 data set can be downloaded by clicking here" to retrieve the splits.
Expand both archives in the same folder, say /path/to/ucf101.
The resulting folder should have the following structure:
ucf101/
UCF-101/
ApplyEyeMakeup/
... videos ...
ApplyLipstick/
... videos ...
ucfTrainTestlist/
classInd.txt
testlist01.txt
testlist02.txt
testlist03.txt
trainlist01.txt
trainlist02.txt
trainlist03.txt
Run the following command (where /path/to/ucf101 is the path of the folder above):
python extra_scripts/create_ucf101_data_files.py \
-i /path/to/ucf101/ \
-o /output_path/to/ucf101
The folder /output_path/to/ucf101 now contains the UCF101 image action recognition dataset.
The following scripts are optional, as VISSL's dataset_catalog.py supports reading the downloaded data directly for most of these datasets. For all the datasets below, we assume that the data is in the format described here.
python extra_scripts/create_coco_data_files.py \
--json_annotations_dir /path/to/coco/annotations/ \
--output_dir /tmp/vissl/datasets/coco/ \
--train_imgs_path /path/to/coco/train2014 \
--val_imgs_path /path/to/coco/val2014
- for VOC2007: data_source_dir='/mnt/fair/VOC2007/' output_dir='/path/to/my/output/dir/voc2007/'
- for VOC2012: data_source_dir='/mnt/fair/VOC2012/' output_dir='/path/to/my/output/dir/voc2012'
- For VOC2007 dataset:
python extra_scripts/create_voc_data_files.py \
--data_source_dir /path/to/VOC2007/ \
--output_dir /tmp/vissl/datasets/voc07/
- For VOC2012 dataset:
python extra_scripts/create_voc_data_files.py \
--data_source_dir /path/to/VOC2012/ \
--output_dir /tmp/vissl/datasets/voc12/
python extra_scripts/create_imagenet_data_files.py \
--data_source_dir /path/to/imagenet_full_size/ \
--output_dir /tmp/vissl/datasets/imagenet1k/
python extra_scripts/create_imagenet_data_files.py \
--data_source_dir /path/to/places205/ \
--output_dir /tmp/vissl/datasets/places205/
python extra_scripts/create_imagenet_data_files.py \
--data_source_dir /path/to/places365/ \
--output_dir /tmp/vissl/datasets/places365/
Low-shot image classification is one of the benchmark tasks in the paper. VISSL supports low-shot sampling and benchmarking on the PASCAL VOC dataset only.
We train on the trainval split of the VOC2007 dataset, which has 5011 images and 20 classes. Hence the labels are of shape 5011 x 20. We generate 5 independent samples (for a given low-shot value k) by generating 5 independent target files. For each class, we randomly pick k positive samples and 19 * k negative samples. The rest of the samples are ignored. We perform low-shot image classification on various layers of the model (AlexNet, ResNet-50). The targets file targets_data_file is usually obtained by extracting features for a given layer. The command below generates 5 samples for various k values:
python extra_scripts/create_voc_low_shot_samples.py \
--targets_data_file /path/to/voc/numpy_targets.npy \
--output_path /tmp/vissl/datasets/voc07/low_shot/labels/ \
--k_values "1,2,4,8,16,32,64,96" \
--num_samples 5
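The sampling rule described above (k positives and 19 * k negatives per class, everything else ignored) can be sketched in plain Python. This is an illustration, not the code of create_voc_low_shot_samples.py: the function name, the label encoding (1 = positive, 0 = negative, -1 = ignored) and the toy targets are all assumptions.

```python
import random

IGNORE = -1  # assumed marker for samples that are excluded


def sample_low_shot_targets(targets, k, seed, negatives_per_positive=19):
    """Keep k positives and negatives_per_positive * k negatives per class.

    `targets` is a list of rows (one per image) with one value per class,
    assumed to be 1 for a positive and 0 for a negative. Every entry that is
    not sampled is set to IGNORE.
    """
    rng = random.Random(seed)
    num_images, num_classes = len(targets), len(targets[0])
    sampled = [[IGNORE] * num_classes for _ in range(num_images)]
    for cls in range(num_classes):
        positives = [i for i in range(num_images) if targets[i][cls] == 1]
        negatives = [i for i in range(num_images) if targets[i][cls] == 0]
        for i in rng.sample(positives, min(k, len(positives))):
            sampled[i][cls] = 1
        num_neg = min(negatives_per_positive * k, len(negatives))
        for i in rng.sample(negatives, num_neg):
            sampled[i][cls] = 0
    return sampled


# Toy targets: 60 images, 3 classes, image i is positive for class i % 3.
toy_targets = [[1 if i % 3 == c else 0 for c in range(3)] for i in range(60)]
# Five independent samples for k = 2, one per seed.
samples = [sample_low_shot_targets(toy_targets, k=2, seed=s) for s in range(5)]
```

Using a different seed per sample keeps the 5 target files independent while making each one reproducible.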
We provide scripts to change the problem complexity of the Jigsaw approach (an axis of scaling in the paper).
For the Jigsaw problem, we vary the number of permutations used to solve the jigsaw task. In the paper, the number of permutations used is one of 100, 2000, or 10000. We provide these permutation files for download here. To generate the permutations, use the command below:
python extra_scripts/generate_jigsaw_permutations.py \
--output_dir /tmp/vissl/jigsaw_perms/ \
--N 2000
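To make the notion of a permutation set concrete, the sketch below draws N distinct random orderings of the image tiles. It is a simplification and may not match how generate_jigsaw_permutations.py selects its permutations (Jigsaw permutation sets are often chosen for large mutual Hamming distance). The function name and the default of 9 tiles (a 3 x 3 grid) are assumptions.

```python
import math
import random


def random_jigsaw_permutations(num_perms, num_tiles=9, seed=0):
    """Draw `num_perms` distinct random permutations of the tile indices.

    Simplified stand-in: permutations are sampled uniformly at random rather
    than selected by any distance criterion.
    """
    if num_perms > math.factorial(num_tiles):
        raise ValueError("more permutations requested than exist")
    rng = random.Random(seed)
    perms = set()
    while len(perms) < num_perms:
        perm = list(range(num_tiles))
        rng.shuffle(perm)
        perms.add(tuple(perm))  # the set silently drops duplicates
    return sorted(perms)


perms = random_jigsaw_permutations(num_perms=100)
print(len(perms), perms[0])
```

Each permutation index then acts as the class label the jigsaw task has to predict, so a larger N makes the classification problem harder.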
We provide scripts to convert VISSL models to Detectron2 and ClassyVision compatible models.
All the ResNe(X)t models in VISSL can be converted to Detectron2 weights using the following command:
python extra_scripts/convert_vissl_to_detectron2.py \
--input_model_file <input_model>.pth \
--output_model <d2_model>.torch \
--weights_type torch \
--state_dict_key_name classy_state_dict
All the ResNe(X)t models in VISSL can be converted to ClassyVision weights using the following command:
python extra_scripts/convert_vissl_to_classy_vision.py \
--input_model_file <input_model>.pth \
--output_model <d2_model>.torch \
--state_dict_key_name classy_state_dict
All the ResNe(X)t models in VISSL can be converted to Torchvision weights using the following command:
python extra_scripts/convert_vissl_to_torchvision.py \
--model_url_or_file <input_model>.pth \
--output_dir /path/to/output/dir/ \
--output_name <my_converted_model>.torch
We provide conversion of all the caffe2 models in the paper.
All the models have been added to ICCV19_MODEL_ZOO_FB.md.
Jigsaw model:
python extra_scripts/convert_caffe2_to_torchvision_resnet.py \
--c2_model <model>.pkl \
--output_model <pth_model>.torch \
--jigsaw True --bgr2rgb True
Colorization model:
python extra_scripts/convert_caffe2_to_torchvision_resnet.py \
--c2_model <model>.pkl \
--output_model <pth_model>.torch \
--bgr2rgb False
Supervised model:
python extra_scripts/convert_caffe2_to_pytorch_rn50.py \
--c2_model <model>.pkl \
--output_model <pth_model>.torch \
--bgr2rgb True
AlexNet Jigsaw models:
python extra_scripts/convert_caffe2_to_vissl_alexnet.py \
--weights_type caffe2 \
--model_name jigsaw \
--bgr2rgb True \
--input_model_weights <model.pkl> \
--output_model <pth_model>.torch
AlexNet Colorization models:
python extra_scripts/convert_caffe2_to_vissl_alexnet.py \
--weights_type caffe2 \
--model_name colorization \
--input_model_weights <model.pkl> \
--output_model <pth_model>.torch
AlexNet Supervised models:
python extra_scripts/convert_caffe2_to_vissl_alexnet.py \
--weights_type caffe2 \
--model_name supervised \
--bgr2rgb True \
--input_model_weights <model.pkl> \
--output_model <pth_model>.torch
We provide scripts to convert ClassyVision models to VISSL compatible models.
python extra_scripts/convert_classy_vision_to_vissl_resnet.py \
--input_model_file <input_model>.pth \
--output_model <d2_model>.torch \
--depth 50
AlexNet RotNet model:
python extra_scripts/convert_caffe2_to_vissl_alexnet.py \
--weights_type torch \
--model_name rotnet \
--input_model_weights <model> \
--output_model <pth_model>.torch
AlexNet DeepCluster model:
python extra_scripts/convert_alexnet_models.py \
--weights_type torch \
--model_name deepcluster \
--input_model_weights <model> \
--output_model <pth_model>.torch