We introduce EasyPortrait, a large-scale image dataset for portrait segmentation and face parsing. The proposed dataset can be used in several tasks, such as background removal in conference applications, teeth whitening, face skin enhancement, red eye removal, or eye colorization.
The EasyPortrait dataset is about 26GB in size and contains 20,000 RGB images (~17.5K FullHD images) with high-quality annotated masks. The dataset is divided into training, validation, and test sets by subject user_id. The training set includes 14,000 images, the validation set includes 2,000 images, and the test set includes 4,000 images.
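Splitting by subject means that all images of one user_id end up in the same subset. The sketch below illustrates one way to do such a split deterministically by hashing the user ID. It is an assumption made for illustration, not the procedure used to build EasyPortrait; the function names and the 70/10/20 ratios (taken from the image counts above) are hypothetical.

```python
import hashlib
from collections import defaultdict


def assign_split(user_id: str, train: float = 0.7, val: float = 0.1) -> str:
    """Map a user ID to a subset name via a stable hash bucket (illustrative only)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


def split_by_user(records):
    """Group image names by subset; `records` is an iterable of (image_name, user_id)."""
    splits = defaultdict(list)
    for image_name, user_id in records:
        splits[assign_split(user_id)].append(image_name)
    return dict(splits)
```

Because the subset depends only on the user ID, two images of the same person can never land in different subsets. The resulting ratios are only approximate and hold per user, not per image.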
Link | Size |
---|---|
images | 26G |
annotations | 235M |
train set | 18.1G |
validation set | 2.6G |
test set | 5.2G |
.
├── images.zip
│ ├── train/ # Train set: 14k
│ ├── val/ # Validation set: 2k
│ ├── test/ # Test set: 4k
├── annotations.zip
│ ├── meta.zip # Meta-information (width, height, brightness, imhash, user_id)
│ ├── train/
│ ├── val/
│ ├── test/
...
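A minimal sketch of pairing each image with its mask after both archives are extracted. It assumes the archives are unpacked into `images/` and `annotations/` directories under a common root, and that an image and its mask share the same file name without extension; both points are assumptions about the layout, and `pair_images_with_masks` is a hypothetical helper.

```python
from pathlib import Path


def pair_images_with_masks(root, split):
    """Return (image_path, mask_path) pairs for one subset: 'train', 'val' or 'test'."""
    root = Path(root)
    image_dir = root / "images" / split
    mask_dir = root / "annotations" / split

    # Index masks by file name without extension (assumed to match the image name).
    masks = {path.stem: path for path in mask_dir.glob("*.png")}

    pairs = []
    for image_path in sorted(image_dir.iterdir()):
        mask_path = masks.get(image_path.stem)
        # Skip sub-directories and images that have no matching mask.
        if image_path.is_file() and mask_path is not None:
            pairs.append((image_path, mask_path))
    return pairs
```

For example, `pair_images_with_masks("/data/easyportrait", "train")` would return the training pairs under that (hypothetical) root directory.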
We provide some pre-trained models as the baseline for portrait segmentation and face parsing. We use mean Intersection over Union (mIoU) as the main metric.
Model Name | Parameters (M) | Input shape | mIoU |
---|---|---|---|
LR-ASPP + MobileNet-V3 | 1.14 | 1024 × 1024 | 77.55 |
FCN + MobileNet-V2 | 9.71 | 384 × 384 | 74.30 |
FCN + MobileNet-V2 | 9.71 | 512 × 512 | 77.01 |
FCN + MobileNet-V2 | 9.71 | 1024 × 1024 | 81.23 |
FPN + ResNet-50 | 28.5 | 512 × 512 | 83.13 |
FPN + ResNet-50 | 28.5 | 1024 × 1024 | 85.97 |
BiSeNet-V2 | 14.79 | 512 × 512 | 77.93 |
BiSeNet-V2 | 14.79 | 1024 × 1024 | 83.53 |
SegFormer-B0 | 3.72 | 384 × 384 | 79.82 |
SegFormer-B0 | 3.72 | 1024 × 1024 | 84.27 |
SegFormer-B2 | 24.73 | 384 × 384 | 81.59 |
SegFormer-B2 | 24.73 | 512 × 512 | 83.03 |
SegFormer-B2 | 24.73 | 1024 × 1024 | 85.72 |
SegFormer-B5 | 81.97 | 384 × 384 | 81.66 |
SegFormer-B5 | 81.97 | 1024 × 1024 | 85.80 |
SegNeXt + MSCAN-T | 4.23 | 384 × 384 | 75.01 |
SegNeXt + MSCAN-T | 4.23 | 512 × 512 | 78.59 |
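For reference, the mIoU metric reported above averages the per-class ratio of intersection to union between predicted and ground-truth masks. The sketch below computes it for a single prediction in plain Python. It is an illustration of the metric, not the evaluation code used for the table: evaluation toolkits typically accumulate intersections and unions over the whole test set before averaging, and the function name is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two flat, equal-length sequences of class indices."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # ignore classes absent from both prediction and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else float("nan")
```

The function returns a value in [0, 1]; multiply by 100 to compare with the percentages in the table.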
Annotations are provided as 2D arrays, stored as images in *.png format, with the following classes:
Index | Class |
---|---|
0 | BACKGROUND |
1 | PERSON |
2 | SKIN |
3 | LEFT_BROW |
4 | RIGHT_BROW |
5 | LEFT_EYE |
6 | RIGHT_EYE |
7 | LIPS |
8 | TEETH |
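The class indices above can be captured in code as an enumeration. The sketch below also collapses a face-parsing mask into a two-class portrait mask, under the assumption that every non-background class belongs to the person. `EasyPortraitClass` and `to_portrait_mask` are hypothetical names, not part of the repository.

```python
from enum import IntEnum


class EasyPortraitClass(IntEnum):
    """Class indices as listed in the annotation table."""
    BACKGROUND = 0
    PERSON = 1
    SKIN = 2
    LEFT_BROW = 3
    RIGHT_BROW = 4
    LEFT_EYE = 5
    RIGHT_EYE = 6
    LIPS = 7
    TEETH = 8


def to_portrait_mask(mask):
    """Map a 2D mask (rows of class indices) to background (0) vs. person (1).

    Assumption: all classes other than BACKGROUND are parts of the person.
    """
    return [
        [0 if value == EasyPortraitClass.BACKGROUND else 1 for value in row]
        for row in mask
    ]
```

With a mask loaded as nested lists, `to_portrait_mask([[0, 1, 2], [8, 0, 7]])` yields `[[0, 1, 1], [1, 0, 1]]`.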
We also provide additional meta-information for the dataset in the annotations/meta.zip file:
| | attachment_id | user_id | data_hash | width | height | brightness | train | test | valid |
|---|---|---|---|---|---|---|---|---|---|
| 0 | de81cc1c-... | 1b... | e8f... | 1440 | 1920 | 136 | True | False | False |
| 1 | 3c0cec5a-... | 64... | df5... | 1440 | 1920 | 148 | False | False | True |
| 2 | d17ca986-... | cf... | a69... | 1920 | 1080 | 140 | False | True | False |
where:
- attachment_id: image file name without extension
- user_id: unique anonymized user ID
- data_hash: image hash computed with perceptual hashing
- width: image width
- height: image height
- brightness: image brightness
- train, test, valid: binary columns for the train / test / val subsets respectively
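The meta-information can be used to check that no user appears in more than one subset. The sketch below assumes the archive contains a CSV file with the column names shown above and booleans written as `True` / `False`; the file format is not stated here, so treat both as assumptions. The helper names are hypothetical.

```python
import csv
from collections import defaultdict


def users_per_split(csv_file):
    """Collect the set of user IDs in each subset from an open CSV text stream."""
    users = defaultdict(set)
    for row in csv.DictReader(csv_file):
        for split in ("train", "test", "valid"):
            if row[split] == "True":  # assumed text form of the binary columns
                users[split].add(row["user_id"])
    return users


def overlapping_users(users):
    """Return {(split_a, split_b): shared_user_ids} for every overlapping pair."""
    names = list(users)
    overlaps = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = users[first] & users[second]
            if common:
                overlaps[(first, second)] = common
    return overlaps
```

An empty result from `overlapping_users` is what a split by user_id should produce.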
The code is based on MMSegmentation version 0.30.0.
Models were trained and evaluated on 8 NVIDIA V100 GPUs with CUDA 11.2.
For installation, follow the instructions here and use the requirements.txt file in our repository.
To train in single GPU mode:
python ./pipelines/tools/train.py ./pipelines/local_configs/easy_portrait_experiments/<model_dir>/<config_file>.py --gpu-id <GPU_ID>
For distributed training mode:
./pipelines/tools/dist_train.sh ./pipelines/local_configs/easy_portrait_experiments/<model_dir>/<config_file>.py <NUM_GPUS>
To evaluate in single GPU mode:
python ./pipelines/tools/test.py <PATH_TO_MODEL_CONFIG> <PATH_TO_CHECKPOINT> --gpu-id <GPU_ID> --eval mIoU
For distributed evaluation mode:
./pipelines/tools/dist_test.sh <PATH_TO_MODEL_CONFIG> <PATH_TO_CHECKPOINT> <NUM_GPUS> --eval mIoU
To run the demo on a single image:
python ./pipelines/demo/image_demo.py <PATH_TO_IMG> <PATH_TO_MODEL_CONFIG> <PATH_TO_CHECKPOINT> --palette=easy_portrait --out-file=<PATH_TO_OUT_FILE>
You can cite the paper using the following BibTeX entry:
@article{EasyPortrait,
title={EasyPortrait - Face Parsing and Portrait Segmentation Dataset},
author={Kapitanov, Alexander and Kvanchiani, Karina and Kirillova, Sofia},
journal={arXiv preprint <link>},
year={2023}
}
This work is licensed under a variant of the Creative Commons Attribution-ShareAlike 4.0 International License.
Please see the specific license.