
Preparing Composition-1k Dataset

Introduction

The Composition-1k dataset was introduced in Deep Image Matting (Xu et al., CVPR 2017). If you use it, please cite:

@inproceedings{xu2017deep,
  title={Deep image matting},
  author={Xu, Ning and Price, Brian and Cohen, Scott and Huang, Thomas},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={2970--2979},
  year={2017}
}

The Adobe Composition-1k dataset consists of foreground images and their corresponding alpha images. To obtain the full dataset, you need to composite the foregrounds with background images selected from the COCO dataset and the Pascal VOC dataset.
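The compositing step is standard alpha blending: merged = alpha * fg + (1 - alpha) * bg. A minimal NumPy sketch of this formula (the function name and array shapes are illustrative, not the preprocessing script's actual API):

```python
import numpy as np

def composite(fg, bg, alpha):
    """Alpha-blend a foreground over a background.

    fg, bg: uint8 arrays of shape (H, W, 3); alpha: uint8 array of shape
    (H, W), where 255 means fully foreground and 0 fully background.
    """
    a = alpha.astype(np.float64)[..., None] / 255.0          # (H, W, 1) in [0, 1]
    merged = a * fg.astype(np.float64) + (1.0 - a) * bg.astype(np.float64)
    return np.clip(merged + 0.5, 0, 255).astype(np.uint8)    # round and cast back
```

The actual script additionally iterates over the fg/alpha pairs and pairs each foreground with multiple backgrounds to produce the 1k composites.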

Obtain and Extract

Please follow the paper authors' instructions to obtain the Composition-1k (comp1k) dataset.

Composite the full dataset

The Adobe Composition-1k dataset contains only the alpha and fg images (plus trimaps in the test set). You need to merge the fg images with COCO backgrounds (for training) or VOC backgrounds (for testing) before training or evaluation. Use the following script to perform image composition and generate annotation files for training or testing:

# The script is run under the root folder of MMagic
python tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py data/adobe_composition-1k data/coco data/VOCdevkit --composite

The generated data is stored under adobe_composition-1k/Training_set and adobe_composition-1k/Test_set respectively. If you only want to composite the test data (compositing the training data is time-consuming), you can skip compositing the training set by removing the --composite option:

# skip compositing training set
python tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py data/adobe_composition-1k data/coco data/VOCdevkit

If you only want to preprocess the test data (e.g., for FBA), you can skip the training set entirely by adding the --skip-train option:

# skip preprocessing training set
python tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py data/adobe_composition-1k data/coco data/VOCdevkit --skip-train

Currently, GCA and FBA support online composition of training data. You can also modify the data pipelines of other models to perform online composition instead of loading pre-composited images (which we call merged in our data pipeline).
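Online composition can be sketched as a pipeline transform that picks a random background and merges at load time. The class name and results-dict keys below are illustrative assumptions, not MMagic's actual transform API:

```python
import numpy as np

class RandomCompositeSketch:
    """Illustrative transform: merge fg with a randomly chosen bg per sample,
    so no pre-composited 'merged' images need to be stored on disk."""

    def __init__(self, backgrounds, seed=None):
        self.backgrounds = backgrounds            # list of (H, W, 3) uint8 arrays
        self.rng = np.random.default_rng(seed)

    def __call__(self, results):
        fg = results['fg'].astype(np.float64)
        a = results['alpha'].astype(np.float64)[..., None] / 255.0
        bg = self.backgrounds[self.rng.integers(len(self.backgrounds))]
        merged = a * fg + (1.0 - a) * bg.astype(np.float64)
        results['merged'] = merged.astype(np.uint8)
        return results
```

A real transform would also crop or resize the background to match the foreground shape; this sketch assumes matching shapes for brevity.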

Check Directory Structure for DIM

The result folder structure should look like:

mmagic
├── mmagic
├── tools
├── configs
├── data
│   ├── adobe_composition-1k
│   │   ├── Test_set
│   │   │   ├── Adobe-licensed images
│   │   │   │   ├── alpha
│   │   │   │   ├── fg
│   │   │   │   ├── trimaps
│   │   │   ├── merged  (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   │   ├── bg      (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   ├── Training_set
│   │   │   ├── Adobe-licensed images
│   │   │   │   ├── alpha
│   │   │   │   ├── fg
│   │   │   ├── Other
│   │   │   │   ├── alpha
│   │   │   │   ├── fg
│   │   │   ├── merged  (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   │   ├── bg      (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   ├── test_list.json     (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   ├── training_list.json (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   ├── coco
│   │   ├── train2014   (or train2017)
│   ├── VOCdevkit
│   │   ├── VOC2012

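As a quick sanity check, a short script like the following (a hypothetical helper, not shipped with MMagic) can verify that the expected directories exist before you start training:

```python
import os

# Directories the DIM layout above expects, relative to the repo root.
REQUIRED_DIRS = [
    'data/adobe_composition-1k/Test_set/Adobe-licensed images/alpha',
    'data/adobe_composition-1k/Test_set/Adobe-licensed images/fg',
    'data/adobe_composition-1k/Test_set/Adobe-licensed images/trimaps',
    'data/adobe_composition-1k/Test_set/merged',
    'data/adobe_composition-1k/Test_set/bg',
    'data/adobe_composition-1k/Training_set/Adobe-licensed images/alpha',
    'data/adobe_composition-1k/Training_set/Adobe-licensed images/fg',
    'data/adobe_composition-1k/Training_set/Other/alpha',
    'data/adobe_composition-1k/Training_set/Other/fg',
]

def missing_dirs(root='.'):
    """Return the required directories that do not exist under root."""
    return [d for d in REQUIRED_DIRS
            if not os.path.isdir(os.path.join(root, d))]
```

If missing_dirs() returns a non-empty list, re-run the preprocessing script or check the paths you passed to it.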
Prepare the dataset for FBA

FBA adopts the dynamic dataset augmentation proposed in Learning-based Sampling for Natural Image Matting. In addition, to reduce artifacts during augmentation, it uses an extended version of each foreground image in place of the original. We provide a script to estimate these extended foregrounds.

Prepare the test set as follows:

# skip preprocessing training set, as it composites online during training
python tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py data/adobe_composition-1k data/coco data/VOCdevkit --skip-train

Extend the foreground of training set as follows:

python tools/dataset_converters/matting/comp1k/extend_fg.py data/adobe_composition-1k
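Foreground extension fills pixels where alpha is near zero with plausible foreground colors, so that interpolation during augmentation does not bleed background colors into the matte. The provided script uses a proper estimation method; the snippet below is only a deliberately naive illustration of the idea (fill unknown pixels with the mean known-foreground color), not the algorithm extend_fg.py implements:

```python
import numpy as np

def naive_extend_fg(fg, alpha, thresh=0):
    """Naive stand-in for foreground extension: keep pixels with known
    foreground (alpha > thresh) and fill the rest with the mean known-fg
    color. Real estimators propagate colors locally instead."""
    known = alpha > thresh                        # (H, W) bool mask
    extended = fg.astype(np.float64).copy()
    mean_color = fg[known].reshape(-1, 3).mean(axis=0)
    extended[~known] = mean_color                 # fill unknown region
    return extended.astype(np.uint8)
```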

Check Directory Structure for FBA

The final folder structure should look like:

mmagic
├── mmagic
├── tools
├── configs
├── data
│   ├── adobe_composition-1k
│   │   ├── Test_set
│   │   │   ├── Adobe-licensed images
│   │   │   │   ├── alpha
│   │   │   │   ├── fg
│   │   │   │   ├── trimaps
│   │   │   ├── merged  (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   │   ├── bg      (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   ├── Training_set
│   │   │   ├── Adobe-licensed images
│   │   │   │   ├── alpha
│   │   │   │   ├── fg
│   │   │   │   ├── fg_extended (generated by tools/dataset_converters/matting/comp1k/extend_fg.py)
│   │   │   ├── Other
│   │   │   │   ├── alpha
│   │   │   │   ├── fg
│   │   │   │   ├── fg_extended (generated by tools/dataset_converters/matting/comp1k/extend_fg.py)
│   │   ├── test_list.json         (generated by tools/dataset_converters/matting/comp1k/preprocess_comp1k_dataset.py)
│   │   ├── training_list_fba.json (generated by tools/dataset_converters/matting/comp1k/extend_fg.py)
│   ├── coco
│   │   ├── train2014   (or train2017)
│   ├── VOCdevkit
│   │   ├── VOC2012