
Segmentation

The following graph plots inference throughput against validation mIoU for the COCO pre-trained models. Throughput is measured on a single V100 GPU with a batch size of 16.

[Figure: inference throughput vs. validation mIoU for semantic segmentation models]

Hint

The model names contain the training information. For instance, fcn_resnet50_voc:

  • fcn indicates that the algorithm is “Fully Convolutional Network for Semantic Segmentation” [2].

  • resnet50 is the name of the backbone network.

  • voc is the training dataset.
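
The naming convention above can be illustrated with a short Python sketch. The helper `parse_model_name` and the `ModelName` type are hypothetical names introduced here for illustration; they are not part of any library. The sketch assumes the simple three-part form `<algorithm>_<backbone>_<dataset>` and deliberately rejects longer names instead of guessing how to split them.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    algorithm: str
    backbone: str
    dataset: str


def parse_model_name(name: str) -> ModelName:
    """Split a three-part model name such as 'fcn_resnet50_voc'.

    Assumption: the name has exactly the form
    <algorithm>_<backbone>_<dataset>. Names with more parts are
    rejected rather than guessed at.
    """
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(
            f"expected <algorithm>_<backbone>_<dataset>, got {name!r}"
        )
    return ModelName(*parts)


print(parse_model_name("fcn_resnet50_voc"))
```

Names with extra components would need additional rules that this page does not spell out, so the sketch leaves them unsupported.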

Semantic Segmentation

Table of pre-trained models for semantic segmentation and their performance.
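
Performance is reported as pixel accuracy (pixAcc) and mean intersection over union (mIoU). As a rough illustration of what these two numbers measure, the following Python sketch computes both from a confusion matrix using the standard definitions. It is not the evaluation code behind the tables; details such as ignored labels or how results are accumulated over a dataset are not described on this page and are left out. The function names and the toy labels are made up for the example.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground truth."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class intersection / union, skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        intersection = matrix[c][c]
        truth_count = sum(matrix[c])
        pred_count = sum(matrix[r][c] for r in range(n))
        union = truth_count + pred_count - intersection
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious)


# Toy example (illustrative data): 6 pixels, 3 classes, flattened label maps.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(truth, pred, 3)
print(pixel_accuracy(m), mean_iou(m))
```

The tables list both metrics as percentages, so a value returned by these functions would be multiplied by 100 before comparison.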

Hint

The test script test.py (available for download) can be used to evaluate the models (VOC results are evaluated using the official server). For example, to evaluate fcn_resnet50_ade:

python test.py --dataset ade20k --model-zoo fcn_resnet50_ade --eval

The training commands work with the script train.py (available for download).
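
To evaluate several models in one go, the evaluation command shown in this hint can be generated programmatically. The following Python sketch only builds and prints the command lines. The helper `eval_command` is a hypothetical name, the model names are the ADE20K entries listed on this page, and the `--dataset ade20k` value is copied from the example command. The dataset values for the other tables are not shown here, so the sketch does not guess them.

```python
import shlex
from typing import List

# ADE20K model names as listed in the ADE20K table on this page.
ADE20K_MODELS = [
    "fcn_resnet50_ade",
    "fcn_resnet101_ade",
    "psp_resnet50_ade",
    "psp_resnet101_ade",
    "deeplab_resnet50_ade",
    "deeplab_resnet101_ade",
]


def eval_command(model: str, script: str = "test.py") -> List[str]:
    """Build the argument list for evaluating one ADE20K model."""
    return [
        "python", script,
        "--dataset", "ade20k",
        "--model-zoo", model,
        "--eval",
    ]


for model in ADE20K_MODELS:
    # Print only; pass the list to subprocess.run(...) to actually run it.
    print(shlex.join(eval_command(model)))
```

Keeping the command as a list of arguments avoids shell quoting problems if the list is later handed to `subprocess.run`.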

ADE20K Dataset

| Name | Method | pixAcc (%) | mIoU (%) | Command | Log |
|---|---|---|---|---|---|
| fcn_resnet50_ade | FCN [2] | 79.0 | 39.5 | shell script | log |
| fcn_resnet101_ade | FCN [2] | 80.6 | 41.6 | shell script | log |
| psp_resnet50_ade | PSP [3] | 80.1 | 41.5 | shell script | log |
| psp_resnet101_ade | PSP [3] | 80.8 | 43.3 | shell script | log |
| deeplab_resnet50_ade | DeepLabV3 [4] | 80.5 | 42.5 | shell script | log |
| deeplab_resnet101_ade | DeepLabV3 [4] | 81.1 | 44.1 | shell script | log |

MS-COCO Dataset Pretrain

| Name | Method | pixAcc (%) | mIoU (%) | Command | Log |
|---|---|---|---|---|---|
| fcn_resnet101_coco | FCN [2] | 92.2 | 66.2 | shell script | log |
| psp_resnet101_coco | PSP [3] | 92.4 | 70.4 | shell script | log |
| deeplab_resnet101_coco | DeepLabV3 [4] | 92.5 | 70.4 | shell script | log |

Pascal VOC Dataset

| Name | Method | pixAcc (%) | mIoU (%) | Command | Log |
|---|---|---|---|---|---|
| fcn_resnet101_voc | FCN [2] | N/A | 83.6 | shell script | log |
| psp_resnet101_voc | PSP [3] | N/A | 85.1 | shell script | log |
| deeplab_resnet101_voc | DeepLabV3 [4] | N/A | 86.2 | shell script | log |
| deeplab_resnet152_voc | DeepLabV3 [4] | N/A | 86.7 | shell script | log |

Cityscapes Dataset

| Name | Method | pixAcc (%) | mIoU (%) | Command | Log |
|---|---|---|---|---|---|
| psp_resnet101_citys | PSP [3] | N/A | 77.1 | shell script | log |
| deeplab_v3b_plus_wideresnet_citys | VPLR [5] | N/A | 83.5 | shell script | log |

Instance Segmentation

Table of pre-trained models for instance segmentation and their performance.

Hint

The training commands work with the following scripts:

For the COCO dataset, the training image set is train2017 and the validation image set is val2017.

Average precision (AP) is reported at three IoU settings: 0.5:0.95 (averaged over 10 thresholds), 0.5, and 0.75. The three values are given together in the format (AP 0.5:0.95)/(AP 0.5)/(AP 0.75).

For the instance segmentation task, AP is evaluated and reported for both box overlap and segmentation overlap.
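
The slash-separated AP format can be unpacked with a few lines of Python. In this sketch, `APTriple` and `parse_ap` are hypothetical helper names introduced for illustration. The 10 thresholds behind the 0.5:0.95 average are assumed to follow the usual COCO convention of 0.05 steps, and the sample strings are the Box AP and Segm AP entries of mask_rcnn_resnet18_v1b_coco from the MS COCO table on this page.

```python
from typing import NamedTuple


class APTriple(NamedTuple):
    ap: float    # AP averaged over IoU thresholds 0.5:0.95
    ap50: float  # AP at IoU threshold 0.5
    ap75: float  # AP at IoU threshold 0.75


def parse_ap(cell: str) -> APTriple:
    """Parse a table cell of the form 'AP/AP50/AP75'."""
    values = [float(v) for v in cell.split("/")]
    if len(values) != 3:
        raise ValueError(f"expected three '/'-separated values, got {cell!r}")
    return APTriple(*values)


# The 10 IoU thresholds behind the 0.5:0.95 average: 0.50, 0.55, ..., 0.95
# (assumption: the usual COCO step of 0.05).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

box = parse_ap("31.2/51.1/33.1")   # Box AP of mask_rcnn_resnet18_v1b_coco
segm = parse_ap("28.4/48.1/29.8")  # Segm AP of the same model
print(IOU_THRESHOLDS)
print(box, segm)
```

Parsing the cells into named fields makes it straightforward to compare models on a single setting, for example sorting by `ap` while ignoring `ap50` and `ap75`.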

MS COCO

| Model | Box AP | Segm AP | Command | Training Log |
|---|---|---|---|---|
| mask_rcnn_resnet18_v1b_coco | 31.2/51.1/33.1 | 28.4/48.1/29.8 | shell script | log |
| mask_rcnn_fpn_resnet18_v1b_coco | 34.9/56.4/37.4 | 30.4/52.2/31.4 | shell script | log |
| mask_rcnn_resnet50_v1b_coco | 38.3/58.7/41.4 | 33.1/54.8/35.0 | shell script | log |
| mask_rcnn_fpn_resnet50_v1b_coco | 39.2/61.2/42.2 | 35.4/57.5/37.3 | shell script | log |
| mask_rcnn_resnet101_v1d_coco | 41.3/61.7/44.4 | 35.2/57.8/36.9 | shell script | log |
| mask_rcnn_fpn_resnet101_v1d_coco | 42.3/63.9/46.2 | 37.7/60.5/40.0 | shell script | log |

[1] He, Kaiming, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. “Mask R-CNN.” In IEEE International Conference on Computer Vision (ICCV), 2017.

[2] Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

[3] Zhao, Hengshuang, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. “Pyramid scene parsing network.” CVPR, 2017.

[4] Chen, Liang-Chieh, et al. “Rethinking atrous convolution for semantic image segmentation.” arXiv preprint arXiv:1706.05587 (2017).

[5] Zhu, Yi, et al. “Improving Semantic Segmentation via Video Propagation and Label Relaxation.” CVPR, 2019.