Table Of Contents

gluoncv.model_zoo

GluonCV Model Zoo

gluoncv.model_zoo.get_model

Returns a pre-defined GluonCV model by name.

Hint

This is the recommended method for getting a pre-defined model.

It also supports loading models directly from the Gluon Model Zoo.

get_model

Returns a pre-defined model by name

Image Classification

CIFAR

get_cifar_resnet

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper; ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

cifar_resnet20_v1

ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

cifar_resnet56_v1

ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

cifar_resnet110_v1

ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

cifar_resnet20_v2

ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

cifar_resnet56_v2

ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

cifar_resnet110_v2

ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

get_cifar_wide_resnet

WideResNet model for CIFAR10 from “Wide Residual Networks” paper.

cifar_wideresnet16_10

WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.

cifar_wideresnet28_10

WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.

cifar_wideresnet40_8

WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.

ImageNet

We apply a dilation strategy to pre-trained ResNet models (with stride of 8). Please see gluoncv.model_zoo.SegBaseModel for how to use it.

ResNetV1b

Pre-trained ResNetV1b model, which produces stride-8 feature maps at conv5.

resnet18_v1b

Constructs a ResNetV1b-18 model.

resnet34_v1b

Constructs a ResNetV1b-34 model.

resnet50_v1b

Constructs a ResNetV1b-50 model.

resnet101_v1b

Constructs a ResNetV1b-101 model.

resnet152_v1b

Constructs a ResNetV1b-152 model.

Object Detection

SSD

SSD

Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.

get_ssd

Get SSD models.

ssd_300_vgg16_atrous_voc

SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.

ssd_300_vgg16_atrous_coco

SSD architecture with VGG16 atrous 300x300 base network for COCO.

ssd_300_vgg16_atrous_custom

SSD architecture with VGG16 atrous 300x300 base network for custom dataset.

ssd_512_vgg16_atrous_voc

SSD architecture with VGG16 atrous 512x512 base network.

ssd_512_vgg16_atrous_coco

SSD architecture with VGG16 atrous layers for COCO.

ssd_512_vgg16_atrous_custom

SSD architecture with VGG16 atrous 512x512 base network for custom dataset.

ssd_512_resnet50_v1_voc

SSD architecture with ResNet v1 50 layers.

ssd_512_resnet50_v1_coco

SSD architecture with ResNet v1 50 layers for COCO.

ssd_512_resnet50_v1_custom

SSD architecture with ResNet50 v1 512 base network for custom dataset.

ssd_512_resnet101_v2_voc

SSD architecture with ResNet v2 101 layers.

ssd_512_resnet152_v2_voc

SSD architecture with ResNet v2 152 layers.

VGGAtrousExtractor

VGG Atrous multi layer feature extractor which produces multiple output feature maps.

get_vgg_atrous_extractor

Get VGG atrous feature extractor networks.

vgg16_atrous_300

Get VGG atrous 16 layer 300 in_size feature extractor networks.

vgg16_atrous_512

Get VGG atrous 16 layer 512 in_size feature extractor networks.

Faster RCNN

FasterRCNN

Faster RCNN network.

get_faster_rcnn

Utility function to return faster rcnn networks.

faster_rcnn_resnet50_v1b_voc

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J.

faster_rcnn_resnet50_v1b_coco

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J.

faster_rcnn_resnet50_v1b_custom

Faster RCNN model with resnet50_v1b base network on custom dataset.

YOLOv3

YOLOV3

YOLO V3 detection network.

get_yolov3

Get YOLOV3 models.

yolo3_darknet53_voc

YOLO3 multi-scale with darknet53 base network on VOC dataset.

yolo3_darknet53_coco

YOLO3 multi-scale with darknet53 base network on COCO dataset.

yolo3_darknet53_custom

YOLO3 multi-scale with darknet53 base network on custom dataset.

Instance Segmentation

Mask RCNN

MaskRCNN

Mask RCNN network.

get_mask_rcnn

Utility function to return mask rcnn networks.

mask_rcnn_resnet50_v1b_coco

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R.

Semantic Segmentation

FCN

FCN

Fully Convolutional Networks for Semantic Segmentation

get_fcn

FCN model from the paper “Fully Convolutional Network for semantic segmentation”

get_fcn_resnet50_voc

FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”

get_fcn_resnet101_voc

FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”

get_fcn_resnet101_coco

FCN model with base network ResNet-101 pre-trained on COCO dataset from the paper “Fully Convolutional Network for semantic segmentation”

get_fcn_resnet50_ade

FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

get_fcn_resnet101_ade

FCN model with base network ResNet-101 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

PSPNet

PSPNet

Pyramid Scene Parsing Network

get_psp

Pyramid Scene Parsing Network :param dataset: The dataset that the model was pretrained on.

get_psp_resnet101_coco

Pyramid Scene Parsing Network :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

get_psp_resnet101_voc

Pyramid Scene Parsing Network :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

get_psp_resnet50_ade

Pyramid Scene Parsing Network :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

get_psp_resnet101_ade

Pyramid Scene Parsing Network :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

DeepLabV3

DeepLabV3

param nclass

Number of categories for the training dataset.

get_deeplab

DeepLabV3 :param dataset: The dataset that the model was pretrained on.

get_deeplab_resnet101_coco

DeepLabV3 :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

get_deeplab_resnet101_voc

DeepLabV3 :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

get_deeplab_resnet50_ade

DeepLabV3 :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

get_deeplab_resnet101_ade

DeepLabV3 :param pretrained: Boolean value controls whether to load the default pretrained weights for model.

API Reference

Network definitions of GluonCV models

GluonCV Model Zoo

class gluoncv.model_zoo.AlexNet(classes=1000, **kwargs)[source]

AlexNet model from the “One weird trick…” paper.

Parameters

classes (int, default 1000) – Number of classes for the output layer.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.BasicBlockV1(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 18, 34 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.BasicBlockV1b(planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs=None, **kwargs)[source]

ResNetV1b BasicBlockV1b

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.BasicBlockV2(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 18, 34 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.BottleneckV1(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 50, 101, 152 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.BottleneckV1b(planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs=None, last_gamma=False, **kwargs)[source]

ResNetV1b BottleneckV1b

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.BottleneckV2(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 50, 101, 152 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.DarknetV3(layers, channels, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Darknet v3.

Parameters
features

Feature extraction layers.

Type

mxnet.gluon.nn.HybridSequential

output

An N-way fully-connected output layer, where N equals classes (default 1000).

Type

mxnet.gluon.nn.Dense

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.DeepLabV3(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, height=None, width=None, base_size=520, crop_size=480, **kwargs)[source]
Parameters
  • nclass (int) – Number of categories for the training dataset.

  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).

  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).

  • aux (bool) – Auxiliary loss.

Reference:

Chen, Liang-Chieh, et al. “Rethinking atrous convolution for semantic image segmentation.” arXiv preprint arXiv:1706.05587 (2017).

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.DeepLabV3Plus(nclass, backbone='xception', aux=True, ctx=cpu(0), pretrained_base=True, height=None, width=None, base_size=576, crop_size=512, dilated=True, **kwargs)[source]
Parameters
  • nclass (int) – Number of categories for the training dataset.

  • backbone (string) – Pre-trained dilated backbone network type (default:’xception’).

  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).

  • aux (bool) – Auxiliary loss.

Reference:

Chen, Liang-Chieh, et al. “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.”

evaluate(x)[source]

Evaluate the network with inputs and targets.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.DenseNet(num_init_features, growth_rate, block_config, bn_size=4, dropout=0, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Densenet-BC model from the “Densely Connected Convolutional Networks” paper.

Parameters
  • num_init_features (int) – Number of filters to learn in the first convolution layer.

  • growth_rate (int) – Number of filters to add each layer (k in the paper).

  • block_config (list of int) – List of integers for numbers of layers in each pooling block.

  • bn_size (int, default 4) – Multiplicative factor for the number of bottleneck layers (i.e. bn_size * k features in the bottleneck layer).

  • dropout (float, default 0) – Rate of dropout after each dense layer.

  • classes (int, default 1000) – Number of classification classes.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
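The interaction of num_init_features, growth_rate (k), and block_config determines the channel counts; a pure-Python sketch using the DenseNet-121 figures (num_init_features=64, growth_rate=32, block_config=[6, 12, 24, 16]):

```python
def densenet_channels(num_init_features, growth_rate, block_config):
    """Trace the channel count after each dense block in DenseNet-BC."""
    channels = num_init_features
    trace = []
    for i, num_layers in enumerate(block_config):
        channels += num_layers * growth_rate  # each layer adds k feature maps
        trace.append(channels)
        if i != len(block_config) - 1:
            channels //= 2                    # transition layer halves channels
    return trace

print(densenet_channels(64, 32, [6, 12, 24, 16]))  # [256, 512, 1024, 1024]
```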

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.FCN(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]

Fully Convolutional Networks for Semantic Segmentation

Parameters
  • nclass (int) – Number of categories for the training dataset.

  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).

  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm;

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

  • pretrained_base (bool or str) – Whether the FCN backbone (the encoder) is pretrained. If True, weights of a model trained on ImageNet are loaded.

Reference:

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” CVPR, 2015

Examples

>>> model = FCN(nclass=21, backbone='resnet50')
>>> print(model)
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.FasterRCNN(features, top_features, classes, box_features=None, short=600, max_size=1000, min_stage=4, max_stage=4, train_patterns=None, nms_thresh=0.3, nms_topk=400, post_nms=100, roi_mode='align', roi_size=(14, 14), strides=16, clip=None, rpn_channel=1024, base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2), alloc_size=(128, 128), rpn_nms_thresh=0.7, rpn_train_pre_nms=12000, rpn_train_post_nms=2000, rpn_test_pre_nms=6000, rpn_test_post_nms=300, rpn_min_size=16, num_sample=128, pos_iou_thresh=0.5, pos_ratio=0.25, max_num_gt=300, additional_output=False, force_nms=False, **kwargs)[source]

Faster RCNN network.

Parameters
  • features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.

  • top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.

  • classes (iterable of str) – Names of categories, its length is num_class.

  • box_features (gluon.HybridBlock, default is None) – feature head for transforming shared ROI output (top_features) for box prediction. If set to None, global average pooling will be used.

  • short (int, default is 600.) – Input image short side size.

  • max_size (int, default is 1000.) – Maximum size of input image long side.

  • min_stage (int, default is 4) – Minimum stage number for FPN stages.

  • max_stage (int, default is 4) – Maximum stage number for FPN stages.

  • train_patterns (str, default is None.) – Matching pattern for trainable parameters.

  • nms_thresh (float, default is 0.3.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

  • nms_topk (int, default is 400) – Apply NMS to top k detection results; use -1 to disable so that every detection result is used in NMS.

  • post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.

  • roi_mode (str, default is align) – ROI pooling mode. Currently support ‘pool’ and ‘align’.

  • roi_size (tuple of int, length 2, default is (14, 14)) – (height, width) of the ROI region.

  • strides (int/tuple of ints, default is 16) – Feature map stride with respect to original image. This is usually the ratio between original image size and feature map size. For FPN, use a tuple of ints.

  • clip (float, default is None) – Clip bounding box target to this value.

  • rpn_channel (int, default is 1024) – Channel number used in RPN convolutional layers.

  • base_size (int) – The width (and height) of the reference anchor box.

  • scales (iterable of float, default is (8, 16, 32)) –

    The areas of anchor boxes. We use the following form to compute the shapes of anchors:

    \[width_{anchor} = size_{base} \times scale \times \sqrt{1 / ratio}\]

    \[height_{anchor} = size_{base} \times scale \times \sqrt{ratio}\]

  • ratios (iterable of float, default is (0.5, 1, 2)) – The aspect ratios of anchor boxes. We expect it to be a list or tuple.

  • alloc_size (tuple of int) – Allocate size for the anchor boxes as (H, W). Usually we generate enough anchors for large feature map, e.g. 128x128. Later in inference we can have variable input sizes, at which time we can crop corresponding anchors from this large anchor map so we can skip re-generating anchors for each input.

  • rpn_train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training of RPN.

  • rpn_train_post_nms (int, default is 2000) – Return top proposal results after NMS in training of RPN. Will be set to rpn_train_pre_nms if it is larger than rpn_train_pre_nms.

  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.

  • rpn_test_post_nms (int, default is 300) – Return top proposal results after NMS in testing of RPN. Will be set to rpn_test_pre_nms if it is larger than rpn_test_pre_nms.

  • rpn_nms_thresh (float, default is 0.7) – IOU threshold for NMS. It is used to remove overlapping proposals.

  • train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training.

  • train_post_nms (int, default is 2000) – Return top proposal results after NMS in training.

  • test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing.

  • test_post_nms (int, default is 300) – Return top proposal results after NMS in testing.

  • rpn_min_size (int, default is 16) – Proposals whose size is smaller than min_size will be discarded.

  • num_sample (int, default is 128) – Number of samples for RCNN targets.

  • pos_iou_thresh (float, default is 0.5) – Proposal whose IOU larger than pos_iou_thresh is regarded as positive samples.

  • pos_ratio (float, default is 0.25) – pos_ratio defines how many positive samples (pos_ratio * num_sample) is to be sampled.

  • max_num_gt (int, default is 300) – Maximum ground-truth number in whole training dataset. This is only an upper bound, not necessarily very precise. However, using a very big number may impact the training speed.

  • additional_output (boolean, default is False) – additional_output is only used for Mask R-CNN to get internal outputs.

  • force_nms (bool, default is False) – Apply NMS to all categories; this avoids overlapping detection results from different categories.

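The anchor formula above can be checked with a short pure-Python sketch (the helper name anchor_shapes is ours, not part of GluonCV):

```python
import math

def anchor_shapes(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2)):
    """Compute (width, height) per (scale, ratio) pair:
    width  = base * scale * sqrt(1 / ratio)
    height = base * scale * sqrt(ratio)
    so the area base^2 * scale^2 is constant across ratios."""
    shapes = []
    for scale in scales:
        for ratio in ratios:
            w = base_size * scale * math.sqrt(1.0 / ratio)
            h = base_size * scale * math.sqrt(ratio)
            shapes.append((round(w, 1), round(h, 1)))
    return shapes

# scale=8, ratio=1 gives a square 16 * 8 = 128 anchor
print(anchor_shapes()[1])  # (128.0, 128.0)
```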
classes

Names of categories, its length is num_class.

Type

iterable of str

num_class

Number of positive categories.

Type

int

short

Input image short side size.

Type

int

max_size

Maximum size of input image long side.

Type

int

train_patterns

Matching pattern for trainable parameters.

Type

str

nms_thresh

Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

Type

float

nms_topk

Apply NMS to top k detection results; use -1 to disable so that every detection result is used in NMS.

Type

int

force_nms

Apply NMS to all categories; this avoids overlapping detection results from different categories.

Type

bool

post_nms

Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.

Type

int

target_generator

Generate training targets with boxes, samples, matches, gt_label and gt_box.

Type

gluon.Block

hybrid_forward(F, x, gt_box=None)[source]

Forward Faster-RCNN network.

The behavior during training and inference is different.

Parameters
  • x (mxnet.nd.NDArray or mxnet.symbol) – The network input tensor.

  • gt_box (type, only required during training) – The ground-truth bbox tensor with shape (1, N, 4).

Returns

During inference, returns final class id, confidence scores, bounding boxes.

Return type

(ids, scores, bboxes)

reset_class(classes, reuse_weights=None)[source]

Reset class categories and class predictors.

Parameters
  • classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.

  • reuse_weights (dict) – A {new_integer : old_integer} or mapping dict or {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.

Example

>>> net = gluoncv.model_zoo.get_model('faster_rcnn_resnet50_v1b_coco', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the 14th category in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
property target_generator

Returns stored target generator

Returns

The RCNN target generator

Return type

mxnet.gluon.HybridBlock

class gluoncv.model_zoo.Inception3(classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.MaskRCNN(features, top_features, classes, mask_channels=256, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, target_roi_scale=1, num_fcn_convs=0, norm_layer=None, norm_kwargs=None, **kwargs)[source]

Mask RCNN network.

Parameters
  • features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.

  • top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.

  • classes (iterable of str) – Names of categories, its length is num_class.

  • mask_channels (int, default is 256) – Number of channels in mask prediction

  • rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN. Upper bounded by min of rpn_test_pre_nms and rpn_test_post_nms.

  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.

  • rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN. Will be set to rpn_test_pre_nms if it is larger than rpn_test_pre_nms.

  • target_roi_scale (int, default 1) – Ratio of mask output roi / input roi. For model with FPN, this is typically 2.

  • num_fcn_convs (int, default 0) – number of convolution blocks before deconv layer. For FPN network this is typically 4.

hybrid_forward(F, x, gt_box=None)[source]

Forward Mask RCNN network.

The behavior during training and inference is different.

Parameters
  • x (mxnet.nd.NDArray or mxnet.symbol) – The network input tensor.

  • gt_box (type, only required during training) – The ground-truth bbox tensor with shape (1, N, 4).

Returns

During inference, returns final class id, confidence scores, bounding boxes, segmentation masks.

Return type

(ids, scores, bboxes, masks)

reset_class(classes, reuse_weights=None)[source]

Reset class categories and class predictors.

Parameters
  • classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.

  • reuse_weights (dict) – A {new_integer : old_integer} or mapping dict or {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.

Example

>>> net = gluoncv.model_zoo.get_model('mask_rcnn_resnet50_v1b_coco', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the first category in COCO
>>> net.reset_class(classes=['person'], reuse_weights={0:0})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':0})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
class gluoncv.model_zoo.MobileNet(multiplier=1.0, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.MobileNetV2(multiplier=1.0, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” <https://arxiv.org/abs/1801.04381> paper.

Parameters
  • multiplier (float, default 1.0) – The width multiplier for controlling the model size. The actual number of channels is equal to the original channel size multiplied by this multiplier.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.PSPNet(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • nclass (int) – Number of categories for the training dataset.

  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).

  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).

  • aux (bool) – Auxiliary loss.

Reference:

Zhao, Hengshuang, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. “Pyramid scene parsing network.” CVPR, 2017

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.RCNNTargetGenerator(num_class, means=(0.0, 0.0, 0.0, 0.0), stds=(0.1, 0.1, 0.2, 0.2))[source]

RCNN target encoder to generate matching target and regression target values.

Parameters
  • num_class (int) – Total number of positive classes.

  • means (iterable of float, default is (0., 0., 0., 0.)) – Mean values to be subtracted from regression targets.

  • stds (iterable of float, default is (0.1, 0.1, 0.2, 0.2)) – Standard deviations by which regression targets are divided.

forward(roi, samples, matches, gt_label, gt_box)[source]

Components can handle batched images.

Parameters
  • roi ((B, N, 4), input proposals) –

  • samples ((B, N), value +1: positive / -1: negative.) –

  • matches ((B, N), value [0, M), index to gt_label and gt_box.) –

  • gt_label ((B, M), value [0, num_class), excluding background class.) –

  • gt_box ((B, M, 4), input ground truth box corner coordinates.) –

Returns

  • cls_target ((B, N), value [0, num_class + 1), including background.)

  • box_target ((B, N, C, 4), only foreground class has nonzero target.)

  • box_weight ((B, N, C, 4), only foreground class has nonzero weight.)

class gluoncv.model_zoo.RCNNTargetSampler(num_image, num_proposal, num_sample, pos_iou_thresh, pos_ratio, max_num_gt)[source]

A sampler to choose positive/negative samples from RCNN Proposals

Parameters
  • num_image (int) – Number of input images.

  • num_proposal (int) – Number of input proposals.

  • num_sample (int) – Number of samples for RCNN targets.

  • pos_iou_thresh (float) – Proposal whose IOU larger than pos_iou_thresh is regarded as positive samples. Proposal whose IOU smaller than pos_iou_thresh is regarded as negative samples.

  • pos_ratio (float) – pos_ratio defines how many positive samples (pos_ratio * num_sample) is to be sampled.

  • max_num_gt (int) – Maximum ground-truth number in whole training dataset. This is only an upper bound, not necessarily very precise. However, using a very big number may impact the training speed.

hybrid_forward(F, rois, scores, gt_boxes)[source]

Handle B=self._num_image by a for loop.

Parameters
  • rois ((B, self._num_input, 4) encoded in (x1, y1, x2, y2)) –

  • scores ((B, self._num_input, 1), value range [0, 1] with ignore value -1.) –

  • gt_boxes ((B, M, 4) encoded in (x1, y1, x2, y2), invalid box should have area of 0.) –

Returns

  • rois ((B, self._num_sample, 4), randomly drawn from proposals)

  • samples ((B, self._num_sample), value +1: positive / 0: ignore / -1: negative.)

  • matches ((B, self._num_sample), value between [0, M))

class gluoncv.model_zoo.ResNetV1(block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
  • block (HybridBlock) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.

  • layers (list of int) – Numbers of layers in each block

  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.

  • classes (int, default 1000) – Number of classification classes.

  • thumbnail (bool, default False) – Enable thumbnail.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_se (bool, default False) – Whether to use the Squeeze-and-Excitation module.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
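
The constraint that channels be one entry longer than layers (the extra leading entry is the stem width) can be checked with a quick sketch, here using a ResNet-50-style bottleneck configuration:

```python
# Illustrative sketch: channels must be one longer than layers; pairing
# each stage's block count with its output width.
def check_resnet_config(layers, channels):
    assert len(channels) == len(layers) + 1, "channels must be one longer than layers"
    return list(zip(layers, channels[1:]))

stages = check_resnet_config([3, 4, 6, 3], [64, 256, 512, 1024, 2048])
# -> [(3, 256), (4, 512), (6, 1024), (3, 2048)]
```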

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.ResNetV1b(block, layers, classes=1000, dilated=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, last_gamma=False, deep_stem=False, stem_width=32, avg_down=False, final_drop=0.0, use_global_stats=False, name_prefix='', **kwargs)[source]

Pre-trained ResNetV1b Model, which produces the strides of 8 featuremaps at conv5.

Parameters
  • block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.

  • layers (list of int) – Numbers of layers in each block

  • classes (int, default 1000) – Number of classification classes.

  • dilated (bool, default False) – Whether to apply the dilation strategy to the pretrained ResNet, yielding a stride-8 model, typically used in semantic segmentation.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • deep_stem (bool, default False) – Whether to replace the 7x7 conv1 with 3 3x3 convolution layers.

  • avg_down (bool, default False) – Whether to use average pooling in the projection skip connection (downsample path) between stages.

  • final_drop (float, default 0.0) – Dropout ratio before the final classification layer.

  • use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from ImageNet-pretrained classification models.

Reference:

  • He, Kaiming, et al. “Deep residual learning for image recognition.”

Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

  • Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
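
The stride-8 claim follows from simple arithmetic: replacing the stride-2 downsampling in the last two stages with dilated convolutions removes two factors of 2 from the overall output stride. A sketch (the stage strides below are the conventional ResNet layout, stated here as an assumption):

```python
# Illustrative sketch: overall output stride is the product of per-stage strides.
def output_stride(stage_strides):
    s = 1
    for st in stage_strides:
        s *= st
    return s

plain = output_stride([4, 1, 2, 2, 2])    # stem (conv + pool) stride 4, then 3 strided stages -> 32
dilated = output_stride([4, 1, 2, 1, 1])  # last two stages use dilation instead of stride -> 8
```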

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.ResNetV2(block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
  • block (HybridBlock) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.

  • layers (list of int) – Numbers of layers in each block

  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.

  • classes (int, default 1000) – Number of classification classes.

  • thumbnail (bool, default False) – Enable thumbnail.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_se (bool, default False) – Whether to use the Squeeze-and-Excitation module.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.ResidualAttentionModel(scale, m, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper. Input size is 224 x 224.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SE_BasicBlockV1(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 18, 34 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SE_BasicBlockV2(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 18, 34 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SE_BottleneckV1(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 50, 101, 152 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SE_BottleneckV2(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 50, 101, 152 layers.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SE_ResNetV1(block, layers, channels, classes=1000, thumbnail=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SE_ResNetV2(block, layers, channels, classes=1000, thumbnail=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.SSD(network, base_size, features, num_filters, sizes, ratios, steps, classes, use_1x1_transition=True, use_bn=True, reduce_ratio=1.0, min_depth=128, global_pool=False, pretrained=False, stds=(0.1, 0.1, 0.2, 0.2), nms_thresh=0.45, nms_topk=400, post_nms=100, anchor_alloc_size=128, ctx=cpu(0), norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.

Parameters
  • network (string or None) – Name of the base network; if None, the base network is instantiated from features directly instead of by composing.

  • base_size (int) – Base input size; it is specified so SSD can support dynamic input shapes.

  • features (list of str or mxnet.gluon.HybridBlock) – Intermediate features to be extracted or a network with multi-output. If network is None, features is expected to be a multi-output network.

  • num_filters (list of int) – Number of channels for the appended layers, ignored if network is None.

  • sizes (iterable of float) – Sizes of anchor boxes; this should be a list of floats in increasing order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.

  • ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must equal the number of SSD output layers.

  • steps (list of int) – Step size of anchor boxes in each output layer.

  • classes (iterable of str) – Names of all categories.

  • use_1x1_transition (bool) – Whether to use 1x1 convolution as the transition layer between attached layers, which is effective in reducing model capacity.

  • use_bn (bool) – Whether to use BatchNorm layer after each attached convolutional layer.

  • reduce_ratio (float) – Channel reduce ratio (0, 1) of the transition layer.

  • min_depth (int) – Minimum channels for the transition layers.

  • global_pool (bool) – Whether to attach a global average pooling layer as the last output layer.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • stds (tuple of float, default is (0.1, 0.1, 0.2, 0.2)) – Std values to be divided/multiplied to box encoded values.

  • nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.

  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.

  • anchor_alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define anchor_alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, arbitrary input images are supported by cropping the corresponding area of the anchor map. This allows us to export to symbol so the model can run in C++, Scala, etc.

  • ctx (mx.Context) – Network context.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm. This only applies to base networks that accept a norm_layer argument; it is ignored if the base network (e.g. VGG) does not accept it.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
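
The conversion of the sizes list described above (consecutive entries are paired per output layer) can be sketched directly:

```python
# Illustrative sketch: SSD pairs consecutive anchor sizes per output layer,
# e.g. [30, 60, 90] -> (30, 60) and (60, 90).
def pair_anchor_sizes(sizes):
    return list(zip(sizes[:-1], sizes[1:]))

pairs = pair_anchor_sizes([30, 60, 90])  # -> [(30, 60), (60, 90)]
```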

hybrid_forward(F, x)[source]

Hybrid forward

property num_classes

Return number of foreground classes.

Returns

Number of foreground classes

Return type

int

reset_class(classes, reuse_weights=None)[source]

Reset class categories and class predictors.

Parameters
  • classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.

  • reuse_weights (dict) – A {new_integer : old_integer} mapping dict or a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.

Example

>>> net = gluoncv.model_zoo.get_model('ssd_512_resnet50_v1_voc', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is at index 14 in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
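
All three reuse_weights forms shown above reduce to one index-to-index mapping. A pure-Python sketch of that normalization (illustrative only, not GluonCV internals; the class lists below are truncated for illustration):

```python
# Illustrative sketch: normalize name/index reuse_weights forms into
# a single {new_index: old_index} mapping.
def normalize_reuse_weights(reuse_weights, new_classes, old_classes):
    if isinstance(reuse_weights, (list, tuple)):
        reuse_weights = {name: name for name in reuse_weights}
    mapping = {}
    for new, old in reuse_weights.items():
        new_idx = new_classes.index(new) if isinstance(new, str) else new
        old_idx = old_classes.index(old) if isinstance(old, str) else old
        mapping[new_idx] = old_idx
    return mapping

old = ['aeroplane', 'bicycle', 'person']  # truncated class list, for illustration
m1 = normalize_reuse_weights({'person': 'person'}, ['person'], old)
m2 = normalize_reuse_weights(['person'], ['person'], old)
```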
set_nms(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]

Set non-maximum suppression parameters.

Parameters
  • nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.

  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.

Returns

Return type

None
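
To make the interaction of the three parameters concrete, here is a minimal greedy-NMS sketch (illustrative only, not the GluonCV implementation): boxes are score-sorted, truncated to nms_topk, suppressed by IOU against already-kept boxes, and the result is capped at post_nms.

```python
# Illustrative sketch of greedy NMS with nms_thresh / nms_topk / post_nms.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, nms_thresh=0.45, nms_topk=400, post_nms=100):
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])[:nms_topk]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep[:post_nms]

kept = nms([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], [0.9, 0.8, 0.7])
# box 1 overlaps box 0 heavily and is suppressed; box 2 is kept
```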

class gluoncv.model_zoo.SimplePoseResNet(base_name='resnet50_v1b', pretrained_base=False, pretrained_ctx=cpu(0), num_joints=17, num_deconv_layers=3, num_deconv_filters=(256, 256, 256), num_deconv_kernels=(4, 4, 4), final_conv_kernel=1, deconv_with_bias=False, **kwargs)[source]

SimplePose model with a ResNet base network from the “Simple Baselines for Human Pose Estimation and Tracking” paper.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
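
The deconvolution head sizes follow from the standard transposed-convolution output formula; assuming padding 1 with the default 4x4, stride-2 kernels, each layer doubles the spatial size, so three layers upsample the stride-32 backbone feature map to stride-4 heatmaps (the padding value is an assumption for illustration):

```python
# Illustrative sketch: transposed-conv output size, out = (in-1)*stride - 2*pad + kernel.
def deconv_out(size, kernel=4, stride=2, pad=1):
    return (size - 1) * stride - 2 * pad + kernel

size = 256 // 32          # stride-32 backbone feature map for a 256x256 input
for _ in range(3):        # num_deconv_layers=3, each doubling spatial size
    size = deconv_out(size)
# size is now 64, i.e. stride-4 heatmaps
```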

class gluoncv.model_zoo.SqueezeNet(version, classes=1000, **kwargs)[source]

SqueezeNet model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.

Parameters
  • version (str) – Version of squeezenet. Options are ‘1.0’, ‘1.1’.

  • classes (int, default 1000) – Number of classification classes.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.VGG(layers, filters, classes=1000, batch_norm=False, **kwargs)[source]

VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • layers (list of int) – Numbers of layers in each feature block.

  • filters (list of int) – Numbers of filters in each feature block. List length should match the layers.

  • classes (int, default 1000) – Number of classification classes.

  • batch_norm (bool, default False) – Use batch normalization.
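
As a concrete reading of layers and filters, a VGG-16-style configuration (stated here as an assumption for illustration) counts convolutions per feature block, with three fully connected layers on top:

```python
# Illustrative sketch of a VGG-16-style configuration: layers counts the
# convolutions per feature block, filters gives each block's width.
layers = [2, 2, 3, 3, 3]
filters = [64, 128, 256, 512, 512]
assert len(layers) == len(filters)      # list lengths must match
num_conv = sum(layers)                  # 13 convolutional layers
num_total = num_conv + 3                # plus 3 fully connected layers -> 16
```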

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.VGGAtrousExtractor(layers, filters, extras, batch_norm=False, **kwargs)[source]

VGG Atrous multi layer feature extractor which produces multiple output feature maps.

Parameters
  • layers (list of int) – Number of layer for vgg base network.

  • filters (list of int) – Number of convolution filters for each layer.

  • extras (list of list) – Extra layers configurations.

  • batch_norm (bool) – If True, will use BatchNorm layers.

hybrid_forward(F, x, init_scale)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.Xception65(classes=1000, output_stride=32, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None)[source]

Modified Aligned Xception

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.Xception71(classes=1000, output_stride=32, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None)[source]

Modified Aligned Xception

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class gluoncv.model_zoo.YOLOV3(stages, channels, anchors, strides, classes, alloc_size=(128, 128), nms_thresh=0.45, nms_topk=400, post_nms=100, pos_iou_thresh=1.0, ignore_iou_thresh=0.7, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO V3 detection network. Reference: https://arxiv.org/pdf/1804.02767.pdf.

Parameters
  • stages (mxnet.gluon.HybridBlock) – Staged feature extraction blocks. For example, 3 stages and 3 YOLO output layers are used in the original paper.

  • channels (iterable) – Number of conv channels for each appended stage. len(channels) should match len(stages).

  • num_class (int) – Number of foreground objects.

  • anchors (iterable) – The anchor setting. len(anchors) should match len(stages).

  • strides (iterable) – Strides of feature map. len(strides) should match len(stages).

  • alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, arbitrary input images are supported by cropping the corresponding area of the anchor map. This allows us to export to symbol so the model can run in C++, Scala, etc.

  • nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.

  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.

  • pos_iou_thresh (float, default is 1.0) – IOU threshold for true anchors that match real objects. ‘pos_iou_thresh < 1’ is not implemented.

  • ignore_iou_thresh (float) – Anchors whose IOU falls in the range (ignore_iou_thresh, pos_iou_thresh) are not penalized on objectness score.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
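
The interplay of pos_iou_thresh and ignore_iou_thresh above amounts to a three-way split of anchors for the objectness loss; a pure-Python sketch (illustrative only, not GluonCV internals):

```python
# Illustrative sketch: anchors in (ignore_iou_thresh, pos_iou_thresh) are
# neither positives nor penalized as negatives for objectness.
def objectness_mask(ious, pos_iou_thresh=1.0, ignore_iou_thresh=0.7):
    mask = []
    for iou in ious:
        if iou >= pos_iou_thresh:
            mask.append(1)    # positive anchor
        elif iou > ignore_iou_thresh:
            mask.append(0)    # ignored: no objectness penalty
        else:
            mask.append(-1)   # negative anchor
    return mask

mask = objectness_mask([1.0, 0.8, 0.3])  # -> [1, 0, -1]
```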

property classes

Return names of (non-background) categories.

Returns

Names of (non-background) categories.

Return type

iterable of str

hybrid_forward(F, x, *args)[source]

YOLOV3 network hybrid forward.

Parameters
  • F (mxnet.nd or mxnet.sym) – F is mxnet.sym if hybridized or mxnet.nd if not.

  • x (mxnet.nd.NDArray) – Input data.

  • *args – During training, extra inputs are required: (gt_boxes, obj_t, centers_t, scales_t, weights_t, clas_t). These are generated by YOLOV3PrefetchTargetGenerator in the dataloader transform function.

Returns

During inference, return detections in shape (B, N, 6) with format (cid, score, xmin, ymin, xmax, ymax). During training, return losses only: (obj_loss, center_loss, scale_loss, cls_loss).

Return type

(tuple of) mxnet.nd.NDArray

property num_class

Return the number of (non-background) categories.

Returns

Number of (non-background) categories.

Return type

int

reset_class(classes, reuse_weights=None)[source]

Reset class categories and class predictors.

Parameters
  • classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.

  • reuse_weights (dict) – A {new_integer : old_integer} mapping dict or a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.

Example

>>> net = gluoncv.model_zoo.get_model('yolo3_darknet53_voc', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is at index 14 in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
set_nms(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]

Set non-maximum suppression parameters.

Parameters
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.

  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.

Returns

Return type

None

gluoncv.model_zoo.alexnet(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

AlexNet model from the “One weird trick…” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.

class gluoncv.model_zoo.cifar_ResidualAttentionModel(scale, m, classes=10, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper. Input size is 32 x 32.

Parameters
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

gluoncv.model_zoo.cifar_residualattentionnet452(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.cifar_residualattentionnet56(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.cifar_residualattentionnet92(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.cifar_resnet110_v1(**kwargs)[source]

ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.cifar_resnet110_v2(**kwargs)[source]

ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.cifar_resnet20_v1(**kwargs)[source]

ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.cifar_resnet20_v2(**kwargs)[source]

ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.cifar_resnet56_v1(**kwargs)[source]

ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.cifar_resnet56_v2(**kwargs)[source]

ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.cifar_wideresnet16_10(**kwargs)[source]

WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters
gluoncv.model_zoo.cifar_wideresnet28_10(**kwargs)[source]

WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters
gluoncv.model_zoo.cifar_wideresnet40_8(**kwargs)[source]

WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters
gluoncv.model_zoo.darknet53(**kwargs)[source]

Darknet v3 53-layer network. Reference: https://arxiv.org/pdf/1804.02767.pdf.

Parameters
Returns

Darknet network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.densenet121(**kwargs)[source]

Densenet-BC 121-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters
gluoncv.model_zoo.densenet161(**kwargs)[source]

Densenet-BC 161-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters
gluoncv.model_zoo.densenet169(**kwargs)[source]

Densenet-BC 169-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters
gluoncv.model_zoo.densenet201(**kwargs)[source]

Densenet-BC 201-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters
gluoncv.model_zoo.faster_rcnn_fpn_bn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, num_devices=0, **kwargs)[source]

Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • num_devices (int, default is 0) – Number of devices for the sync batch norm layer. If less than 1, all available devices are used.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_fpn_bn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_fpn_resnet101_v1d_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_fpn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_fpn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_fpn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_resnet101_v1d_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”

Parameters
  • pretrained (bool, optional, default is False) – Load pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_resnet101_v1d_custom(classes, transfer=None, pretrained_base=True, pretrained=False, **kwargs)[source]

Faster RCNN model with resnet101_v1d base network on custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from faster RCNN networks trained on other datasets.

  • pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Returns

Hybrid faster RCNN network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.faster_rcnn_resnet101_v1d_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”

Parameters
  • pretrained (bool, optional, default is False) – Load pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_resnet101_v1d_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_resnet50_v1b_custom(classes, transfer=None, pretrained_base=True, pretrained=False, **kwargs)[source]

Faster RCNN model with resnet50_v1b base network on custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from faster RCNN networks trained on other datasets.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Returns

Hybrid faster RCNN network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.faster_rcnn_resnet50_v1b_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_resnet50_v1b_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_cifar_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
  • version (int) – Version of ResNet. Options are 1, 2.

  • num_layers (int) – Number of layers. Needs to be an integer in the form of 6*n+2, e.g. 20, 56, 110.

  • pretrained (bool, default False) – Whether to load the pretrained weights for model.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

gluoncv.model_zoo.get_cifar_wide_resnet(num_layers, width_factor=1, drop_rate=0.0, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

WideResNet model for CIFAR10 from “Wide Residual Networks” paper.

Parameters
  • num_layers (int) – Number of layers. Needs to be an integer in the form of 6*n+2, e.g. 20, 56, 110, 164.

  • width_factor (int) – The width factor to apply to the number of channels from the original resnet.

  • drop_rate (float) – The rate of dropout.

  • pretrained (bool, default False) – Whether to load the pretrained weights for model.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

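The 6*n+2 depth constraint above can be made concrete with a small helper. This is an illustrative sketch only, not part of the gluoncv API; the helper name is hypothetical:

```python
def check_cifar_resnet_layers(num_layers):
    """Validate the 6*n+2 depth constraint for CIFAR ResNets.

    Returns n, the number of residual blocks per stage.
    Illustrative only; gluoncv performs this check internally.
    """
    if num_layers < 8 or (num_layers - 2) % 6 != 0:
        raise ValueError(
            "num_layers must be of the form 6*n+2, e.g. 20, 56, 110; got %d"
            % num_layers)
    return (num_layers - 2) // 6

# ResNet-20 has 3 residual blocks per stage, ResNet-110 has 18.
print(check_cifar_resnet_layers(20))   # 3
print(check_cifar_resnet_layers(110))  # 18
```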
gluoncv.model_zoo.get_darknet(darknet_version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get darknet by version and num_layers info.

Parameters
  • darknet_version (str) – Darknet version, choices are [‘v3’].

  • num_layers (int) – Number of layers.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

Returns

Darknet network.

Return type

mxnet.gluon.HybridBlock

Examples

>>> model = get_darknet('v3', 53, pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

DeepLabV3

Parameters
  • dataset (str, default pascal_voc) – The dataset that model pretrained on. (pascal_voc, ade20k)

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_deeplab_plus(dataset='pascal_voc', backbone='xception', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

DeepLabV3Plus

Parameters
  • dataset (str, default pascal_voc) – The dataset that model pretrained on. (pascal_voc, ade20k)

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_plus(dataset='pascal_voc', backbone='xception', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_deeplab_plus_xception_coco(**kwargs)[source]

DeepLabV3Plus

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_plus_xception_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet101_ade(**kwargs)[source]

DeepLabV3

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet101_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet101_coco(**kwargs)[source]

DeepLabV3

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet101_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet101_voc(**kwargs)[source]

DeepLabV3

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet101_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet152_coco(**kwargs)[source]

DeepLabV3

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet152_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet152_voc(**kwargs)[source]

DeepLabV3

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet152_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet50_ade(**kwargs)[source]

DeepLabV3

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet50_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_faster_rcnn(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Utility function to return faster rcnn networks.

Parameters
  • name (str) – Model name.

  • dataset (str) – The name of dataset.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).

  • root (str) – Model weights storing path.

Returns

The Faster-RCNN network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), pretrained_base=True, **kwargs)[source]

FCN model from the paper “Fully Convolutional Network for semantic segmentation”

Parameters
  • dataset (str, default pascal_voc) – The dataset that model pretrained on. (pascal_voc, ade20k)

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • pretrained_base (bool or str, default True) – This will load pretrained backbone network, that was trained on ImageNet.

Examples

>>> model = get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet101_ade(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet101_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet101_coco(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on COCO dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet101_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet101_voc(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet101_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet50_ade(**kwargs)[source]

FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet50_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet50_voc(**kwargs)[source]

FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet50_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_mask_rcnn(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Utility function to return mask rcnn networks.

Parameters
  • name (str) – Model name.

  • dataset (str) – The name of dataset.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).

  • root (str) – Model weights storing path.

Returns

The Mask RCNN network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.get_mobilenet(multiplier, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.

Parameters
  • multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.get_mobilenet_v2(multiplier, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.

Parameters
  • multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

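The width-multiplier arithmetic described for both MobileNet variants can be sketched in a few lines. This helper is illustrative only (gluoncv applies the multiplier internally); the plain int truncation used here is an assumption about the rounding convention:

```python
def scaled_channels(base_channels, multiplier):
    """Apply a MobileNet-style width multiplier to a list of channel counts.

    Illustrative sketch; not part of the gluoncv API.
    """
    if multiplier < 0.25:
        raise ValueError("multipliers below 0.25 are not supported")
    # Each layer's channel count is the original size times the multiplier.
    return [int(c * multiplier) for c in base_channels]

print(scaled_channels([32, 64, 128], 0.5))  # [16, 32, 64]
```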
gluoncv.model_zoo.get_model(name, **kwargs)[source]

Returns a pre-defined model by name

Parameters
  • name (str) – Name of the model.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • classes (int) – Number of classes for the output layer.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Returns

The model.

Return type

HybridBlock

gluoncv.model_zoo.get_model_list()[source]

Get the entire list of model names in model_zoo.

Returns

Entire list of model names in model_zoo.

Return type

list of str

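get_model and get_model_list behave like a simple name-to-constructor registry with case-insensitive lookup. A minimal sketch of that dispatch pattern follows; the registry contents and return values here are hypothetical stand-ins, not the real gluoncv constructors:

```python
# Hypothetical registry: real gluoncv maps names to network constructors.
_models = {
    'cifar_resnet20_v1': lambda **kwargs: ('cifar_resnet20_v1', kwargs),
    'resnet50_v1b': lambda **kwargs: ('resnet50_v1b', kwargs),
}

def get_model(name, **kwargs):
    """Look up a model constructor by (lowercased) name and call it."""
    name = name.lower()
    if name not in _models:
        raise ValueError('Model %s is not supported. Available options are:\n\t%s'
                         % (name, '\n\t'.join(sorted(_models))))
    return _models[name](**kwargs)

def get_model_list():
    """Return the sorted list of registered model names."""
    return sorted(_models)

print(get_model_list())  # ['cifar_resnet20_v1', 'resnet50_v1b']
```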
gluoncv.model_zoo.get_nasnet(repeat=6, penultimate_filters=4032, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters
  • repeat (int) – Number of cell repeats

  • penultimate_filters (int) – Number of filters in the penultimate layer of the network

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), pretrained_base=True, **kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • dataset (str, default pascal_voc) – The dataset that model pretrained on. (pascal_voc, ade20k)

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • pretrained_base (bool or str, default True) – This will load pretrained backbone network, that was trained on ImageNet.

Examples

>>> model = get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_ade(**kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_citys(**kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_citys(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_coco(**kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_voc(**kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet50_ade(**kwargs)[source]

Pyramid Scene Parsing Network

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet50_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', use_se=False, **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
  • version (int) – Version of ResNet. Options are 1, 2.

  • num_layers (int) – Number of layers. Options are 18, 34, 50, 101, 152.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.

  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

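For reference, the num_layers options above correspond to the standard per-stage block counts from the ResNet papers. The table below summarizes those published configurations; the dictionary name and layout are illustrative, not gluoncv internals:

```python
# Block type and residual blocks per stage for each supported depth,
# per "Deep Residual Learning for Image Recognition".
RESNET_SPEC = {
    18:  ('basic_block', [2, 2, 2, 2]),
    34:  ('basic_block', [3, 4, 6, 3]),
    50:  ('bottle_neck', [3, 4, 6, 3]),
    101: ('bottle_neck', [3, 4, 23, 3]),
    152: ('bottle_neck', [3, 8, 36, 3]),
}

block_type, layers = RESNET_SPEC[50]
print(block_type, sum(layers))  # bottle_neck 16
```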
gluoncv.model_zoo.get_se_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
  • version (int) – Version of ResNet. Options are 1, 2.

  • num_layers (int) – Number of layers. Options are 18, 34, 50, 101, 152.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.get_ssd(name, base_size, features, filters, sizes, ratios, steps, classes, dataset, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get SSD models.

Parameters
  • name (str or None) – Model name, if None is used, you must specify features to be a HybridBlock.

  • base_size (int) – Base image size for training, this is fixed once training is assigned. A fixed base size still allows you to have variable input size during test.

  • features (iterable of str or HybridBlock) – List of network internal output names, in order to specify which layers are used for predicting bbox values. If name is None, features must be a HybridBlock which generates multiple outputs for prediction.

  • filters (iterable of float or None) – List of convolution layer channels which is going to be appended to the base network feature extractor. If name is None, this is ignored.

  • sizes (iterable of float) – Sizes of anchor boxes, this should be a list of floats, in incremental order. The length of sizes must be len(layers) + 1. For example, a two stage SSD model can have sizes = [30, 60, 90], and it converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to original paper.

  • ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must be equal to the number of SSD output layers.

  • steps (list of int) – Step size of anchor boxes in each output layer.

  • classes (iterable of str) – Names of categories.

  • dataset (str) – Name of dataset. This is used to identify model name because models trained on different datasets are going to be very different.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).

  • root (str) – Model weights storing path.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

Returns

An SSD detection network.

Return type

HybridBlock

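The conversion of the incremental sizes list into per-stage (min, max) anchor pairs described above can be sketched directly. Illustrative only; gluoncv performs this pairing internally when building the SSD anchors:

```python
def anchor_size_pairs(sizes):
    """Convert an incremental anchor size list into per-stage pairs.

    [30, 60, 90] -> [(30, 60), (60, 90)], as described in the sizes
    parameter above. Illustrative sketch, not part of the gluoncv API.
    """
    if sorted(sizes) != list(sizes):
        raise ValueError("sizes must be in incremental order")
    # Each stage uses its own size and the next one as (min, max).
    return list(zip(sizes[:-1], sizes[1:]))

print(anchor_size_pairs([30, 60, 90]))  # [(30, 60), (60, 90)]
```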
gluoncv.model_zoo.get_vgg(num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • num_layers (int) – Number of layers for the variant of VGG. Options are 11, 13, 16, 19.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.

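The num_layers options for VGG count the weight layers: per-stage convolutions plus three fully connected layers. The table below sketches the standard configurations from the VGG paper; the dictionary name is illustrative, though gluoncv keeps an equivalent table internally:

```python
# (conv layers per stage, output channels per stage) for each VGG variant.
# Summing the conv counts and adding the 3 FC layers gives num_layers.
VGG_SPEC = {
    11: ([1, 1, 2, 2, 2], [64, 128, 256, 512, 512]),
    13: ([2, 2, 2, 2, 2], [64, 128, 256, 512, 512]),
    16: ([2, 2, 3, 3, 3], [64, 128, 256, 512, 512]),
    19: ([2, 2, 4, 4, 4], [64, 128, 256, 512, 512]),
}

for num_layers, (layers, _) in sorted(VGG_SPEC.items()):
    print(num_layers, sum(layers) + 3)  # each prints matching totals
```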
gluoncv.model_zoo.get_vgg_atrous_extractor(num_layers, im_size, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get VGG atrous feature extractor networks.

Parameters
  • num_layers (int) – VGG types, can be 11,13,16,19.

  • im_size (int) – VGG detection input size, can be 300, 512.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (mx.Context) – Context such as mx.cpu(), mx.gpu(0).

  • root (str) – Model weights storing path.

Returns

The returned network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.get_xcetption(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Xception model from the “Xception: Deep Learning with Depthwise Separable Convolutions” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

gluoncv.model_zoo.get_xcetption_71(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Xception-71 model from the “Xception: Deep Learning with Depthwise Separable Convolutions” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

gluoncv.model_zoo.get_yolov3(name, stages, filters, anchors, strides, classes, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get YOLOv3 models.

Parameters
  • name (str or None) – Model name, if None is used, you must specify stages to be a HybridBlock.

  • stages (iterable of str or HybridBlock) – List of network internal output names, in order to specify which layers are used for predicting bbox values. If name is None, stages must be a HybridBlock which generates multiple outputs for prediction.

  • filters (iterable of float or None) – List of convolution layer channels which is going to be appended to the base network feature extractor. If name is None, this is ignored.

  • anchors (iterable of list) – Anchor box sizes for each output layer, from the finest to the coarsest feature map.

  • strides (list of int) – Strides of the feature maps for each output layer.

  • classes (iterable of str) – Names of categories.

  • dataset (str) – Name of dataset. This is used to identify model name because models trained on different datasets are going to be very different.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).

  • root (str) – Path where model weights are stored.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

Returns

A YOLOv3 detection network.

Return type

HybridBlock
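The anchors and strides parameters come in matched per-stage lists: each detection stage predicts on a grid whose size is the input size divided by that stage's stride. The values below follow the standard YOLOv3/COCO configuration and are illustrative, not necessarily GluonCV's exact defaults:

```python
# Per-stage anchors (pixel widths/heights, three boxes per stage) and the
# feature-map strides they pair with, as in the original YOLOv3 paper.
anchors = [
    [10, 13, 16, 30, 33, 23],       # finest grid, small objects
    [30, 61, 62, 45, 59, 119],      # medium objects
    [116, 90, 156, 198, 373, 326],  # coarsest grid, large objects
]
strides = [8, 16, 32]

def grid_shapes(image_size, strides):
    """Each stage predicts on a grid of (image_size // stride) cells."""
    return [(image_size // s, image_size // s) for s in strides]

print(grid_shapes(416, strides))  # [(52, 52), (26, 26), (13, 13)]
```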

gluoncv.model_zoo.inception_v3(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.

Parameters
gluoncv.model_zoo.mask_rcnn_fpn_bn_mobilenet1_0_coco(pretrained=False, pretrained_base=True, num_devices=0, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • num_devices (int, default is 0) – Number of devices for the sync batch norm layer. If fewer than 1, all available devices are used.

  • rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.

  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.

  • rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_fpn_bn_mobilenet1_0_coco(pretrained=True)
>>> print(model)
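The rpn_test_pre_nms and rpn_test_post_nms parameters above bound the proposal list at two points: proposals are first sorted by score and cut to the top pre_nms, then NMS suppresses heavy overlaps, and the top post_nms survivors are kept. A minimal plain-Python sketch of that filtering (not GluonCV's actual implementation):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def filter_proposals(proposals, pre_nms, post_nms, iou_thresh=0.7):
    """proposals: list of (score, box). Keep top pre_nms, NMS, top post_nms."""
    proposals = sorted(proposals, key=lambda p: p[0], reverse=True)[:pre_nms]
    kept = []
    for score, box in proposals:
        # Greedy NMS: keep a box only if it overlaps no kept box too much.
        if all(iou(box, k[1]) < iou_thresh for k in kept):
            kept.append((score, box))
    return kept[:post_nms]
```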
gluoncv.model_zoo.mask_rcnn_fpn_bn_resnet18_v1b_coco(pretrained=False, pretrained_base=True, num_devices=0, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • num_devices (int, default is 0) – Number of devices for the sync batch norm layer. If fewer than 1, all available devices are used.

  • rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.

  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.

  • rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_fpn_bn_resnet18_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mask_rcnn_fpn_resnet101_v1d_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_fpn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mask_rcnn_fpn_resnet18_v1b_coco(pretrained=False, pretrained_base=True, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.

  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.

  • rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_fpn_resnet18_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mask_rcnn_fpn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_fpn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mask_rcnn_resnet101_v1d_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mask_rcnn_resnet18_v1b_coco(pretrained=False, pretrained_base=True, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.

  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.

  • rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_resnet18_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mask_rcnn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mobilenet0_25(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.25.

Parameters
gluoncv.model_zoo.mobilenet0_5(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.5.

Parameters
gluoncv.model_zoo.mobilenet0_75(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.75.

Parameters
gluoncv.model_zoo.mobilenet1_0(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 1.0.

Parameters
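The suffix on each MobileNet constructor is the width multiplier: every layer's channel count is scaled by that factor, so mobilenet0_25 … mobilenet1_0 share one architecture at different widths. A plain-Python sketch of the scaling (channel counts are the standard MobileNet-V1 values, shown for illustration):

```python
# Representative MobileNet-V1 per-layer channel counts at width 1.0.
base_channels = [32, 64, 128, 128, 256, 256, 512, 1024]

def scale_channels(channels, alpha):
    """Apply the width multiplier alpha to every layer's channel count."""
    return [int(c * alpha) for c in channels]

print(scale_channels(base_channels, 0.5))   # half-width network
print(scale_channels(base_channels, 0.25))  # quarter-width network
```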
gluoncv.model_zoo.mobilenet_v2_0_25(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.

Parameters
gluoncv.model_zoo.mobilenet_v2_0_5(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.

Parameters
gluoncv.model_zoo.mobilenet_v2_0_75(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.

Parameters
gluoncv.model_zoo.mobilenet_v2_1_0(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.

Parameters
gluoncv.model_zoo.nasnet_4_1056(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters
  • repeat (int) – Number of cell repeats

  • penultimate_filters (int) – Number of filters in the penultimate layer of the network

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
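The repeat and penultimate_filters parameters are encoded directly in the constructor names, nasnet_&lt;repeat&gt;_&lt;penultimate_filters&gt; (an assumption read off the names nasnet_4_1056 … nasnet_7_1920 in this section). A small sketch of that convention:

```python
def nasnet_config(name):
    """Parse a nasnet_<repeat>_<penultimate_filters> constructor name."""
    _, repeat, filters = name.split("_")
    return {"repeat": int(repeat), "penultimate_filters": int(filters)}

print(nasnet_config("nasnet_6_4032"))
```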

gluoncv.model_zoo.nasnet_5_1538(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters
  • repeat (int) – Number of cell repeats

  • penultimate_filters (int) – Number of filters in the penultimate layer of the network

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.nasnet_6_4032(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters
  • repeat (int) – Number of cell repeats

  • penultimate_filters (int) – Number of filters in the penultimate layer of the network

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.nasnet_7_1920(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters
  • repeat (int) – Number of cell repeats

  • penultimate_filters (int) – Number of filters in the penultimate layer of the network

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.residualattentionnet128(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.residualattentionnet164(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.residualattentionnet200(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.residualattentionnet236(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.residualattentionnet452(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.residualattentionnet56(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.residualattentionnet92(**kwargs)[source]

AttentionModel model from “Residual Attention Network for Image Classification” paper.

Parameters
gluoncv.model_zoo.resnet101_v1(**kwargs)[source]

ResNet-101 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.resnet101_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-101 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
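The dilated=True option mentioned above trades downsampling for dilation: a vanilla ResNet halves the spatial size in its later stages for an overall output stride of 32, while the dilated variant keeps conv4/conv5 at stride 1 (dilating their convolutions instead) so the output stride stays at 8, which is what segmentation heads expect. A sketch of the arithmetic, with stage strides as assumptions based on the standard ResNet layout:

```python
def output_stride(dilated):
    """Overall input-to-feature-map stride of a ResNet-style backbone."""
    stem_stride = 4                    # 7x7 conv (s=2) + max pool (s=2)
    stage_strides = [1, 2, 2, 2]       # conv2..conv5 in a vanilla ResNet
    if dilated:
        stage_strides = [1, 2, 1, 1]   # conv4/conv5 dilate instead of stride
    s = stem_stride
    for st in stage_strides:
        s *= st
    return s

print(output_stride(False), output_stride(True))  # 32 8
```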

gluoncv.model_zoo.resnet101_v1b_gn(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-101 GroupNorm model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.

gluoncv.model_zoo.resnet101_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1c-101 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet101_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1d-101 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet101_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1e-101 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet101_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1s-101 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet101_v2(**kwargs)[source]

ResNet-101 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.resnet152_v1(**kwargs)[source]

ResNet-152 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.resnet152_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-152 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.

gluoncv.model_zoo.resnet152_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1c-152 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet152_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1d-152 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet152_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1e-152 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet152_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1s-152 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet152_v2(**kwargs)[source]

ResNet-152 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.resnet18_v1(**kwargs)[source]

ResNet-18 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.resnet18_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-18 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
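The last_gamma option above zero-initializes the scale of the final BatchNorm in each bottleneck, so a residual block computes x + gamma * F(x) = x at initialization and every block starts as the identity mapping. A toy illustration of that effect (plain Python, not the actual Gluon layers):

```python
def residual_block(x, branch, gamma):
    """out = x + gamma * F(x), mimicking a BN-scaled residual branch."""
    return [xi + gamma * bi for xi, bi in zip(x, branch(x))]

x = [1.0, -2.0, 3.0]
branch = lambda v: [10.0 * vi for vi in v]  # arbitrary residual branch

# With gamma = 0 (last_gamma=True at init) the block is the identity.
assert residual_block(x, branch, gamma=0.0) == x
```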

gluoncv.model_zoo.resnet18_v2(**kwargs)[source]

ResNet-18 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.resnet34_v1(**kwargs)[source]

ResNet-34 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.resnet34_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-34 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.

gluoncv.model_zoo.resnet34_v2(**kwargs)[source]

ResNet-34 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.resnet50_v1(**kwargs)[source]

ResNet-50 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters
gluoncv.model_zoo.resnet50_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-50 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm) Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.

gluoncv.model_zoo.resnet50_v1b_gn(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-50 GroupNorm model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.

  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.

gluoncv.model_zoo.resnet50_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1c-50 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet50_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1d-50 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet50_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1e-50 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet50_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1s-50 model.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.

  • norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.

gluoncv.model_zoo.resnet50_v2(**kwargs)[source]

ResNet-50 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet101_v1(**kwargs)[source]

SE-ResNet-101 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet101_v2(**kwargs)[source]

SE-ResNet-101 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet152_v1(**kwargs)[source]

SE-ResNet-152 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet152_v2(**kwargs)[source]

SE-ResNet-152 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet18_v1(**kwargs)[source]

SE-ResNet-18 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet18_v2(**kwargs)[source]

SE-ResNet-18 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet34_v1(**kwargs)[source]

SE-ResNet-34 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet34_v2(**kwargs)[source]

SE-ResNet-34 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet50_v1(**kwargs)[source]

SE-ResNet-50 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.se_resnet50_v2(**kwargs)[source]

SE-ResNet-50 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters
gluoncv.model_zoo.simple_pose_resnet101_v1b(**kwargs)[source]

ResNet-101 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.simple_pose_resnet101_v1d(**kwargs)[source]

ResNet-101-d backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.simple_pose_resnet152_v1b(**kwargs)[source]

ResNet-152 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.simple_pose_resnet152_v1d(**kwargs)[source]

ResNet-152-d backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.simple_pose_resnet18_v1b(**kwargs)[source]

ResNet-18 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.simple_pose_resnet50_v1b(**kwargs)[source]

ResNet-50 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.simple_pose_resnet50_v1d(**kwargs)[source]

ResNet-50-d backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.squeezenet1_0(**kwargs)[source]

SqueezeNet 1.0 model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.squeezenet1_1(**kwargs)[source]

SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.ssd_300_mobilenet0_25_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet0.25 base network for COCO.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_300_mobilenet0_25_custom(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]

SSD architecture with mobilenet0.25 300 base network for custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.

Returns

An SSD detection network.

Return type

HybridBlock

Example

>>> net = ssd_300_mobilenet0_25_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_300_mobilenet0_25_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_300_mobilenet0_25_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet0.25 base network for Pascal VOC.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_300_vgg16_atrous_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for COCO.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_300_vgg16_atrous_custom(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.

Returns

An SSD detection network.

Return type

HybridBlock

Example

>>> net = ssd_300_vgg16_atrous_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_300_vgg16_atrous_custom(classes=['foo', 'bar'], transfer='coco')
gluoncv.model_zoo.ssd_300_vgg16_atrous_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_mobilenet1_0_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet1.0 base network for COCO.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_mobilenet1_0_custom(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]

SSD architecture with mobilenet1.0 512 base network for custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.

Returns

An SSD detection network.

Return type

HybridBlock

Example

>>> net = ssd_512_mobilenet1_0_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_mobilenet1_0_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_512_mobilenet1_0_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet1.0 base network for Pascal VOC.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_resnet101_v2_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v2 101 layers.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_resnet152_v2_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v2 152 layers.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_resnet18_v1_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 18 layers.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_resnet18_v1_custom(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]

SSD architecture with ResNet18 v1 512 base network for custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.

Returns

An SSD detection network.

Return type

HybridBlock

Example

>>> net = ssd_512_resnet18_v1_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_resnet18_v1_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_512_resnet18_v1_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 18 layers.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_resnet50_v1_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 50 layers for COCO.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_resnet50_v1_custom(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]

SSD architecture with ResNet50 v1 512 base network for custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.

Returns

An SSD detection network.

Return type

HybridBlock

Example

>>> net = ssd_512_resnet50_v1_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_resnet50_v1_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_512_resnet50_v1_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 50 layers.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_vgg16_atrous_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous layers for COCO.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.ssd_512_vgg16_atrous_custom(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]

SSD architecture with VGG16 atrous 512x512 base network for custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.

Returns

An SSD detection network.

Return type

HybridBlock

Example

>>> net = ssd_512_vgg16_atrous_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_vgg16_atrous_custom(classes=['foo', 'bar'], transfer='coco')
gluoncv.model_zoo.ssd_512_vgg16_atrous_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 512x512 base network.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.

Returns

An SSD detection network.

Return type

HybridBlock

gluoncv.model_zoo.vgg11(**kwargs)[source]

VGG-11 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg11_bn(**kwargs)[source]

VGG-11 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg13(**kwargs)[source]

VGG-13 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg13_bn(**kwargs)[source]

VGG-13 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg16(**kwargs)[source]

VGG-16 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg16_atrous_300(**kwargs)[source]

Get the VGG-16 atrous feature extractor network for a 300 input size.

gluoncv.model_zoo.vgg16_atrous_512(**kwargs)[source]

Get the VGG-16 atrous feature extractor network for a 512 input size.

gluoncv.model_zoo.vgg16_bn(**kwargs)[source]

VGG-16 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg19(**kwargs)[source]

VGG-19 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.vgg19_bn(**kwargs)[source]

VGG-19 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters
  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

  • ctx (Context, default CPU) – The context in which to load the pretrained weights.

  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.

gluoncv.model_zoo.yolo3_darknet53_coco(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with darknet53 base network on COCO dataset.

Parameters
  • pretrained_base (bool) – Whether to fetch and load pretrained weights for the base network.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_darknet53_custom(classes, transfer=None, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with darknet53 base network on custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from YOLO networks trained on other datasets.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_darknet53_voc(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with darknet53 base network on VOC dataset.

Parameters
  • pretrained_base (bool) – Whether to fetch and load pretrained weights for the base network.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_mobilenet0_25_coco(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with mobilenet0.25 base network on COCO dataset.

Parameters
  • pretrained_base (bool) – Whether to fetch and load pretrained weights for the base network.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_mobilenet0_25_custom(classes, transfer=None, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with mobilenet0.25 base network on custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from YOLO networks trained on other datasets.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_mobilenet0_25_voc(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with mobilenet0.25 base network on VOC dataset.

Parameters
  • pretrained_base (bool) – Whether to fetch and load pretrained weights for the base network.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_mobilenet1_0_coco(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with mobilenet1.0 base network on COCO dataset.

Parameters
  • pretrained_base (bool) – Whether to fetch and load pretrained weights for the base network.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_mobilenet1_0_custom(classes, transfer=None, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with mobilenet1.0 base network on custom dataset.

Parameters
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.

  • transfer (str or None) – If not None, will try to reuse pre-trained weights from YOLO networks trained on other datasets.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_mobilenet1_0_voc(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]

YOLO3 multi-scale with mobilenet1.0 base network on VOC dataset.

Parameters
  • pretrained_base (bool) – Whether to fetch and load pretrained weights for the base network.

  • pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.

Returns

Fully hybrid yolo3 network.

Return type

mxnet.gluon.HybridBlock