gluoncv.model_zoo

GluonCV Model Zoo

gluoncv.model_zoo.get_model

Returns a pre-defined GluonCV model by name.

Hint

This is the recommended method for getting a pre-defined model.

It supports directly loading models from the Gluon Model Zoo as well.

get_model Returns a pre-defined model by name

Image Classification

CIFAR

get_cifar_resnet Get ResNet V1/V2 models for CIFAR from the “Deep Residual Learning for Image Recognition” and “Identity Mappings in Deep Residual Networks” papers.
cifar_resnet20_v1 ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet56_v1 ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet110_v1 ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet20_v2 ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
cifar_resnet56_v2 ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
cifar_resnet110_v2 ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
get_cifar_wide_resnet Get WideResNet models for CIFAR from the “Wide Residual Networks” paper.
cifar_wideresnet16_10 WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.
cifar_wideresnet28_10 WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.
cifar_wideresnet40_8 WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.

ImageNet

We apply a dilation strategy to pre-trained ResNet models (output stride of 8). Please see gluoncv.model_zoo.SegBaseModel for how to use it.

ResNetV1b Pre-trained ResNetV1b model, which produces stride-8 feature maps at conv5.
resnet18_v1b Constructs a ResNetV1b-18 model.
resnet34_v1b Constructs a ResNetV1b-34 model.
resnet50_v1b Constructs a ResNetV1b-50 model.
resnet101_v1b Constructs a ResNetV1b-101 model.
resnet152_v1b Constructs a ResNetV1b-152 model.

Object Detection

SSD

SSD Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.
get_ssd Get SSD models.
ssd_300_vgg16_atrous_voc SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.
ssd_300_vgg16_atrous_coco SSD architecture with VGG16 atrous 300x300 base network for COCO.
ssd_300_vgg16_atrous_custom SSD architecture with VGG16 atrous 300x300 base network for a custom dataset.
ssd_512_vgg16_atrous_voc SSD architecture with VGG16 atrous 512x512 base network.
ssd_512_vgg16_atrous_coco SSD architecture with VGG16 atrous layers for COCO.
ssd_512_vgg16_atrous_custom SSD architecture with VGG16 atrous 512x512 base network for a custom dataset.
ssd_512_resnet50_v1_voc SSD architecture with ResNet v1 50 layers.
ssd_512_resnet50_v1_coco SSD architecture with ResNet v1 50 layers for COCO.
ssd_512_resnet50_v1_custom SSD architecture with ResNet50 v1 512 base network for custom dataset.
ssd_512_resnet101_v2_voc SSD architecture with ResNet v2 101 layers.
ssd_512_resnet152_v2_voc SSD architecture with ResNet v2 152 layers.
VGGAtrousExtractor VGG Atrous multi layer feature extractor which produces multiple output feature maps.
get_vgg_atrous_extractor Get VGG atrous feature extractor networks.
vgg16_atrous_300 Get VGG atrous 16 layer 300 in_size feature extractor networks.
vgg16_atrous_512 Get VGG atrous 16 layer 512 in_size feature extractor networks.

Faster RCNN

FasterRCNN Faster RCNN network.
get_faster_rcnn Utility function to return faster rcnn networks.
faster_rcnn_resnet50_v1b_voc Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J.
faster_rcnn_resnet50_v1b_coco Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J.
faster_rcnn_resnet50_v1b_custom Faster RCNN model with resnet50_v1b base network on custom dataset.

YOLOv3

YOLOV3 YOLO V3 detection network.
get_yolov3 Get YOLOV3 models.
yolo3_darknet53_voc YOLO3 multi-scale with darknet53 base network on VOC dataset.
yolo3_darknet53_coco YOLO3 multi-scale with darknet53 base network on COCO dataset.
yolo3_darknet53_custom YOLO3 multi-scale with darknet53 base network on custom dataset.

Instance Segmentation

Mask RCNN

MaskRCNN Mask RCNN network.
get_mask_rcnn Utility function to return mask rcnn networks.
mask_rcnn_resnet50_v1b_coco Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R.

Semantic Segmentation

FCN

FCN Fully Convolutional Networks for Semantic Segmentation
get_fcn FCN model from the paper “Fully Convolutional Networks for Semantic Segmentation”
get_fcn_resnet50_voc FCN model with base network ResNet-50 pre-trained on the Pascal VOC dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”
get_fcn_resnet101_voc FCN model with base network ResNet-101 pre-trained on the Pascal VOC dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”
get_fcn_resnet101_coco FCN model with base network ResNet-101 pre-trained on the COCO dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”
get_fcn_resnet50_ade FCN model with base network ResNet-50 pre-trained on the ADE20K dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”
get_fcn_resnet101_ade FCN model with base network ResNet-101 pre-trained on the ADE20K dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”

PSPNet

PSPNet Pyramid Scene Parsing Network
get_psp Pyramid Scene Parsing Network :param dataset: The dataset that the model was pretrained on.
get_psp_resnet101_coco Pyramid Scene Parsing Network :param pretrained: Whether to load the pretrained weights for the model.
get_psp_resnet101_voc Pyramid Scene Parsing Network :param pretrained: Whether to load the pretrained weights for the model.
get_psp_resnet50_ade Pyramid Scene Parsing Network :param pretrained: Whether to load the pretrained weights for the model.
get_psp_resnet101_ade Pyramid Scene Parsing Network :param pretrained: Whether to load the pretrained weights for the model.

DeepLabV3

DeepLabV3 DeepLabV3 network :param nclass: Number of categories for the training dataset.
get_deeplab DeepLabV3 :param dataset: The dataset that the model was pretrained on.
get_deeplab_resnet101_coco DeepLabV3 :param pretrained: Whether to load the pretrained weights for the model.
get_deeplab_resnet101_voc DeepLabV3 :param pretrained: Whether to load the pretrained weights for the model.
get_deeplab_resnet50_ade DeepLabV3 :param pretrained: Whether to load the pretrained weights for the model.
get_deeplab_resnet101_ade DeepLabV3 :param pretrained: Whether to load the pretrained weights for the model.

API Reference

Network definitions of GluonCV models


class gluoncv.model_zoo.AlexNet(classes=1000, **kwargs)[source]

AlexNet model from the “One weird trick…” paper.

Parameters:classes (int, default 1000) – Number of classes for the output layer.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BasicBlockV1(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, **kwargs)[source]

BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 18, 34 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BasicBlockV1b(planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs={}, **kwargs)[source]

ResNetV1b BasicBlockV1b

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BasicBlockV2(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, **kwargs)[source]

BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 18, 34 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BottleneckV1(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, **kwargs)[source]

Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 50, 101, 152 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BottleneckV1b(planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs={}, last_gamma=False, **kwargs)[source]

ResNetV1b BottleneckV1b

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BottleneckV2(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, **kwargs)[source]

Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 50, 101, 152 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.DarknetV3(layers, channels, classes=1000, num_sync_bn_devices=-1, **kwargs)[source]

Darknet v3.

Parameters:
  • layers (iterable) – Number of residual blocks in each stage.
  • channels (iterable) – Number of output channels in each stage.
  • classes (int, default is 1000) – Number of classes, which determines the dense layer output channels.
  • num_sync_bn_devices (int, default is -1) – Number of devices for training. If num_sync_bn_devices < 2, SyncBatchNorm is disabled.
features

mxnet.gluon.nn.HybridSequential – Feature extraction layers.

output

mxnet.gluon.nn.Dense – A classes(1000)-way Fully-Connected Layer.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.DeepLabV3(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]
Parameters:
  • nclass (int) – Number of categories for the training dataset.
  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • aux (bool) – Auxiliary loss.

Reference:

Chen, Liang-Chieh, et al. “Rethinking atrous convolution for semantic image segmentation.” arXiv preprint arXiv:1706.05587 (2017).
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.DenseNet(num_init_features, growth_rate, block_config, bn_size=4, dropout=0, classes=1000, **kwargs)[source]

Densenet-BC model from the “Densely Connected Convolutional Networks” paper.

Parameters:
  • num_init_features (int) – Number of filters to learn in the first convolution layer.
  • growth_rate (int) – Number of filters to add each layer (k in the paper).
  • block_config (list of int) – List of integers for numbers of layers in each pooling block.
  • bn_size (int, default 4) – Multiplicative factor for number of bottle neck layers. (i.e. bn_size * k features in the bottleneck layer)
  • dropout (float, default 0) – Rate of dropout after each dense layer.
  • classes (int, default 1000) – Number of classification classes.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.FCN(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]

Fully Convolutional Networks for Semantic Segmentation

Parameters:
  • nclass (int) – Number of categories for the training dataset.
  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm;
  • pretrained_base (bool) – Whether the FCN backbone (the encoder) is pretrained. If True, weights of a model trained on ImageNet are loaded.

Reference:

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” CVPR, 2015

Examples

>>> model = FCN(nclass=21, backbone='resnet50')
>>> print(model)
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.FasterRCNN(features, top_features, classes, short=600, max_size=1000, train_patterns=None, nms_thresh=0.3, nms_topk=400, post_nms=100, roi_mode='align', roi_size=(14, 14), stride=16, clip=None, rpn_channel=1024, base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2), alloc_size=(128, 128), rpn_nms_thresh=0.7, rpn_train_pre_nms=12000, rpn_train_post_nms=2000, rpn_test_pre_nms=6000, rpn_test_post_nms=300, rpn_min_size=16, num_sample=128, pos_iou_thresh=0.5, pos_ratio=0.25, additional_output=False, **kwargs)[source]

Faster RCNN network.

Parameters:
  • features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
  • top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
  • classes (iterable of str) – Names of categories, its length is num_class.
  • short (int, default is 600.) – Input image short side size.
  • max_size (int, default is 1000.) – Maximum size of input image long side.
  • train_patterns (str, default is None.) – Matching pattern for trainable parameters.
  • nms_thresh (float, default is 0.3.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) –
    Apply NMS to top k detection results, use -1 to disable so that every Detection
    result is used in NMS.
  • post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
  • roi_mode (str, default is align) – ROI pooling mode. Currently support ‘pool’ and ‘align’.
  • roi_size (tuple of int, length 2, default is (14, 14)) – (height, width) of the ROI region.
  • stride (int, default is 16) – Feature map stride with respect to original image. This is usually the ratio between original image size and feature map size.
  • clip (float, default is None) – Clip bounding box target to this value.
  • rpn_channel (int, default is 1024) – Channel number used in RPN convolutional layers.
  • base_size (int) – The width(and height) of reference anchor box.
  • scales (iterable of float, default is (8, 16, 32)) –

    The areas of anchor boxes. We use the following form to compute the shapes of anchors:

    \[width_{anchor} = size_{base} \times scale \times \sqrt{1 / ratio}\]
    \[height_{anchor} = size_{base} \times scale \times \sqrt{ratio}\]
  • ratios (iterable of float, default is (0.5, 1, 2)) – The aspect ratios of anchor boxes. We expect it to be a list or tuple.
  • alloc_size (tuple of int) – Allocate size for the anchor boxes as (H, W). Usually we generate enough anchors for large feature map, e.g. 128x128. Later in inference we can have variable input sizes, at which time we can crop corresponding anchors from this large anchor map so we can skip re-generating anchors for each input.
  • rpn_train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training of RPN.
  • rpn_train_post_nms (int, default is 2000) – Return top proposal results after NMS in training of RPN.
  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
  • rpn_test_post_nms (int, default is 300) – Return top proposal results after NMS in testing of RPN.
  • rpn_nms_thresh (float, default is 0.7) – IOU threshold for NMS. It is used to remove overlapping proposals.
  • train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training.
  • train_post_nms (int, default is 2000) – Return top proposal results after NMS in training.
  • test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing.
  • test_post_nms (int, default is 300) – Return top proposal results after NMS in testing.
  • rpn_min_size (int, default is 16) – Proposals whose size is smaller than min_size will be discarded.
  • num_sample (int, default is 128) – Number of samples for RCNN targets.
  • pos_iou_thresh (float, default is 0.5) – Proposal whose IOU larger than pos_iou_thresh is regarded as positive samples.
  • pos_ratio (float, default is 0.25) – pos_ratio defines how many positive samples (pos_ratio * num_sample) is to be sampled.
  • additional_output (boolean, default is False) – additional_output is only used for Mask R-CNN to get internal outputs.
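The anchor-shape formula in the scales parameter above can be checked numerically. Note that with this definition the ratio equals height/width of the resulting anchor, and for ratio = 1 the anchor is square with side base_size * scale:

```python
import math

def anchor_wh(base_size, scale, ratio):
    """Anchor (width, height) per the RPN formula above."""
    w = base_size * scale * math.sqrt(1.0 / ratio)
    h = base_size * scale * math.sqrt(ratio)
    return w, h

# Defaults: base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2).
print(anchor_wh(16, 8, 1))    # (128.0, 128.0) — square, 16 * 8 per side
print(anchor_wh(16, 8, 0.5))  # wide anchor: height / width = 0.5
```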
classes

iterable of str – Names of categories, its length is num_class.

num_class

int – Number of positive categories.

short

int – Input image short side size.

max_size

int – Maximum size of input image long side.

train_patterns

str – Matching pattern for trainable parameters.

nms_thresh

float – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.

nms_topk

int

Apply NMS to top k detection results, use -1 to disable so that every Detection
result is used in NMS.
post_nms

int – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.

target_generator

gluon.Block – Generate training targets with boxes, samples, matches, gt_label and gt_box.

hybrid_forward(F, x, gt_box=None)[source]

Forward Faster-RCNN network.

The behavior during training and inference is different.

Parameters:
  • x (mxnet.nd.NDArray or mxnet.symbol) – The network input tensor.
  • gt_box (type, only required during training) – The ground-truth bbox tensor with shape (1, N, 4).
Returns:

During inference, returns final class id, confidence scores, bounding boxes.

Return type:

(ids, scores, bboxes)

reset_class(classes)[source]

Reset class categories and class predictors.

Parameters:classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
target_generator

Returns stored target generator

Returns:The RCNN target generator
Return type:mxnet.gluon.HybridBlock
class gluoncv.model_zoo.Inception3(classes=1000, **kwargs)[source]

Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.

Parameters:classes (int, default 1000) – Number of classification classes.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.MaskRCNN(features, top_features, classes, mask_channels=256, rcnn_max_dets=1000, **kwargs)[source]

Mask RCNN network.

Parameters:
  • features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
  • top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
  • classes (iterable of str) – Names of categories, its length is num_class.
  • mask_channels (int, default is 256) – Number of channels in mask prediction
hybrid_forward(F, x, gt_box=None)[source]

Forward Mask RCNN network.

The behavior during training and inference is different.

Parameters:
  • x (mxnet.nd.NDArray or mxnet.symbol) – The network input tensor.
  • gt_box (type, only required during training) – The ground-truth bbox tensor with shape (1, N, 4).
Returns:

During inference, returns final class id, confidence scores, bounding boxes, segmentation masks.

Return type:

(ids, scores, bboxes, masks)

class gluoncv.model_zoo.MobileNet(multiplier=1.0, classes=1000, **kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.

Parameters:
  • multiplier (float, default 1.0) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
  • classes (int, default 1000) – Number of classes for the output layer.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.MobileNetV2(multiplier=1.0, classes=1000, **kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” (https://arxiv.org/abs/1801.04381) paper.

Parameters:
  • multiplier (float, default 1.0) – The width multiplier for controlling the model size. The actual number of channels is equal to the original channel size multiplied by this multiplier.
  • classes (int, default 1000) – Number of classes for the output layer.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.PSPNet(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]

Pyramid Scene Parsing Network

Parameters:
  • nclass (int) – Number of categories for the training dataset.
  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • aux (bool) – Auxiliary loss.

Reference:

Zhao, Hengshuang, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. “Pyramid scene parsing network.” CVPR, 2017
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.ResNetV1(block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • block (HybridBlock) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
  • layers (list of int) – Numbers of layers in each block
  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
  • classes (int, default 1000) – Number of classification classes.
  • thumbnail (bool, default False) – Enable thumbnail.
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.ResNetV1b(block, layers, classes=1000, dilated=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs={'use_global_stats': True}, last_gamma=False, deep_stem=False, stem_width=32, avg_down=False, final_drop=0.0, use_global_stats=False, **kwargs)[source]

Pre-trained ResNetV1b model, which produces stride-8 feature maps at conv5.

Parameters:
  • block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
  • layers (list of int) – Numbers of layers in each block
  • classes (int, default 1000) – Number of classification classes.
  • dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • deep_stem (bool, default False) – Whether to replace the 7x7 conv1 with 3 3x3 convolution layers.
  • avg_down (bool, default False) – Whether to use average pooling for projection skip connection between stages/downsample.
  • final_drop (float, default 0.0) – Dropout ratio before the final classification layer.
  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.

Reference:

  • He, Kaiming, et al. “Deep residual learning for image recognition.”

Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

  • Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.ResNetV2(block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, **kwargs)[source]

ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • block (HybridBlock) – Class for the residual block. Options are BasicBlockV2, BottleneckV2.
  • layers (list of int) – Numbers of layers in each block
  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
  • classes (int, default 1000) – Number of classification classes.
  • thumbnail (bool, default False) – Enable thumbnail.
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BasicBlockV1(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 18, 34 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BasicBlockV2(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 18, 34 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BottleneckV1(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 50, 101, 152 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BottleneckV2(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 50, 101, 152 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_ResNetV1(block, layers, channels, classes=1000, thumbnail=False, **kwargs)[source]

SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • block (HybridBlock) – Class for the residual block. Options are SE_BasicBlockV1, SE_BottleneckV1.
  • layers (list of int) – Numbers of layers in each block.
  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
  • classes (int, default 1000) – Number of classification classes.
  • thumbnail (bool, default False) – Enable thumbnail.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_ResNetV2(block, layers, channels, classes=1000, thumbnail=False, **kwargs)[source]

SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • block (HybridBlock) – Class for the residual block. Options are SE_BasicBlockV2, SE_BottleneckV2.
  • layers (list of int) – Numbers of layers in each block.
  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
  • classes (int, default 1000) – Number of classification classes.
  • thumbnail (bool, default False) – Enable thumbnail.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SSD(network, base_size, features, num_filters, sizes, ratios, steps, classes, use_1x1_transition=True, use_bn=True, reduce_ratio=1.0, min_depth=128, global_pool=False, pretrained=False, stds=(0.1, 0.1, 0.2, 0.2), nms_thresh=0.45, nms_topk=400, post_nms=100, anchor_alloc_size=128, ctx=cpu(0), **kwargs)[source]

Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.

Parameters:
  • network (string or None) – Name of the base network. If None, the base network is instantiated directly from features instead of being composed.
  • base_size (int) – Base input size. It is specified so that SSD can support dynamic input shapes.
  • features (list of str or mxnet.gluon.HybridBlock) – Intermediate features to be extracted or a network with multi-output. If network is None, features is expected to be a multi-output network.
  • num_filters (list of int) – Number of channels for the appended layers, ignored if network is None.
  • sizes (iterable of float) – Sizes of anchor boxes, given as a list of floats in incremental order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.
  • ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must equal the number of SSD output layers.
  • steps (list of int) – Step size of anchor boxes in each output layer.
  • classes (iterable of str) – Names of all categories.
  • use_1x1_transition (bool) – Whether to use 1x1 convolution as the transition layer between attached layers; it is effective in reducing model capacity.
  • use_bn (bool) – Whether to use BatchNorm layer after each attached convolutional layer.
  • reduce_ratio (float) – Channel reduce ratio (0, 1) of the transition layer.
  • min_depth (int) – Minimum channels for the transition layers.
  • global_pool (bool) – Whether to attach a global average pooling layer as the last output layer.
  • pretrained (bool) – Whether to load pretrained weights for the base network.
  • stds (tuple of float, default is (0.1, 0.1, 0.2, 0.2)) – Std values to be divided/multiplied to box encoded values.
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The default is based on the COCO dataset, which has at most 100 objects per image. Increase this number if you expect more objects, or use -1 to return all detections.
  • anchor_alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define anchor_alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, arbitrary input images are supported by cropping the corresponding area of the anchor map. This allows us to export the network to symbol format so it can be run from C++, Scala, etc.
  • ctx (mx.Context) – Network context.
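The sizes argument above is split into consecutive pairs, one per output layer. A minimal pure-Python sketch of that conversion (the helper name is illustrative, not part of GluonCV):

```python
def split_anchor_sizes(sizes):
    """Convert a monotonically increasing list of anchor sizes into
    per-stage (size, next_size) pairs, as the SSD docstring describes."""
    assert list(sizes) == sorted(sizes), "sizes must be in incremental order"
    # len(sizes) == num_stages + 1, so each stage gets one consecutive pair
    return [(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1)]

print(split_anchor_sizes([30, 60, 90]))  # [(30, 60), (60, 90)]
```

This is why a model with two output stages needs three entries in sizes.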
hybrid_forward(F, x)[source]

Hybrid forward

num_classes

Return number of foreground classes.

Returns:Number of foreground classes
Return type:int
reset_class(classes)[source]

Reset class categories and class predictors.

Parameters:classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
set_nms(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]

Set non-maximum suppression parameters.

Parameters:
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The default is based on the COCO dataset, which has at most 100 objects per image. Increase this number if you expect more objects, or use -1 to return all detections.
Returns:

Return type:

None
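How the three parameters interact can be sketched with a plain-Python greedy NMS over (score, box) tuples. This is an illustrative re-implementation of the documented semantics, not the operator GluonCV calls internally:

```python
def iou(a, b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(dets, nms_thresh=0.45, nms_topk=400, post_nms=100):
    """dets: list of (score, box). A nms_thresh outside (0, 1) disables NMS;
    nms_topk == -1 keeps every candidate; post_nms == -1 returns all survivors."""
    dets = sorted(dets, key=lambda d: d[0], reverse=True)
    if nms_topk != -1:
        dets = dets[:nms_topk]          # only the top k enter NMS
    if not 0 < nms_thresh < 1:          # mirrors the "< 0 or > 1" escape hatch
        keep = dets
    else:
        keep = []
        for score, box in dets:
            # keep a box only if it overlaps no already-kept box too much
            if all(iou(box, kb) <= nms_thresh for _, kb in keep):
                keep.append((score, box))
    return keep if post_nms == -1 else keep[:post_nms]

dets = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 11, 11)), (0.7, (50, 50, 60, 60))]
print(greedy_nms(dets))  # the 0.8 box is suppressed by the 0.9 box
```

Setting nms_thresh=2.0 in the sketch returns all three boxes, matching the documented way of disabling NMS.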

class gluoncv.model_zoo.SqueezeNet(version, classes=1000, **kwargs)[source]

SqueezeNet model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.

Parameters:
  • version (str) – Version of squeezenet. Options are ‘1.0’, ‘1.1’.
  • classes (int, default 1000) – Number of classification classes.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.VGG(layers, filters, classes=1000, batch_norm=False, **kwargs)[source]

VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • layers (list of int) – Numbers of layers in each feature block.
  • filters (list of int) – Numbers of filters in each feature block. List length should match the layers.
  • classes (int, default 1000) – Number of classification classes.
  • batch_norm (bool, default False) – Use batch normalization.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.VGGAtrousExtractor(layers, filters, extras, batch_norm=False, **kwargs)[source]

VGG Atrous multi-layer feature extractor which produces multiple output feature maps.

Parameters:
  • layers (list of int) – Number of layer for vgg base network.
  • filters (list of int) – Number of convolution filters for each layer.
  • extras (list of list) – Extra layers configurations.
  • batch_norm (bool) – If True, will use BatchNorm layers.
hybrid_forward(F, x, init_scale)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.YOLOV3(stages, channels, anchors, strides, classes, alloc_size=(128, 128), nms_thresh=0.45, nms_topk=400, post_nms=100, pos_iou_thresh=1.0, ignore_iou_thresh=0.7, num_sync_bn_devices=-1, **kwargs)[source]

YOLO V3 detection network. Reference: https://arxiv.org/pdf/1804.02767.pdf.

Parameters:
  • stages (mxnet.gluon.HybridBlock) – Staged feature extraction blocks. For example, 3 stages and 3 YOLO output layers are used in the original paper.
  • channels (iterable) – Number of conv channels for each appended stage. len(channels) should match len(stages).
  • num_class (int) – Number of foreground objects.
  • anchors (iterable) – The anchor setting. len(anchors) should match len(stages).
  • strides (iterable) – Strides of feature map. len(strides) should match len(stages).
  • alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, arbitrary input images are supported by cropping the corresponding area of the anchor map. This allows us to export the network to symbol format so it can be run from C++, Scala, etc.
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The default is based on the COCO dataset, which has at most 100 objects per image. Increase this number if you expect more objects, or use -1 to return all detections.
  • pos_iou_thresh (float, default is 1.0) – IOU threshold for true anchors that match real objects. ‘pos_iou_thresh < 1’ is not implemented.
  • ignore_iou_thresh (float) – Anchors whose IOU falls in the range (ignore_iou_thresh, pos_iou_thresh) are not penalized on objectness score.
  • num_sync_bn_devices (int, default is -1) – Number of devices for training. If num_sync_bn_devices < 2, SyncBatchNorm is disabled.
classes

Return names of (non-background) categories.

Returns:Names of (non-background) categories.
Return type:iterable of str
hybrid_forward(F, x, *args)[source]

YOLOV3 network hybrid forward.

Parameters:
  • F (mxnet.nd or mxnet.sym) – F is mxnet.sym if hybridized or mxnet.nd if not.
  • x (mxnet.nd.NDArray) – Input data.
  • *args (optional, mxnet.nd.NDArray) – During training, extra inputs are required: (gt_boxes, obj_t, centers_t, scales_t, weights_t, clas_t) These are generated by YOLOV3PrefetchTargetGenerator in dataloader transform function.
Returns:

During inference, return detections in shape (B, N, 6) with format (cid, score, xmin, ymin, xmax, ymax). During training, return losses only: (obj_loss, center_loss, scale_loss, cls_loss).

Return type:

(tuple of) mxnet.nd.NDArray
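The (B, N, 6) inference output packs one detection per row, with padding rows carrying a class id of -1. A hedged sketch of post-processing that layout with plain Python lists (with the real network you would call .asnumpy() first; the function name and threshold are illustrative):

```python
def filter_detections(batch, score_thresh=0.5):
    """batch: nested lists shaped (B, N, 6), each row being
    (cid, score, xmin, ymin, xmax, ymax). Rows with cid < 0 are padding."""
    results = []
    for image_dets in batch:
        # drop padding rows and detections below the score threshold
        kept = [row for row in image_dets
                if row[0] >= 0 and row[1] >= score_thresh]
        results.append(kept)
    return results

batch = [[[0, 0.9, 10, 10, 50, 50],
          [2, 0.3, 5, 5, 20, 20],
          [-1, -1.0, 0, 0, 0, 0]]]
print(filter_detections(batch))  # keeps only the high-confidence first row
```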

num_class

Number of (non-background) categories.

Returns:Number of (non-background) categories.
Return type:int
reset_class(classes)[source]

Reset class categories and class predictors.

Parameters:classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
set_nms(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]

Set non-maximum suppression parameters.

Parameters:
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The default is based on the COCO dataset, which has at most 100 objects per image. Increase this number if you expect more objects, or use -1 to return all detections.
Returns:

Return type:

None

gluoncv.model_zoo.alexnet(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

AlexNet model from the “One weird trick…” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet110_v1(**kwargs)[source]

ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet110_v2(**kwargs)[source]

ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet20_v1(**kwargs)[source]

ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet20_v2(**kwargs)[source]

ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet56_v1(**kwargs)[source]

ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet56_v2(**kwargs)[source]

ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_wideresnet16_10(**kwargs)[source]

WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_wideresnet28_10(**kwargs)[source]

WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_wideresnet40_8(**kwargs)[source]

WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.darknet53(**kwargs)[source]

Darknet v3 53-layer network. Reference: https://arxiv.org/pdf/1804.02767.pdf.

Returns:Darknet network.
Return type:mxnet.gluon.HybridBlock
gluoncv.model_zoo.densenet121(**kwargs)[source]

Densenet-BC 121-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.densenet161(**kwargs)[source]

Densenet-BC 161-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.densenet169(**kwargs)[source]

Densenet-BC 169-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.densenet201(**kwargs)[source]

Densenet-BC 201-layer model from the “Densely Connected Convolutional Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.faster_rcnn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_resnet50_v1b_custom(classes, transfer=None, pretrained_base=True, pretrained=False, **kwargs)[source]

Faster RCNN model with resnet50_v1b base network on custom dataset.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from faster RCNN networks trained on other datasets.
  • pretrained_base (boolean) – Whether fetch and load pretrained weights for base network.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Returns:

Hybrid faster RCNN network.

Return type:

mxnet.gluon.HybridBlock

gluoncv.model_zoo.faster_rcnn_resnet50_v1b_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_faster_rcnn_resnet50_v1b_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_cifar_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • version (int) – Version of ResNet. Options are 1, 2.
  • num_layers (int) – Number of layers. Must be an integer of the form 6*n+2, e.g. 20, 56, 110, 164.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
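The num_layers constraint can be validated up front; a small sketch of the arithmetic (the function name is illustrative, not part of GluonCV):

```python
def cifar_resnet_blocks(num_layers):
    """Return n, the per-stage residual block count, for a CIFAR ResNet of
    depth num_layers, which must satisfy num_layers == 6*n + 2."""
    if num_layers < 8 or (num_layers - 2) % 6 != 0:
        raise ValueError("num_layers must be of the form 6*n+2, "
                         "e.g. 20, 56, 110; got %d" % num_layers)
    return (num_layers - 2) // 6

print(cifar_resnet_blocks(20))   # 3 blocks per stage
print(cifar_resnet_blocks(110))  # 18 blocks per stage
```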
gluoncv.model_zoo.get_cifar_wide_resnet(num_layers, width_factor=1, drop_rate=0.0, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • num_layers (int) – Number of layers. Must be an integer of the form 6*n+2, e.g. 20, 56, 110, 164.
  • width_factor (int) – The width factor to apply to the number of channels from the original resnet.
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.get_darknet(darknet_version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get darknet by version and num_layers info.

Parameters:
  • darknet_version (str) – Darknet version, choices are [‘v3’].
  • num_layers (int) – Number of layers.
  • pretrained (boolean) – Whether fetch and load pre-trained weights.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Returns:

Darknet network.

Return type:

mxnet.gluon.HybridBlock

Examples

>>> model = get_darknet('v3', 53, pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

DeepLabV3

Parameters:
  • dataset (str, default pascal_voc) – The dataset that the model is pretrained on. (pascal_voc, ade20k)
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet101_ade(**kwargs)[source]

DeepLabV3

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet101_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet101_coco(**kwargs)[source]

DeepLabV3

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet101_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet101_voc(**kwargs)[source]

DeepLabV3

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet101_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_deeplab_resnet50_ade(**kwargs)[source]

DeepLabV3

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_deeplab_resnet50_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_faster_rcnn(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Utility function to return faster rcnn networks.

Parameters:
  • name (str) – Model name.
  • dataset (str) – The name of dataset.
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

The Faster-RCNN network.

Return type:

mxnet.gluon.HybridBlock

gluoncv.model_zoo.get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), pretrained_base=True, **kwargs)[source]

FCN model from the paper “Fully Convolutional Network for semantic segmentation”

Parameters:
  • dataset (str, default pascal_voc) – The dataset that model pretrained on. (pascal_voc, ade20k)
  • pretrained (bool, default False) – Whether to load the pretrained weights for the model. This will load the pretrained FCN weights.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • pretrained_base (bool, default True) – This will load a pretrained backbone network that was trained on ImageNet.

Examples

>>> model = get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet101_ade(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet101_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet101_coco(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on MS-COCO dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet101_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet101_voc(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet101_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet50_ade(**kwargs)[source]

FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet50_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_resnet50_voc(**kwargs)[source]

FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_resnet50_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_mask_rcnn(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Utility function to return mask rcnn networks.

Parameters:
  • name (str) – Model name.
  • dataset (str) – The name of dataset.
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

The Mask RCNN network.

Return type:

mxnet.gluon.HybridBlock

gluoncv.model_zoo.get_mobilenet(multiplier, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.

Parameters:
  • multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
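The multiplier simply scales every layer's channel count. An illustrative sketch of that arithmetic (the base channel list and plain int truncation here are assumptions for the example, and may differ in detail from the library's internals):

```python
BASE_CHANNELS = [32, 64, 128, 128, 256, 256, 512]  # hypothetical layer widths

def scale_channels(channels, multiplier):
    """Scale each channel count by the width multiplier (must be >= 0.25)."""
    if multiplier < 0.25:
        raise ValueError("multipliers below 0.25 are not supported")
    return [int(c * multiplier) for c in channels]

print(scale_channels(BASE_CHANNELS, 0.5))  # half-width network
```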
gluoncv.model_zoo.get_mobilenet_v2(multiplier, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.

Parameters:
  • multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
gluoncv.model_zoo.get_model(name, **kwargs)[source]

Returns a pre-defined model by name

Parameters:
  • name (str) – Name of the model.
  • pretrained (bool) – Whether to load the pretrained weights for model.
  • classes (int) – Number of classes for the output layer.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Returns:

The model.

Return type:

HybridBlock

gluoncv.model_zoo.get_model_list()[source]

Get the entire list of model names in model_zoo.

Returns:

Entire list of model names in model_zoo.

Return type:

list of str
gluoncv.model_zoo.get_nasnet(repeat=6, penultimate_filters=4032, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters:
  • repeat (int) – Number of cell repeats
  • penultimate_filters (int) – Number of filters in the penultimate layer of the network
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), pretrained_base=True, **kwargs)[source]

Pyramid Scene Parsing Network.

Parameters:
  • dataset (str, default pascal_voc) – The dataset that the model was pretrained on. (pascal_voc, ade20k)
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • pretrained_base (bool, default True) – This will load the pretrained backbone network, which was trained on ImageNet.

Examples

>>> model = get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_ade(**kwargs)[source]

Pyramid Scene Parsing Network.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_coco(**kwargs)[source]

Pyramid Scene Parsing Network.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet101_voc(**kwargs)[source]

Pyramid Scene Parsing Network.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet101_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_psp_resnet50_ade(**kwargs)[source]

Pyramid Scene Parsing Network.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_resnet50_ade(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', use_se=False, **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • version (int) – Version of ResNet. Options are 1, 2.
  • num_layers (int) – Numbers of layers. Options are 18, 34, 50, 101, 152.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
  • use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
gluoncv.model_zoo.get_se_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • version (int) – Version of ResNet. Options are 1, 2.
  • num_layers (int) – Numbers of layers. Options are 18, 34, 50, 101, 152.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.get_ssd(name, base_size, features, filters, sizes, ratios, steps, classes, dataset, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get SSD models.

Parameters:
  • name (str or None) – Model name; if None, you must specify features as a HybridBlock.
  • base_size (int) – Base image size for training; this is fixed once training is assigned. A fixed base size still allows variable input sizes at test time.
  • features (iterable of str or HybridBlock) – List of network internal output names, specifying which layers are used for predicting bbox values. If name is None, features must be a HybridBlock which generates multiple outputs for prediction.
  • filters (iterable of float or None) – List of convolution layer channels to be appended to the base network feature extractor. If name is None, this is ignored.
  • sizes (iterable of float) – Sizes of anchor boxes, as a list of floats in increasing order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.
  • ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must equal the number of SSD output layers.
  • steps (list of int) – Step size of anchor boxes in each output layer.
  • classes (iterable of str) – Names of categories.
  • dataset (str) – Name of dataset. This is used to identify the model, because models trained on different datasets are going to be very different.
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network; the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.get_vgg(num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • num_layers (int) – Number of layers for the variant of VGG. Options are 11, 13, 16, 19.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
gluoncv.model_zoo.get_vgg_atrous_extractor(num_layers, im_size, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get VGG atrous feature extractor networks.

Parameters:
  • num_layers (int) – VGG types, can be 11,13,16,19.
  • im_size (int) – VGG detection input size, can be 300, 512.
  • pretrained (bool) – Load pretrained weights if True.
  • ctx (mx.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

The returned network.

Return type:

mxnet.gluon.HybridBlock

gluoncv.model_zoo.get_yolov3(name, stages, filters, anchors, strides, classes, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get YOLOV3 models.

Parameters:
  • name (str or None) – Model name; if None, you must specify stages as a HybridBlock.
  • stages (iterable of str or HybridBlock) – List of network internal output names, specifying which layers are used as feature-extraction stages for the detection heads. If name is None, stages must be a HybridBlock which generates multiple outputs for prediction.
  • filters (iterable of int) – Number of convolution channels in each YOLO output layer.
  • anchors (iterable of list) – Anchor box sizes for each output layer.
  • strides (list of int) – Feature map stride of each output layer.
  • classes (iterable of str) – Names of categories.
  • dataset (str) – Name of dataset. This is used to identify the model, because models trained on different datasets are going to be very different.
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

A YOLOV3 detection network.

Return type:

HybridBlock

gluoncv.model_zoo.inception_v3(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
gluoncv.model_zoo.mask_rcnn_resnet50_v1b_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network; the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = mask_rcnn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.mobilenet0_25(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.25.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet0_5(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.5.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet0_75(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.75.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet1_0(**kwargs)[source]

MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 1.0.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet_v2_0_25(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” (https://arxiv.org/abs/1801.04381) paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet_v2_0_5(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” (https://arxiv.org/abs/1801.04381) paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet_v2_0_75(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” (https://arxiv.org/abs/1801.04381) paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.mobilenet_v2_1_0(**kwargs)[source]

MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” (https://arxiv.org/abs/1801.04381) paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
gluoncv.model_zoo.nasnet_4_1056(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters:
  • repeat (int) – Number of cell repeats
  • penultimate_filters (int) – Number of filters in the penultimate layer of the network
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.nasnet_5_1538(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters:
  • repeat (int) – Number of cell repeats
  • penultimate_filters (int) – Number of filters in the penultimate layer of the network
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.nasnet_6_4032(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters:
  • repeat (int) – Number of cell repeats
  • penultimate_filters (int) – Number of filters in the penultimate layer of the network
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.nasnet_7_1920(**kwargs)[source]

NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper

Parameters:
  • repeat (int) – Number of cell repeats
  • penultimate_filters (int) – Number of filters in the penultimate layer of the network
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet101_v1(**kwargs)[source]

ResNet-101 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet101_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-101 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
gluoncv.model_zoo.resnet101_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1c-101 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet101_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1d-101 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet101_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1e-101 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet101_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1s-101 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet101_v2(**kwargs)[source]

ResNet-101 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet152_v1(**kwargs)[source]

ResNet-152 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet152_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-152 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
gluoncv.model_zoo.resnet152_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1c-152 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet152_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1d-152 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet152_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1e-152 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet152_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1s-152 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet152_v2(**kwargs)[source]

ResNet-152 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet18_v1(**kwargs)[source]

ResNet-18 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet18_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-18 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
gluoncv.model_zoo.resnet18_v2(**kwargs)[source]

ResNet-18 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet34_v1(**kwargs)[source]

ResNet-34 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet34_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-34 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
gluoncv.model_zoo.resnet34_v2(**kwargs)[source]

ResNet-34 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet50_v1(**kwargs)[source]

ResNet-50 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.resnet50_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-50 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
  • use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
gluoncv.model_zoo.resnet50_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1c-50 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet50_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1d-50 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet50_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1e-50 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm).
gluoncv.model_zoo.resnet50_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1s-50 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride-8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm).
gluoncv.model_zoo.resnet50_v2(**kwargs)[source]

ResNet-50 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet101_v1(**kwargs)[source]

SE-ResNet-101 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet101_v2(**kwargs)[source]

SE-ResNet-101 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet152_v1(**kwargs)[source]

SE-ResNet-152 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet152_v2(**kwargs)[source]

SE-ResNet-152 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet18_v1(**kwargs)[source]

SE-ResNet-18 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet18_v2(**kwargs)[source]

SE-ResNet-18 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet34_v1(**kwargs)[source]

SE-ResNet-34 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet34_v2(**kwargs)[source]

SE-ResNet-34 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet50_v1(**kwargs)[source]

SE-ResNet-50 V1 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet50_v2(**kwargs)[source]

SE-ResNet-50 V2 model from “Squeeze-and-Excitation Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.squeezenet1_0(**kwargs)[source]

SqueezeNet 1.0 model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.squeezenet1_1(**kwargs)[source]

SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.ssd_300_vgg16_atrous_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock
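The SSD300 design predicts default boxes from six feature maps of decreasing resolution. The figures below are from the original SSD paper (GluonCV's VGG16-atrous 300x300 models follow the same multi-scale design); the arithmetic shows where the well-known total of 8732 default boxes per image comes from:

```python
# Anchor-box bookkeeping for SSD300 (figures from the original SSD paper).
feature_maps = [38, 19, 10, 5, 3, 1]    # spatial size of each prediction layer
anchors_per_cell = [4, 6, 6, 6, 4, 4]   # default boxes per feature-map cell

total_boxes = sum(f * f * a for f, a in zip(feature_maps, anchors_per_cell))
print(total_boxes)  # 8732 default boxes per image
```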

gluoncv.model_zoo.ssd_300_vgg16_atrous_custom(classes, pretrained_base=True, transfer=None, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for custom datasets.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
Returns:

An SSD detection network.

Return type:

HybridBlock

Example

>>> net = ssd_300_vgg16_atrous_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_300_vgg16_atrous_custom(classes=['foo', 'bar'], transfer='coco')
gluoncv.model_zoo.ssd_300_vgg16_atrous_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_mobilenet1_0_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet1.0 base network for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_mobilenet1_0_custom(classes, pretrained_base=True, transfer=None, **kwargs)[source]

SSD architecture with mobilenet1.0 512 base network for custom dataset.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
Returns:

An SSD detection network.

Return type:

HybridBlock

Example

>>> net = ssd_512_mobilenet1_0_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_mobilenet1_0_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_512_mobilenet1_0_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet1.0 base network for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet101_v2_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v2 101 layers for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet152_v2_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v2 152 layers for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet18_v1_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 18 layers for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet18_v1_custom(classes, pretrained_base=True, transfer=None, **kwargs)[source]

SSD architecture with ResNet18 v1 512 base network for custom datasets.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
Returns:

An SSD detection network.

Return type:

HybridBlock

Example

>>> net = ssd_512_resnet18_v1_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_resnet18_v1_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_512_resnet18_v1_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 18 layers for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet50_v1_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 50 layers for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet50_v1_custom(classes, pretrained_base=True, transfer=None, **kwargs)[source]

SSD architecture with ResNet50 v1 512 base network for custom dataset.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
Returns:

An SSD detection network.

Return type:

HybridBlock

Example

>>> net = ssd_512_resnet50_v1_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_resnet50_v1_custom(classes=['foo', 'bar'], transfer='voc')
gluoncv.model_zoo.ssd_512_resnet50_v1_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 50 layers for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_vgg16_atrous_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous layers for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_vgg16_atrous_custom(classes, pretrained_base=True, transfer=None, **kwargs)[source]

SSD architecture with VGG16 atrous 512x512 base network for custom datasets.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
Returns:

An SSD detection network.

Return type:

HybridBlock

Example

>>> net = ssd_512_vgg16_atrous_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_vgg16_atrous_custom(classes=['foo', 'bar'], transfer='coco')
gluoncv.model_zoo.ssd_512_vgg16_atrous_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 512x512 base network for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.vgg11(**kwargs)[source]

VGG-11 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg11_bn(**kwargs)[source]

VGG-11 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg13(**kwargs)[source]

VGG-13 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg13_bn(**kwargs)[source]

VGG-13 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg16(**kwargs)[source]

VGG-16 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg16_atrous_300(**kwargs)[source]

Get the VGG-16 atrous feature extractor network with 300x300 input size.

gluoncv.model_zoo.vgg16_atrous_512(**kwargs)[source]

Get the VGG-16 atrous feature extractor network with 512x512 input size.

gluoncv.model_zoo.vgg16_bn(**kwargs)[source]

VGG-16 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg19(**kwargs)[source]

VGG-19 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.vgg19_bn(**kwargs)[source]

VGG-19 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
gluoncv.model_zoo.yolo3_darknet53_coco(pretrained_base=True, pretrained=False, num_sync_bn_devices=-1, **kwargs)[source]

YOLO3 multi-scale with darknet53 base network on the COCO dataset.

Parameters:
  • pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
  • pretrained (boolean) – Whether to fetch and load pretrained weights for the entire network.
  • num_sync_bn_devices (int, default is -1) – Number of devices for training. If num_sync_bn_devices < 2, SyncBatchNorm is disabled.
Returns:

Fully hybrid yolo3 network.

Return type:

mxnet.gluon.HybridBlock
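YOLOv3 predicts at three scales. For a 416x416 input the prediction feature maps have strides 32, 16 and 8, and each cell predicts 3 anchor boxes (figures from the YOLOv3 paper; GluonCV's darknet53 models follow the same design). A small sketch of the resulting prediction count:

```python
# Prediction bookkeeping for YOLOv3 at 416x416 input (figures from the paper).
input_size = 416
strides = [32, 16, 8]     # the three detection scales
anchors_per_cell = 3      # anchor boxes predicted per feature-map cell

cells = [(input_size // s) ** 2 for s in strides]  # [169, 676, 2704]
total = sum(cells) * anchors_per_cell
print(total)  # 10647 box predictions per image
```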

gluoncv.model_zoo.yolo3_darknet53_custom(classes, transfer=None, pretrained_base=True, pretrained=False, num_sync_bn_devices=-1, **kwargs)[source]

YOLO3 multi-scale with darknet53 base network on a custom dataset.

Parameters:
  • classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
  • transfer (str or None) – If not None, will try to reuse pre-trained weights from YOLO3 networks trained on other datasets.
  • pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
  • num_sync_bn_devices (int, default is -1) – Number of devices for training. If num_sync_bn_devices < 2, SyncBatchNorm is disabled.
Returns:

Fully hybrid yolo3 network.

Return type:

mxnet.gluon.HybridBlock

gluoncv.model_zoo.yolo3_darknet53_voc(pretrained_base=True, pretrained=False, num_sync_bn_devices=-1, **kwargs)[source]

YOLO3 multi-scale with darknet53 base network on the Pascal VOC dataset.

Parameters:
  • pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
  • pretrained (boolean) – Whether to fetch and load pretrained weights for the entire network.
  • num_sync_bn_devices (int) – Number of devices for training. If num_sync_bn_devices < 2, SyncBatchNorm is disabled.
Returns:

Fully hybrid yolo3 network.

Return type:

mxnet.gluon.HybridBlock