gluoncv.model_zoo

Gluon Vision Model Zoo

gluoncv.model_zoo.get_model

Returns a pre-defined GluonCV model by name.

Hint

This is the recommended method for getting a pre-defined model.

It supports loading models from the Gluon Model Zoo directly as well.

get_model Returns a pre-defined model by name
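A rough sketch of the name-based dispatch behind get_model (a hypothetical two-entry registry and factory signature, not GluonCV's actual table): the lowercased name is looked up in a dict of model factories, and unknown names raise ValueError.

```python
# Hypothetical registry for illustration; the real table holds every model below.
_models = {
    'cifar_resnet20_v1': lambda **kwargs: ('cifar_resnet20_v1', kwargs),
    'cifar_resnet56_v1': lambda **kwargs: ('cifar_resnet56_v1', kwargs),
}

def get_model(name, **kwargs):
    # Names are case-insensitive; keyword arguments pass through to the factory.
    name = name.lower()
    if name not in _models:
        raise ValueError('Model %s is not supported.' % name)
    return _models[name](**kwargs)
```
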

Image Classification

CIFAR

get_cifar_resnet ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet20_v1 ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet56_v1 ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet110_v1 ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
cifar_resnet20_v2 ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
cifar_resnet56_v2 ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
cifar_resnet110_v2 ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
get_cifar_wide_resnet WideResNet model for CIFAR10 from “Wide Residual Networks” paper.
cifar_wideresnet16_10 WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.
cifar_wideresnet28_10 WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.
cifar_wideresnet40_8 WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.

ImageNet

We apply the dilation strategy to pre-trained ResNet models (with a stride of 8). Please see gluoncv.model_zoo.SegBaseModel for how to use it.

ResNetV1b Pre-trained ResNetV1b model, which produces stride-8 feature maps at conv5.
resnet18_v1b Constructs a ResNetV1b-18 model.
resnet34_v1b Constructs a ResNetV1b-34 model.
resnet50_v1b Constructs a ResNetV1b-50 model.
resnet101_v1b Constructs a ResNetV1b-101 model.
resnet152_v1b Constructs a ResNetV1b-152 model.
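The stride arithmetic behind the dilation note above can be sketched in plain Python (illustrative stage strides, not GluonCV's exact layer list): the overall feature-map stride is the product of the per-stage strides, and replacing the stride-2 convolutions in the last two stages with stride-1 dilated convolutions drops it from 32 to 8.

```python
def output_stride(stage_strides):
    # The overall feature-map stride is the product of per-stage strides.
    stride = 1
    for s in stage_strides:
        stride *= s
    return stride

# Standard ResNet: conv1 (2), maxpool (2), stages C2-C5 with strides (1, 2, 2, 2).
standard = [2, 2, 1, 2, 2, 2]
# Dilated variant: the last two stages keep stride 1 (using dilation 2 and 4
# to preserve the receptive field), yielding a stride-8 model.
dilated = [2, 2, 1, 2, 1, 1]
```
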

Object Detection

SSD

SSD Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.
get_ssd Get SSD models.
ssd_300_vgg16_atrous_voc SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.
ssd_512_vgg16_atrous_voc SSD architecture with VGG16 atrous 512x512 base network.
ssd_512_resnet50_v1_voc SSD architecture with ResNet v1 50 layers.
ssd_512_resnet101_v2_voc SSD architecture with ResNet v2 101 layers.
ssd_512_resnet152_v2_voc SSD architecture with ResNet v2 152 layers.
VGGAtrousExtractor VGG Atrous multi layer feature extractor which produces multiple output feature maps.
get_vgg_atrous_extractor Get VGG atrous feature extractor networks.
vgg16_atrous_300 Get VGG atrous 16 layer 300 in_size feature extractor networks.
vgg16_atrous_512 Get VGG atrous 16 layer 512 in_size feature extractor networks.

Faster RCNN

FasterRCNN Faster RCNN network.
get_faster_rcnn Utility function to return faster rcnn networks.
faster_rcnn_resnet50_v2a_voc Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks”
faster_rcnn_resnet50_v2a_coco Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks”

Semantic Segmentation

FCN

FCN Fully Convolutional Networks for Semantic Segmentation
get_fcn FCN model from the paper “Fully Convolutional Network for semantic segmentation”
get_fcn_voc_resnet50 FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”
get_fcn_voc_resnet101 FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”
get_fcn_ade_resnet50 FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”

PSPNet

PSPNet Pyramid Scene Parsing Network
get_psp Pyramid Scene Parsing Network :param dataset: The dataset that model pretrained on.
get_psp_ade_resnet50 Pyramid Scene Parsing Network :param pretrained: Whether to load the pretrained weights for model.

API Reference

Gluon Vision Model Zoo

class gluoncv.model_zoo.BasicBlockV1b(inplanes, planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, **kwargs)[source]

ResNetV1b BasicBlockV1b

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.BottleneckV1b(inplanes, planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, last_gamma=False, **kwargs)[source]

ResNetV1b BottleneckV1b

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.FCN(nclass, backbone='resnet50', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, aux=True, ctx=cpu(0), **kwargs)[source]

Fully Convolutional Networks for Semantic Segmentation

Parameters:
  • nclass (int) – Number of categories for the training dataset.
  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm;

Reference:

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” CVPR, 2015

Examples

>>> model = FCN(nclass=21, backbone='resnet50')
>>> print(model)
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.FasterRCNN(features, top_features, scales, ratios, classes, roi_mode, roi_size, stride=16, rpn_channel=1024, rpn_train_pre_nms=12000, rpn_train_post_nms=2000, rpn_test_pre_nms=6000, rpn_test_post_nms=300, num_sample=128, pos_iou_thresh=0.5, neg_iou_thresh_high=0.5, neg_iou_thresh_low=0.0, pos_ratio=0.25, **kwargs)[source]

Faster RCNN network.

Parameters:
  • features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
  • top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
  • train_patterns (str) – Matching pattern for trainable parameters.
  • scales (iterable of float) –

    The areas of anchor boxes. We use the following form to compute the shapes of anchors:

    \[width_{anchor} = size_{base} \times scale \times \sqrt{1 / ratio}\]
    \[height_{anchor} = size_{base} \times scale \times \sqrt{ratio}\]
  • ratios (iterable of float) – The aspect ratios of anchor boxes. We expect it to be a list or tuple.
  • classes (iterable of str) – Names of categories, its length is num_class.
  • roi_mode (str) – ROI pooling mode. Currently supports ‘pool’ and ‘align’.
  • roi_size (tuple of int, length 2) – (height, width) of the ROI region.
  • stride (int, default is 16) – Feature map stride with respect to original image. This is usually the ratio between original image size and feature map size.
  • rpn_channel (int, default is 1024) – Channel number used in RPN convolutional layers.
  • rpn_train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training of RPN.
  • rpn_train_post_nms (int, default is 2000) – Return top proposal results after NMS in training of RPN.
  • rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
  • rpn_test_post_nms (int, default is 300) – Return top proposal results after NMS in testing of RPN.
  • nms_thresh (float, default is 0.3) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.
  • num_sample (int, default is 128) – Number of samples for RCNN targets.
  • pos_iou_thresh (float, default is 0.5) – Proposal whose IOU larger than pos_iou_thresh is regarded as positive samples.
  • neg_iou_thresh_high (float, default is 0.5) – Proposal whose IOU smaller than neg_iou_thresh_high and larger than neg_iou_thresh_low is regarded as negative samples. Proposals with IOU in between pos_iou_thresh and neg_iou_thresh are ignored.
  • neg_iou_thresh_low (float, default is 0.0) – See neg_iou_thresh_high.
  • pos_ratio (float, default is 0.25) – pos_ratio defines how many positive samples (pos_ratio * num_sample) is to be sampled.
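The anchor equations above can be checked with a small pure-Python helper (illustrative only; GluonCV generates the anchors inside the network). Note that at a given scale the anchor area is the same for every ratio, since the square-root factors cancel.

```python
import math

def anchor_shape(size_base, scale, ratio):
    # width  = size_base * scale * sqrt(1 / ratio)
    # height = size_base * scale * sqrt(ratio)
    width = size_base * scale * math.sqrt(1.0 / ratio)
    height = size_base * scale * math.sqrt(ratio)
    return width, height
```

For example, size_base=16, scale=1.0, ratio=4.0 gives an 8 x 32 anchor with the same 256-pixel area as the square 16 x 16 anchor at ratio 1.0.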
hybrid_forward(F, x, gt_box=None)[source]

Forward Faster-RCNN network.

The behavior during training and inference is different.

Parameters:
  • x (mxnet.nd.NDArray or mxnet.symbol) – The network input tensor.
  • gt_box (mxnet.nd.NDArray, only required during training) – The ground-truth bbox tensor with shape (1, N, 4).
Returns:

During inference, returns final class id, confidence scores, bounding boxes.

Return type:

(ids, scores, bboxes)

target_generator

Returns stored target generator

Returns:The RCNN target generator
Return type:mxnet.gluon.HybridBlock
class gluoncv.model_zoo.HybridBlock(prefix=None, params=None)[source]

HybridBlock supports forwarding with both Symbol and NDArray.

HybridBlock is similar to Block, with a few differences:

import mxnet as mx
from mxnet.gluon import HybridBlock, nn

class Model(HybridBlock):
    def __init__(self, **kwargs):
        super(Model, self).__init__(**kwargs)
        # use name_scope to give child Blocks appropriate names.
        with self.name_scope():
            self.dense0 = nn.Dense(20)
            self.dense1 = nn.Dense(20)

    def hybrid_forward(self, F, x):
        x = F.relu(self.dense0(x))
        return F.relu(self.dense1(x))

model = Model()
model.initialize(ctx=mx.cpu(0))
model.hybridize()
model(mx.nd.zeros((10, 10), ctx=mx.cpu(0)))

Forward computation in HybridBlock must be static to work with Symbols, i.e. you cannot call NDArray.asnumpy(), NDArray.shape, NDArray.dtype, NDArray indexing (x[i]), etc. on tensors. Also, you cannot use branching or loop logic that depends on non-constant expressions such as random numbers or intermediate results, since they would change the graph structure on each iteration.

Before activating with hybridize(), HybridBlock works just like normal Block. After activation, HybridBlock will create a symbolic graph representing the forward computation and cache it. On subsequent forwards, the cached graph will be used instead of hybrid_forward().

Please see references for detailed tutorial.

References

Hybrid - Faster training and easy deployment

cast(dtype)[source]

Cast this Block to use another data type.

Parameters:dtype (str or numpy.dtype) – The new data type.
export(path, epoch=0)[source]

Export HybridBlock to json format that can be loaded by SymbolBlock.imports, mxnet.mod.Module or the C++ interface.

Note

When there is only one input, it will be named data. When there are more than one input, they will be named data0, data1, etc.

Parameters:
  • path (str) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4 digits epoch number.
  • epoch (int) – Epoch number of saved model.
forward(x, *args)[source]

Defines the forward computation. Arguments can be either NDArray or Symbol.

hybrid_forward(F, x, *args, **kwargs)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
hybridize(active=True, **kwargs)[source]

Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.

Parameters:
  • active (bool, default True) – Whether to turn hybrid on or off.
  • static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
  • static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
infer_shape(*args)[source]

Infers shape of Parameters from inputs.

infer_type(*args)[source]

Infers data type of Parameters from inputs.

register_child(block, name=None)[source]

Registers block as a child of self. Blocks assigned to self as attributes will be registered automatically.

class gluoncv.model_zoo.PSPNet(nclass, backbone='resnet50', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, aux=True, ctx=cpu(0), **kwargs)[source]

Pyramid Scene Parsing Network

Parameters:
  • nclass (int) – Number of categories for the training dataset.
  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
  • aux (bool) – Auxiliary loss.

Reference:

Zhao, Hengshuang, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. “Pyramid scene parsing network.” CVPR, 2017
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.ResNetV1b(block, layers, classes=1000, dilated=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, last_gamma=False, **kwargs)[source]

Pre-trained ResNetV1b model, which produces stride-8 feature maps at conv5.

Parameters:
  • block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
  • layers (list of int) – Numbers of layers in each block
  • classes (int, default 1000) – Number of classification classes.
  • dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).

Reference:

  • He, Kaiming, et al. “Deep residual learning for image recognition.”

Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

  • Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BasicBlockV1(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 18, 34 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
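The “SE” prefix refers to the squeeze-and-excitation channel gating added to the residual block: a global average pool (squeeze), a two-layer bottleneck with ReLU, and a sigmoid that rescales each channel (excite). A toy sketch with plain Python lists and hypothetical weights (one value per channel; the real block operates on full feature maps):

```python
import math

def se_gate(channel_means, w_reduce, w_expand):
    # Squeeze: channel_means are the globally average-pooled activations.
    # Excite: FC -> ReLU -> FC -> sigmoid yields one gate per channel.
    hidden = [max(0.0, sum(c * w for c, w in zip(channel_means, row)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(h * w for h, w in zip(hidden, row))))
             for row in w_expand]
    # Scale: each channel is multiplied by its gate in (0, 1).
    return [c * g for c, g in zip(channel_means, gates)]
```
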
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BasicBlockV2(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 18, 34 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BottleneckV1(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 50, 101, 152 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_BottleneckV2(channels, stride, downsample=False, in_channels=0, **kwargs)[source]

Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 50, 101, 152 layers.

Parameters:
  • channels (int) – Number of output channels.
  • stride (int) – Stride size.
  • downsample (bool, default False) – Whether to downsample the input.
  • in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_ResNetV1(block, layers, channels, classes=1000, thumbnail=False, **kwargs)[source]

SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • block (HybridBlock) – Class for the residual block. Options are SE_BasicBlockV1, SE_BottleneckV1.
  • layers (list of int) – Numbers of layers in each block
  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
  • classes (int, default 1000) – Number of classification classes.
  • thumbnail (bool, default False) – Enable thumbnail.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SE_ResNetV2(block, layers, channels, classes=1000, thumbnail=False, **kwargs)[source]

SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • block (HybridBlock) – Class for the residual block. Options are SE_BasicBlockV1, SE_BottleneckV1.
  • layers (list of int) – Numbers of layers in each block
  • channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
  • classes (int, default 1000) – Number of classification classes.
  • thumbnail (bool, default False) – Enable thumbnail.
hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
class gluoncv.model_zoo.SSD(network, base_size, features, num_filters, sizes, ratios, steps, classes, use_1x1_transition=True, use_bn=True, reduce_ratio=1.0, min_depth=128, global_pool=False, pretrained=False, stds=(0.1, 0.1, 0.2, 0.2), nms_thresh=0.45, nms_topk=400, post_nms=100, anchor_alloc_size=128, ctx=cpu(0), **kwargs)[source]

Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.

Parameters:
  • network (string or None) – Name of the base network. If None, the base network will be instantiated from features directly instead of composed.
  • base_size (int) – Base input size; it is specified so SSD can support dynamic input shapes.
  • features (list of str or mxnet.gluon.HybridBlock) – Intermediate features to be extracted or a network with multi-output. If network is None, features is expected to be a multi-output network.
  • num_filters (list of int) – Number of channels for the appended layers, ignored if network is None.
  • sizes (iterable of float) – Sizes of anchor boxes; this should be a list of floats in increasing order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.
  • ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must equal the number of SSD output layers.
  • steps (list of int) – Step size of anchor boxes in each output layer.
  • classes (iterable of str) – Names of all categories.
  • use_1x1_transition (bool) – Whether to use 1x1 convolution as the transition layer between attached layers; it is effective in reducing model capacity.
  • use_bn (bool) – Whether to use BatchNorm layer after each attached convolutional layer.
  • reduce_ratio (float) – Channel reduction ratio (0, 1) of the transition layer.
  • min_depth (int) – Minimum channels for the transition layers.
  • global_pool (bool) – Whether to attach a global average pooling layer as the last output layer.
  • pretrained (bool) – Description of parameter pretrained.
  • stds (tuple of float, default is (0.1, 0.1, 0.2, 0.2)) – Std values by which encoded box values are divided/multiplied.
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.
  • anchor_alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define anchor_alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, we support arbitrary input images by cropping the corresponding area of the anchor map. This allows us to export to symbol so we can run it in C++, Scala, etc.
  • ctx (mx.Context) – Network context.
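The sizes convention above (len(layers) + 1 values consumed as consecutive pairs, one per output layer) can be sketched as:

```python
def size_pairs(sizes):
    # Consecutive (min, max) anchor-size pairs, one per SSD output layer.
    # e.g. [30, 60, 90] -> [(30, 60), (60, 90)] for a two-stage model.
    assert len(sizes) >= 2, 'need len(layers) + 1 size values'
    return list(zip(sizes[:-1], sizes[1:]))
```
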
hybrid_forward(F, x)[source]

Hybrid forward

set_nms(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]

Set non-maximum suppression parameters.

Parameters:
  • nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
  • nms_topk (int, default is 400) – Apply NMS to the top k detection results; use -1 to disable so that every detection result is used in NMS.
  • post_nms (int, default is 100) – Only return the top post_nms detection results; the rest are discarded. The number is based on the COCO dataset, which has at most 100 objects per image. You can adjust this number if expecting more objects, or use -1 to return all detections.
Returns:

Return type:

None
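A minimal pure-Python sketch of how these three knobs interact (greedy NMS over (score, box) pairs; illustrative only, not GluonCV's fused operator):

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(detections, nms_thresh=0.45, nms_topk=400, post_nms=100):
    # detections: list of (score, box) pairs.
    dets = sorted(detections, key=lambda d: d[0], reverse=True)
    if nms_topk > 0:
        dets = dets[:nms_topk]              # only the top-k enter NMS
    if 0.0 <= nms_thresh <= 1.0:            # < 0 or > 1 disables NMS
        keep = []
        for score, box in dets:
            if all(iou(box, kept) < nms_thresh for _, kept in keep):
                keep.append((score, box))
    else:
        keep = dets
    if post_nms > 0:
        keep = keep[:post_nms]              # cap the final result count
    return keep
```
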

class gluoncv.model_zoo.SegBaseModel(nclass, aux, backbone='resnet50', height=480, width=480, **kwargs)[source]

Base Model for Semantic Segmentation

Parameters:
  • backbone (string) – Pre-trained dilated backbone network type (default:’resnet50’; ‘resnet50’, ‘resnet101’ or ‘resnet152’).
  • norm_layer (Block) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
base_forward(x)[source]

Forward through the pre-trained base network.

evaluate(x, target=None)[source]

Evaluate the network with inputs and targets.

class gluoncv.model_zoo.VGGAtrousExtractor(layers, filters, extras, batch_norm=False, **kwargs)[source]

VGG Atrous multi layer feature extractor which produces multiple output feature maps.

Parameters:
  • layers (list of int) – Number of layer for vgg base network.
  • filters (list of int) – Number of convolution filters for each layer.
  • extras (list of list) – Extra layers configurations.
  • batch_norm (bool) – If True, will use BatchNorm layers.
hybrid_forward(F, x, init_scale)[source]

Overrides to construct symbolic graph for this Block.

Parameters:
  • x (Symbol or NDArray) – The first input tensor.
  • *args (list of Symbol or list of NDArray) – Additional input tensors.
gluoncv.model_zoo.cifar_resnet110_v1(**kwargs)[source]

ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet110_v2(**kwargs)[source]

ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet20_v1(**kwargs)[source]

ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet20_v2(**kwargs)[source]

ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet56_v1(**kwargs)[source]

ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_resnet56_v2(**kwargs)[source]

ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_wideresnet16_10(**kwargs)[source]

WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_wideresnet28_10(**kwargs)[source]

WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cifar_wideresnet40_8(**kwargs)[source]

WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.cpu(device_id=0)[source]

Returns a CPU context.

This function is a shortcut for Context('cpu', device_id). For most operations, when no context is specified, the default context is cpu().

Examples

>>> with mx.cpu():
...     cpu_array = mx.nd.ones((2, 3))
>>> cpu_array.context
cpu(0)
>>> cpu_array = mx.nd.ones((2, 3), ctx=mx.cpu())
>>> cpu_array.context
cpu(0)
Parameters:device_id (int, optional) – The device id of the device. device_id is not needed for CPU. This is included to make interface compatible with GPU.
Returns:context – The corresponding CPU context.
Return type:Context
gluoncv.model_zoo.faster_rcnn_resnet50_v2a_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks”

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = faster_rcnn_resnet50_v2a_coco(pretrained=True)
>>> print(model)
gluoncv.model_zoo.faster_rcnn_resnet50_v2a_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks”

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = faster_rcnn_resnet50_v2a_voc(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_cifar_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • version (int) – Version of ResNet. Options are 1, 2.
  • num_layers (int) – Numbers of layers. Needs to be an integer in the form of 6*n+2, e.g. 20, 56, 110, 164.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
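The 6*n+2 constraint can be checked with a one-liner: three stages of n basic blocks with two convolutions each give 6n layers, plus the stem convolution and the final classifier.

```python
def valid_cifar_resnet_depth(num_layers):
    # Depth must have the form 6*n + 2 with n >= 1 (e.g. 20, 56, 110, 164).
    return num_layers > 2 and (num_layers - 2) % 6 == 0
```
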
gluoncv.model_zoo.get_cifar_wide_resnet(num_layers, width_factor=1, drop_rate=0.0, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

WideResNet model for CIFAR10 from “Wide Residual Networks” paper.

Parameters:
  • num_layers (int) – Numbers of layers. Needs to be an integer in the form of 6*n+4, e.g. 16, 28, 40.
  • width_factor (int) – The width factor to apply to the number of channels from the original resnet.
  • drop_rate (float) – The rate of dropout.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.get_faster_rcnn(name, features, top_features, scales, ratios, classes, roi_mode, roi_size, dataset, stride=16, rpn_channel=1024, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Utility function to return faster rcnn networks.

Parameters:
  • name (str) – Model name.
  • features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
  • top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
  • scales (iterable of float) –

    The areas of anchor boxes. We use the following form to compute the shapes of anchors:

    \[width_{anchor} = size_{base} \times scale \times \sqrt{1 / ratio}\]
    \[height_{anchor} = size_{base} \times scale \times \sqrt{ratio}\]
  • ratios (iterable of float) – The aspect ratios of anchor boxes. We expect it to be a list or tuple.
  • classes (iterable of str) – Names of categories; its length is the number of classes.
  • roi_mode (str) – ROI pooling mode. Currently supports ‘pool’ and ‘align’.
  • roi_size (tuple of int, length 2) – (height, width) of the ROI region.
  • dataset (str) – The name of dataset.
  • stride (int, default is 16) – Feature map stride with respect to original image. This is usually the ratio between original image size and feature map size.
  • rpn_channel (int, default is 1024) – Channel number used in RPN convolutional layers.
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

The Faster-RCNN network.

Return type:

mxnet.gluon.HybridBlock
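The anchor formula above can be checked numerically. A minimal sketch (plain Python, not GluonCV internals): for any aspect ratio the height/width ratio equals ratio, while the anchor area stays fixed at (size_base × scale)²:

```python
import math

# Anchor shape from the formula above (illustrative only, not GluonCV code):
#   width  = size_base * scale * sqrt(1 / ratio)
#   height = size_base * scale * sqrt(ratio)
def anchor_shape(size_base, scale, ratio):
    width = size_base * scale * math.sqrt(1.0 / ratio)
    height = size_base * scale * math.sqrt(ratio)
    return width, height

# ratio controls elongation; the area is invariant at (size_base * scale)^2.
w, h = anchor_shape(16, 8.0, 2.0)
print(w, h)
```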

gluoncv.model_zoo.get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

FCN model from the paper “Fully Convolutional Networks for Semantic Segmentation”

Parameters:
  • dataset (str, default pascal_voc) – The dataset the model was pretrained on. (pascal_voc, ade20k)
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_fcn_ade_resnet50(**kwargs)[source]

FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_ade_resnet50(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_voc_resnet101(**kwargs)[source]

FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_voc_resnet101(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_fcn_voc_resnet50(**kwargs)[source]

FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Networks for Semantic Segmentation”

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_fcn_voc_resnet50(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_model(name, **kwargs)[source]

Returns a pre-defined model by name

Parameters:
  • name (str) – Name of the model.
  • pretrained (bool) – Whether to load the pretrained weights for model.
  • classes (int) – Number of classes for the output layer.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Returns:

The model.

Return type:

HybridBlock

gluoncv.model_zoo.get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Pyramid Scene Parsing Network

Parameters:
  • dataset (str, default pascal_voc) – The dataset the model was pretrained on. (pascal_voc, ade20k)
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
gluoncv.model_zoo.get_psp_ade_resnet50(**kwargs)[source]

Pyramid Scene Parsing Network

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.

Examples

>>> model = get_psp_ade_resnet50(pretrained=True)
>>> print(model)
gluoncv.model_zoo.get_se_resnet(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper, or SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper, depending on version.

Parameters:
  • version (int) – Version of ResNet. Options are 1, 2.
  • num_layers (int) – Number of layers. Options are 18, 34, 50, 101, 152.
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.get_ssd(name, base_size, features, filters, sizes, ratios, steps, classes, dataset, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get SSD models.

Parameters:
  • name (str or None) – Model name, if None is used, you must specify features to be a HybridBlock.
  • base_size (int) – Base image size for training, this is fixed once training is assigned. A fixed base size still allows you to have variable input size during test.
  • features (iterable of str or HybridBlock) – List of network internal output names, in order to specify which layers are used for predicting bbox values. If name is None, features must be a HybridBlock which generates multiple outputs for prediction.
  • filters (iterable of float or None) – List of convolution layer channels to be appended to the base network feature extractor. If name is None, this is ignored.
  • sizes (iterable of float) – Sizes of anchor boxes. This should be a list of floats in increasing order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.
  • ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must be equal to the number of SSD output layers.
  • steps (list of int) – Step size of anchor boxes in each output layer.
  • classes (iterable of str) – Names of categories.
  • dataset (str) – Name of dataset. This is used to identify model name because models trained on different datasets are going to be very different.
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
  • ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

An SSD detection network.

Return type:

HybridBlock
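The sizes convention for get_ssd can be illustrated in isolation (plain Python, not GluonCV internals): a list of N+1 sizes yields N consecutive (min, max) pairs, one per output layer:

```python
# Illustrative only: turn the flat sizes list accepted by get_ssd into the
# per-layer (min, max) pairs described above.
def size_pairs(sizes):
    assert sorted(sizes) == list(sizes), "sizes must be in increasing order"
    return list(zip(sizes[:-1], sizes[1:]))

print(size_pairs([30, 60, 90]))  # two output layers: (30, 60) and (60, 90)
```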

gluoncv.model_zoo.get_vgg_atrous_extractor(num_layers, im_size, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]

Get VGG atrous feature extractor networks.

Parameters:
  • num_layers (int) – VGG types, can be 11,13,16,19.
  • im_size (int) – VGG detection input size, can be 300, 512.
  • pretrained (bool) – Load pretrained weights if True.
  • ctx (mx.Context) – Context such as mx.cpu(), mx.gpu(0).
  • root (str) – Model weights storing path.
Returns:

The returned network.

Return type:

mxnet.gluon.HybridBlock

gluoncv.model_zoo.resnet101_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-101 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.norm_layer; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet152_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-152 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.norm_layer; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet18_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-18 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.norm_layer; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet34_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-34 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.norm_layer; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.resnet50_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNetV1b-50 model.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.norm_layer; for Synchronized Cross-GPU BatchNormalization).
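The effect of dilated on feature-map resolution reduces to simple stride arithmetic. A sketch (assumed overall strides: 32 for a standard ResNet, 8 with the dilation strategy; not GluonCV code):

```python
# Illustrative stride arithmetic only: a standard ResNetV1b downsamples by
# 32 overall; with dilated=True the dilation strategy keeps the last stages
# at full resolution, for an overall output stride of 8 at conv5.
def conv5_feature_size(input_size, dilated=False):
    stride = 8 if dilated else 32
    return input_size // stride

print(conv5_feature_size(480), conv5_feature_size(480, dilated=True))
```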
gluoncv.model_zoo.resnet50_v2a(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]

Constructs a ResNet50-v2a model.

Please ignore this if you are looking for a model for other tasks.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
  • ctx (Context, default mx.cpu(0)) – The context in which to load the pretrained weights.
  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
gluoncv.model_zoo.se_resnet101_v1(**kwargs)[source]

SE_ResNet-101 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet101_v2(**kwargs)[source]

SE_ResNet-101 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet152_v1(**kwargs)[source]

SE_ResNet-152 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet152_v2(**kwargs)[source]

SE_ResNet-152 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet18_v1(**kwargs)[source]

SE_ResNet-18 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet18_v2(**kwargs)[source]

SE_ResNet-18 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet34_v1(**kwargs)[source]

SE_ResNet-34 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet34_v2(**kwargs)[source]

SE_ResNet-34 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet50_v1(**kwargs)[source]

SE_ResNet-50 V1 model from “Deep Residual Learning for Image Recognition” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.se_resnet50_v2(**kwargs)[source]

SE_ResNet-50 V2 model from “Identity Mappings in Deep Residual Networks” paper.

Parameters:
  • pretrained (bool, default False) – Whether to load the pretrained weights for model.
  • ctx (Context, default CPU) – The context in which to load the pretrained weights.
  • root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
gluoncv.model_zoo.ssd_300_vgg16_atrous_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_300_vgg16_atrous_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_mobilenet1_0_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet1.0 base network for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_mobilenet1_0_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with mobilenet1.0 base network.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet101_v2_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v2 101 layers.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet152_v2_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v2 152 layers.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet18_v1_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 18 layers.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet50_v1_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 50 layers for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_resnet50_v1_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with ResNet v1 50 layers.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_vgg16_atrous_coco(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous layers for COCO.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.ssd_512_vgg16_atrous_voc(pretrained=False, pretrained_base=True, **kwargs)[source]

SSD architecture with VGG16 atrous 512x512 base network.

Parameters:
  • pretrained (bool, optional, default is False) – Load pretrained weights.
  • pretrained_base (bool, optional, default is True) – Load pretrained base network, the extra layers are randomized.
Returns:

An SSD detection network.

Return type:

HybridBlock

gluoncv.model_zoo.vgg16_atrous_300(**kwargs)[source]

Get a VGG-16 atrous feature extractor network for 300 input size.

gluoncv.model_zoo.vgg16_atrous_512(**kwargs)[source]

Get a VGG-16 atrous feature extractor network for 512 input size.