gluoncv.model_zoo¶
GluonCV Model Zoo
gluoncv.model_zoo.get_model¶
Returns a pre-defined GluonCV model by name.
Hint
This is the recommended method for getting a pre-defined model.
It also supports loading models directly from the Gluon Model Zoo.
Returns a pre-defined model by name |
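A minimal sketch of how get_model might be called. The wrapper name load_model is hypothetical, and the lazy import is only there so the snippet can be read and run without MXNet/GluonCV installed; the model name used in the note below is taken from the reset_class example later in this reference.

```python
def load_model(name, pretrained=False, **kwargs):
    """Look up a GluonCV model by name; return None if gluoncv is unavailable."""
    try:
        # Imported lazily so this sketch does not require gluoncv at import time.
        from gluoncv import model_zoo
    except ImportError:
        return None
    # Extra keyword arguments are passed through to get_model unchanged.
    return model_zoo.get_model(name, pretrained=pretrained, **kwargs)
```

For example, load_model('center_net_resnet50_v1b_voc', pretrained=True) would request the pre-trained CenterNet model; weights may be downloaded on first use.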
Image Classification¶
CIFAR¶
ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. |
|
ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper. |
|
ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper. |
|
ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper. |
|
ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper. |
|
ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper. |
|
ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper. |
|
ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. |
|
WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper. |
|
WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper. |
|
WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper. |
ImageNet¶
We apply a dilation strategy to the pre-trained ResNet models (yielding a stride of 8). Please see gluoncv.model_zoo.SegBaseModel
for how to use it.
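The effect of the dilation strategy on the output stride can be sketched with simple arithmetic. This assumes the usual ResNet layout (a stride-2 stem convolution, a stride-2 max-pool, then four stages) and assumes the dilated variant replaces the stride in the last two stages with dilation; the function names are hypothetical and the size computation is approximate (it ignores padding effects).

```python
def resnet_output_stride(dilated=False):
    """Cumulative downsampling factor at the last ResNet stage (conv5)."""
    # The stem convolution and the max-pool each halve the resolution.
    stride = 2 * 2
    # Standard stages stride by 1, 2, 2, 2. Assumption: the dilated variant
    # keeps stride 1 in the last two stages and dilates the convolutions instead.
    stage_strides = [1, 2, 1, 1] if dilated else [1, 2, 2, 2]
    for s in stage_strides:
        stride *= s
    return stride


def feature_map_size(input_size, dilated=False):
    """Approximate spatial size of the conv5 feature map for a square input."""
    return input_size // resnet_output_stride(dilated)
```

With a 480x480 input, the standard network yields roughly a 15x15 feature map, while the dilated one keeps 60x60, which is why the stride-8 variant is preferred for dense prediction.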
Pre-trained ResNetV1b model, which produces stride-8 feature maps at conv5. |
|
Constructs a ResNetV1b-18 model. |
|
Constructs a ResNetV1b-34 model. |
|
Constructs a ResNetV1b-50 model. |
|
Constructs a ResNetV1b-101 model. |
|
Constructs a ResNetV1b-152 model. |
ResNext¶
ResNext model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
ResNext model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
ResNext50 32x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
ResNext101 32x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
ResNext101 64x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
SE-ResNext50 32x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
SE-ResNext101 32x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
SE-ResNext101 64x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper. |
|
SE-ResNext101e 64x4d model modified from “Aggregated Residual Transformations for Deep Neural Network” paper. |
ResNeSt¶
ResNeSt model.
- Parameters
block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
layers (list of int) – Numbers of layers in each block.
classes (int, default 1000) – Number of classification classes.
dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.
norm_layer – Normalization layer used (default: |
|
Constructs a ResNeSt-14 model. |
|
Constructs a ResNeSt-26 model. |
|
Constructs a ResNeSt-50 model. |
|
Constructs a ResNeSt-101 model. |
|
Constructs a ResNeSt-200 model. |
|
Constructs a ResNeSt-269 model. |
MobileNet¶
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper. |
|
MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
multiplier (float, default 1.0) – The width multiplier for controlling the model size. The actual number of channels is equal to the original channel size multiplied by this multiplier.
classes (int, default 1000) – Number of classes for the output layer.
norm_layer – Normalization layer used (default: |
|
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper. |
|
MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper. |
|
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 1.0. |
|
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.75. |
|
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.5. |
|
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.25. |
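The width multiplier used by the MobileNet variants above scales every layer's channel count. A minimal sketch follows; the helper name, the example channel list, and the truncation to an integer are assumptions for illustration, not the library's code.

```python
def scale_channels(base_channels, multiplier=1.0):
    """Scale each channel count by the width multiplier, truncating to int."""
    return [int(c * multiplier) for c in base_channels]


# Hypothetical base widths, used only to show the effect of each multiplier.
BASE_CHANNELS = [32, 64, 128]

for m in (1.0, 0.75, 0.5, 0.25):
    print(m, scale_channels(BASE_CHANNELS, m))
```

Because both the input and output channels of a layer shrink, the parameter count of a convolution falls roughly with the square of the multiplier.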
SqueezeNet¶
SqueezeNet model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. |
|
SqueezeNet 1.0 model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. |
|
SqueezeNet 1.1 model from the official SqueezeNet repo. |
DenseNet¶
Densenet-BC model from the “Densely Connected Convolutional Networks” paper. |
|
Densenet-BC 121-layer model from the “Densely Connected Convolutional Networks” paper. |
|
Densenet-BC 161-layer model from the “Densely Connected Convolutional Networks” paper. |
|
Densenet-BC 169-layer model from the “Densely Connected Convolutional Networks” paper. |
|
Densenet-BC 201-layer model from the “Densely Connected Convolutional Networks” paper. |
Object Detection¶
SSD¶
Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325. |
|
Get SSD models. |
|
SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC. |
|
SSD architecture with VGG16 atrous 300x300 base network for COCO. |
|
SSD architecture with VGG16 atrous 300x300 base network for COCO. |
|
SSD architecture with VGG16 atrous 512x512 base network. |
|
SSD architecture with VGG16 atrous layers for COCO. |
|
SSD architecture with VGG16 atrous 300x300 base network for COCO. |
|
SSD architecture with ResNet v1 50 layers. |
|
SSD architecture with ResNet v1 50 layers for COCO. |
|
SSD architecture with ResNet50 v1 512 base network for custom dataset. |
|
SSD architecture with ResNet v2 101 layers. |
|
SSD architecture with ResNet v2 152 layers. |
|
VGG Atrous multi layer feature extractor which produces multiple output feature maps. |
|
Get VGG atrous feature extractor networks. |
|
Get VGG atrous 16 layer 300 in_size feature extractor networks. |
|
Get VGG atrous 16 layer 512 in_size feature extractor networks. |
Faster RCNN¶
Faster RCNN network. |
|
Utility function to return faster rcnn networks. |
|
Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. |
|
Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. |
|
Faster RCNN model with resnet50_v1b base network on custom dataset. |
YOLOv3¶
YOLO V3 detection network. Reference: https://arxiv.org/pdf/1804.02767.pdf.
- Parameters
stages (mxnet.gluon.HybridBlock) – Staged feature extraction blocks. For example, 3 stages and 3 YOLO output layers are used in the original paper.
channels (iterable) – Number of conv channels for each appended stage. len(channels) should match len(stages).
num_class (int) – Number of foreground objects.
anchors (iterable) – The anchor setting. len(anchors) should match len(stages).
strides (iterable) – Strides of feature map. len(strides) should match len(stages).
alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, arbitrary input image sizes are supported by cropping the corresponding area of the anchor map. This allows the model to be exported to symbol so it can run in C++, Scala, etc.
nms_thresh (float, default is 0.45) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
pos_iou_thresh (float, default is 1.0) – IOU threshold for true anchors that match real objects. ‘pos_iou_thresh < 1’ is not implemented.
ignore_iou_thresh (float) – Anchors that have IOU in range(ignore_iou_thresh, pos_iou_thresh) are not penalized on objectness score.
norm_layer – Normalization layer used (default: |
|
Get YOLOV3 models.
- Parameters
name (str or None) – Model name, if None is used, you must specify features to be a HybridBlock.
stages (iterable of str or HybridBlock) – List of network internal output names, in order to specify which layers are used for predicting bbox values. If name is None, features must be a HybridBlock which generates multiple outputs for prediction.
filters (iterable of float or None) – List of convolution layer channels which are going to be appended to the base network feature extractor. If name is None, this is ignored.
sizes – Sizes of anchor boxes, this should be a list of floats, in incremental order. The length of sizes must be len(layers) + 1. For example, a two stage SSD model can have |
|
YOLO3 multi-scale with darknet53 base network on VOC dataset.
- Parameters
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer – Normalization layer used (default: |
|
YOLO3 multi-scale with darknet53 base network on COCO dataset.
- Parameters
pretrained_base (boolean) – Whether to fetch and load pretrained weights for base network.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer – Normalization layer used (default: |
|
YOLO3 multi-scale with darknet53 base network on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer (str or None) – If not None, will try to reuse pre-trained weights from YOLO networks trained on other datasets.
pretrained_base (boolean) – Whether to fetch and load pretrained weights for base network.
norm_layer – Normalization layer used (default: |
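The interaction of nms_thresh, nms_topk and post_nms described for the YOLOv3 network can be illustrated with a small pure-Python sketch. This is a class-agnostic greedy NMS written for illustration only; the function names are hypothetical, and treating thresholds outside the open interval (0, 1) as "NMS disabled" is an assumption chosen to agree with the parameter descriptions in this reference.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, nms_thresh=0.45, nms_topk=400, post_nms=100):
    """dets is a list of (score, box); returns kept detections, best first."""
    dets = sorted(dets, key=lambda d: d[0], reverse=True)
    if 0 < nms_thresh < 1:
        # Only the nms_topk highest-scoring detections enter NMS (-1 means all).
        candidates = dets if nms_topk < 0 else dets[:nms_topk]
        kept = []
        for score, box in candidates:
            # Keep a box only if it does not overlap a better box too much.
            if all(iou(box, k[1]) <= nms_thresh for k in kept):
                kept.append((score, box))
        dets = kept
    # Finally cap the number of returned detections (-1 means no cap).
    return dets if post_nms < 0 else dets[:post_nms]
```

In this sketch, two heavily overlapping boxes collapse to the higher-scoring one, while passing nms_thresh=-1 returns every detection untouched.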
Instance Segmentation¶
Mask RCNN¶
Mask RCNN network. |
|
Utility function to return mask rcnn networks. |
|
Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. |
Semantic Segmentation¶
FCN¶
Fully Convolutional Networks for Semantic Segmentation |
|
FCN model from the paper “Fully Convolutional Network for semantic segmentation” |
|
FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation” |
|
FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation” |
|
FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation” |
|
FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation” |
|
FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation” |
PSPNet¶
Pyramid Scene Parsing Network |
|
Pyramid Scene Parsing Network.
- Parameters
dataset (str, default pascal_voc) – The dataset that the model was pretrained on (pascal_voc, ade20k).
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters.
pretrained_base (bool or str, default True) – Loads the pretrained backbone network that was trained on ImageNet. |
|
Pyramid Scene Parsing Network.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
Pyramid Scene Parsing Network.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
Pyramid Scene Parsing Network.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
Pyramid Scene Parsing Network.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
DeepLabV3¶
|
|
DeepLabV3.
- Parameters
dataset (str, default pascal_voc) – The dataset that the model was pretrained on (pascal_voc, pascal_aug, ade20k, coco, citys).
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
DeepLabV3.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
DeepLabV3.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
DeepLabV3.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
|
DeepLabV3.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default ‘~/.mxnet/models’) – Location for keeping the model parameters. |
Action Recognition¶
TSN¶
VGG16 model trained on UCF101 dataset. |
|
VGG16 model trained on HMDB51 dataset. |
|
VGG16 model trained on Kinetics400 dataset. |
|
VGG16 model trained on Something-Something-V2 dataset. |
|
InceptionV1 model trained on UCF101 dataset. |
|
InceptionV1 model trained on HMDB51 dataset. |
|
InceptionV1 model trained on Kinetics400 dataset. |
|
InceptionV1 model trained on Something-Something-V2 dataset. |
|
InceptionV3 model trained on UCF101 dataset. |
|
InceptionV3 model trained on HMDB51 dataset. |
|
InceptionV3 model trained on Kinetics400 dataset. |
|
InceptionV3 model trained on Something-Something-V2 dataset. |
|
ResNet18 model trained on Something-Something-V2 dataset. |
|
ResNet34 model trained on Something-Something-V2 dataset. |
|
ResNet50 model trained on Something-Something-V2 dataset. |
|
ResNet101 model trained on Something-Something-V2 dataset. |
|
ResNet152 model trained on Something-Something-V2 dataset. |
|
ResNet18 model trained on Kinetics400 dataset. |
|
ResNet34 model trained on Kinetics400 dataset. |
|
ResNet50 model trained on Kinetics400 dataset. |
|
ResNet101 model trained on Kinetics400 dataset. |
|
ResNet152 model trained on Kinetics400 dataset. |
|
ResNet50 model trained on UCF101 dataset. |
|
ResNet50 model trained on HMDB51 dataset. |
|
ResNet50 model customized for any dataset. |
C3D¶
The Convolutional 3D network (C3D). |
|
The Convolutional 3D network (C3D) trained on Kinetics400 dataset. |
I3D¶
Inception v1 model from “Going Deeper with Convolutions” paper. |
|
Inception v1 model trained on Kinetics400 dataset from “Going Deeper with Convolutions” paper. |
|
Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper. |
|
Inception v3 model trained on Kinetics400 dataset from “Rethinking the Inception Architecture for Computer Vision” paper. |
|
ResNet_I3D backbone. |
|
Inflated 3D model (I3D) with ResNet50 backbone trained on Kinetics400 dataset. |
|
Inflated 3D model (I3D) with ResNet101 backbone trained on Kinetics400 dataset. |
|
Inflated 3D model (I3D) with ResNet50 backbone and 5 non-local blocks trained on Kinetics400 dataset. |
|
Inflated 3D model (I3D) with ResNet50 backbone and 10 non-local blocks trained on Kinetics400 dataset. |
|
Inflated 3D model (I3D) with ResNet101 backbone and 5 non-local blocks trained on Kinetics400 dataset. |
|
Inflated 3D model (I3D) with ResNet101 backbone and 10 non-local blocks trained on Kinetics400 dataset. |
|
Inflated 3D model (I3D) with ResNet50 backbone trained on Something-Something-V2 dataset. |
|
Inflated 3D model (I3D) with ResNet50 backbone trained on HMDB51 dataset. |
|
Inflated 3D model (I3D) with ResNet50 backbone trained on UCF101 dataset. |
|
Inflated 3D model (I3D) with ResNet50 backbone. |
P3D¶
The Pseudo 3D network (P3D). |
|
The Pseudo 3D network (P3D) with ResNet50 backbone trained on Kinetics400 dataset. |
|
The Pseudo 3D network (P3D) with ResNet101 backbone trained on Kinetics400 dataset. |
R2+1D¶
The R2+1D network. |
|
R2Plus1D with ResNet18 backbone trained on Kinetics400 dataset. |
|
R2Plus1D with ResNet34 backbone trained on Kinetics400 dataset. |
|
R2Plus1D with ResNet50 backbone trained on Kinetics400 dataset. |
|
R2Plus1D with ResNet101 backbone trained on Kinetics400 dataset. |
|
R2Plus1D with ResNet152 backbone trained on Kinetics400 dataset. |
SlowFast¶
SlowFast networks (SlowFast) from “SlowFast Networks for Video Recognition” paper. |
|
SlowFast 4x16 networks (SlowFast) with ResNet50 backbone trained on Kinetics400 dataset. |
|
SlowFast 8x8 networks (SlowFast) with ResNet50 backbone trained on Kinetics400 dataset. |
|
SlowFast 4x16 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset. |
|
SlowFast 8x8 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset. |
|
SlowFast 16x8 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset. |
|
SlowFast 16x8 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset, but the temporal head is initialized with ResNet50 structure (3, 4, 6, 3). |
|
SlowFast 4x16 networks (SlowFast) with ResNet50 backbone. |
API Reference¶
Network definitions of GluonCV models
GluonCV Model Zoo
-
class
gluoncv.model_zoo.
ABC
[source]¶ Helper class that provides a standard way to create an ABC using inheritance.
-
class
gluoncv.model_zoo.
AlexNet
(classes=1000, **kwargs)[source]¶ AlexNet model from the “One weird trick…” paper.
- Parameters
classes (int, default 1000) – Number of classes for the output layer.
-
class
gluoncv.model_zoo.
BaseAnchorBasedTracktor
[source]¶ -
-
abstract
detect_and_track
(frame, tracking_anchor_indices, tracking_anchor_weights, tracking_classes)[source]¶ Perform detection and tracking on the new frame
- Parameters
frame (HxWx3 RGB image) –
tracking_anchor_indices (NxM ndarray) –
tracking_anchor_weights (NxM ndarray) –
tracking_classes (Nx1 ndarray of the class ids of the tracked object) –
- Returns
detection_bounding_boxes: all detection results, in (x0, y0, x1, y1, confidence, cls) format
detection_source: source anchor box indices for each detection
tracking_boxes: all tracking results, in (x0, y0, x1, y1, confidence) format
extract_info: extra information from the tracktor, e.g. landmarks, a dict
-
abstract
prepare_for_frame
(frame)[source]¶ This method should run anything that needs to happen before the motion prediction. It can prepare the detector or even run the backbone feature extraction. It can also provide data to motion prediction.
- Parameters
frame (the frame data, the same as in the detect_and_track method) –
- Returns
motion_predict_data
- Return type
optional data provided to motion prediction, if no data is provided, return None
-
abstract
-
class
gluoncv.model_zoo.
BasicBlockV1
(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 18, 34 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
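The norm_layer / norm_kwargs pair follows a common pattern: the layer is passed as a class, and the keyword arguments are applied when the block instantiates it. The sketch below uses a hypothetical stand-in class rather than a real MXNet layer, so it only illustrates the calling convention and makes no claim about the library's internals.

```python
class StandInNorm:
    """Hypothetical stand-in for a normalization layer class."""

    def __init__(self, num_devices=1):
        self.num_devices = num_devices


def build_norm(norm_layer=StandInNorm, norm_kwargs=None):
    """Instantiate the normalization layer from its class and optional kwargs."""
    # norm_layer is a class, not an instance; None means "no extra arguments".
    return norm_layer(**({} if norm_kwargs is None else norm_kwargs))


default_norm = build_norm()
synced_norm = build_norm(norm_kwargs={'num_devices': 4})
```

Passing the class rather than an instance lets each block create its own independent normalization layers.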
-
class
gluoncv.model_zoo.
BasicBlockV1b
(planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs=None, **kwargs)[source]¶ ResNetV1b BasicBlockV1b
-
class
gluoncv.model_zoo.
BasicBlockV2
(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 18, 34 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
Block
(channels, cardinality, bottleneck_width, stride, downsample=False, last_gamma=False, use_se=False, avg_down=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Bottleneck Block from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
avg_down (bool, default False) – Whether to use average pooling for projection skip connection between stages/downsample.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
Bottleneck
(channels, cardinality=1, bottleneck_width=64, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs=None, last_gamma=False, dropblock_prob=0, input_size=None, use_splat=False, radix=2, avd=False, avd_first=False, in_channels=None, split_drop_ratio=0, **kwargs)[source]¶ ResNeSt Bottleneck
-
class
gluoncv.model_zoo.
BottleneckV1
(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 50, 101, 152 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
BottleneckV1b
(planes, strides=1, dilation=1, downsample=None, previous_dilation=1, norm_layer=None, norm_kwargs=None, last_gamma=False, **kwargs)[source]¶ ResNetV1b BottleneckV1b
-
class
gluoncv.model_zoo.
BottleneckV2
(channels, stride, downsample=False, in_channels=0, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 50, 101, 152 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
C3D
(nclass, dropout_ratio=0.5, num_segments=1, num_crop=1, feat_ext=False, init_std=0.001, ctx=None, **kwargs)[source]¶ The Convolutional 3D network (C3D). Learning Spatiotemporal Features with 3D Convolutional Networks. ICCV, 2015. https://arxiv.org/abs/1412.0767
- Parameters
nclass (int) – Number of classes in the training dataset.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
dropout_ratio (float) – Dropout value used in the dropout layers after dense layers to avoid overfitting.
init_std (float) – Default standard deviation value for initializing dense layers.
ctx (str) – Context, default CPU. The context in which to load the pretrained weights.
-
class
gluoncv.model_zoo.
COCODetection
(root='~/.mxnet/datasets/coco', splits=('instances_val2017'), transform=None, min_object_area=0, skip_empty=True, use_crowd=True)[source]¶ MS COCO detection dataset.
- Parameters
root (str, default '~/.mxnet/datasets/coco') – Path to folder storing the dataset.
splits (list of str, default ['instances_val2017']) – Json annotations name. Candidates can be: instances_val2017, instances_train2017.
transform (callable, default None) –
A function that takes data and label and transforms them. Refer to ./transforms for examples.
A transform function for object detection should take label into consideration, because any geometric modification will require label to be modified.
min_object_area (float) – Minimum accepted ground-truth area. An object whose area is smaller than this value is ignored.
skip_empty (bool, default is True) – Whether to skip images with no valid object. This should be True in training, otherwise it will cause undefined behavior.
use_crowd (bool, default is True) – Whether to use boxes labeled as crowd instances.
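The min_object_area and skip_empty parameters describe a two-step filter: small boxes are dropped first, and images left without any box can then be skipped. A rough sketch of that logic follows; the helper names are hypothetical and boxes are assumed to be (x0, y0, x1, y1) tuples.

```python
def filter_boxes(boxes, min_object_area=0):
    """Keep boxes whose area is at least min_object_area."""
    return [b for b in boxes
            if (b[2] - b[0]) * (b[3] - b[1]) >= min_object_area]


def keep_image(boxes, skip_empty=True):
    """An image is kept if it still has boxes, or if empty images are allowed."""
    return bool(boxes) or not skip_empty


boxes = filter_boxes([(0, 0, 2, 2), (0, 0, 10, 10)], min_object_area=25)
```

Here the 2x2 box (area 4) is dropped and the 10x10 box is kept, so the image would still be used for training.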
-
property
annotation_dir
¶ The subdir for annotations. Default is ‘annotations’ (the COCO default). For example, a COCO-format JSON file will be searched for as ‘root/annotation_dir/xxx.json’. You can override this if a custom dataset doesn’t follow the same pattern.
-
property
classes
¶ Category names.
-
property
coco
¶ Return pycocotools object for evaluation purposes.
-
class
gluoncv.model_zoo.
CenterNet
(base_network, heads, classes, head_conv_channel=0, scale=4.0, topk=100, flip_test=False, nms_thresh=0, nms_topk=400, post_nms=100, **kwargs)[source]¶ Objects as Points. https://arxiv.org/abs/1904.07850v2
- Parameters
base_network (mxnet.gluon.nn.HybridBlock) – The base feature extraction network.
heads (OrderedDict) –
OrderedDict with specifications for each head. For example:
OrderedDict([('heatmap', {'num_output': len(classes), 'bias': -2.19}), ('wh', {'num_output': 2}), ('reg', {'num_output': 2})])
classes (list of str) – Category names.
head_conv_channel (int, default is 0) – If > 0, will use an extra conv layer before each of the real heads.
scale (float, default is 4.0) – The downsampling ratio of the entire network.
topk (int, default is 100) – Number of outputs.
flip_test (bool) – Whether to apply flip test in inference (training mode not affected).
nms_thresh (float, default is 0.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS. By default NMS is disabled.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
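As a minimal sketch, the `heads` specification from the parameter list can be built with the standard library alone. The class list below is hypothetical. The last lines compute the sigmoid of the heatmap bias; reading that value as an initial score assumes the heatmap head is followed by a sigmoid, which this page does not state.

```python
from collections import OrderedDict
import math

# Hypothetical category names, used only for illustration.
classes = ["person", "bicycle", "car"]

# Same structure as the example in the parameter description.
heads = OrderedDict([
    ("heatmap", {"num_output": len(classes), "bias": -2.19}),
    ("wh", {"num_output": 2}),
    ("reg", {"num_output": 2}),
])

# Total number of output channels across all heads.
total_outputs = sum(spec["num_output"] for spec in heads.values())

# Sigmoid of the heatmap bias: roughly 0.1.
initial_score = 1.0 / (1.0 + math.exp(-heads["heatmap"]["bias"]))
```

An OrderedDict keeps the heads in a fixed order (heatmap, wh, reg), so any code that iterates over them sees a stable sequence.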
-
property
num_classes
¶ Return number of foreground classes.
- Returns
Number of foreground classes
- Return type
-
reset_class
(classes, reuse_weights=None)[source]¶ Reset class categories and class predictors.
- Parameters
classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
reuse_weights (dict) – A {new_integer : old_integer} mapping dict or a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the specified previously trained weights.
Example
>>> net = gluoncv.model_zoo.get_model('center_net_resnet50_v1b_voc', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is category index 14 in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
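The accepted forms of `reuse_weights` all describe the same thing: which old class index each new class index should copy its weights from. The sketch below normalizes every form to a `{new_index: old_index}` dict. It is an illustration only, not GluonCV's code; `resolve_reuse_weights` is a hypothetical helper, and the class list is the usual Pascal VOC ordering.

```python
def resolve_reuse_weights(new_classes, old_classes, reuse_weights):
    """Normalize any documented form to a {new_index: old_index} dict."""
    if isinstance(reuse_weights, (list, tuple)):
        # A list of names means each listed class keeps its name.
        reuse_weights = {name: name for name in reuse_weights}
    resolved = {}
    for new_key, old_key in reuse_weights.items():
        # Names are looked up in the class lists; integers are used as-is.
        new_idx = new_classes.index(new_key) if isinstance(new_key, str) else new_key
        old_idx = old_classes.index(old_key) if isinstance(old_key, str) else old_key
        resolved[new_idx] = old_idx
    return resolved


old_classes = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]
new_classes = ["person"]

by_name = resolve_reuse_weights(new_classes, old_classes, {"person": "person"})
by_index = resolve_reuse_weights(new_classes, old_classes, {0: 14})
mixed = resolve_reuse_weights(new_classes, old_classes, {"person": 14})
by_list = resolve_reuse_weights(new_classes, old_classes, ["person"])
```

All four calls resolve to the same mapping, new index 0 to old index 14.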
-
set_nms
(nms_thresh=0, nms_topk=400, post_nms=100)[source]¶ Set non-maximum suppression parameters.
- Parameters
nms_thresh (float, default is 0.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS. By default NMS is disabled.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
- Returns
- Return type
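How the three NMS parameters interact can be sketched with a plain greedy NMS in pure Python. This is a generic illustration, not the operator GluonCV calls. Two points are assumptions: any threshold outside the open interval (0, 1) is treated as "disabled", and detections beyond `nms_topk` are dropped when NMS is enabled.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, nms_thresh=0, nms_topk=400, post_nms=100):
    """dets is a list of (score, box) pairs; returns the surviving pairs."""
    dets = sorted(dets, key=lambda d: d[0], reverse=True)
    if 0 < nms_thresh < 1:
        # Only the nms_topk highest-scoring detections enter NMS (-1 = all).
        candidates = dets if nms_topk < 0 else dets[:nms_topk]
        kept = []
        for det in candidates:
            # Keep a box only if it does not overlap a kept box too much.
            if all(iou(det[1], k[1]) <= nms_thresh for k in kept):
                kept.append(det)
        dets = kept
    # Finally truncate to post_nms results (-1 = return everything).
    return dets if post_nms < 0 else dets[:post_nms]


dets = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # overlaps the first box (IoU about 0.68)
    (0.7, (20, 20, 30, 30)),
]
without_nms = filter_detections(dets)
with_nms = filter_detections(dets, nms_thresh=0.5)
top_one = filter_detections(dets, post_nms=1)
```

With the default threshold, all three detections pass through; with `nms_thresh=0.5`, the second box is suppressed by the first.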
-
class
gluoncv.model_zoo.
DUC
(planes, upscale_factor=2, **kwargs)[source]¶ Upsampling layer with pixel shuffle
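Pixel shuffle trades channels for spatial resolution: a (C*r*r, H, W) tensor becomes (C, H*r, W*r). The sketch below shows the rearrangement on nested lists. It uses the common sub-pixel channel ordering, which is an assumption here; the exact layout inside DUC is not documented on this page, and `pixel_shuffle` is a hypothetical helper.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r)."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    out_channels = channels // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):
                    for j in range(r):
                        # Each group of r*r channels fills one r x r patch.
                        out[c][h * r + i][w * r + j] = x[c * r * r + i * r + j][h][w]
    return out


# Four 1x1 channels become one 2x2 channel when upscale_factor is 2.
x = [[[k]] for k in range(4)]
shuffled = pixel_shuffle(x, 2)
```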
-
class
gluoncv.model_zoo.
DarknetV3
(layers, channels, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Darknet v3.
- Parameters
layers (iterable) – Description of parameter layers.
channels (iterable) – Description of parameter channels.
classes (int, default is 1000) – Number of classes, which determines the dense layer output channels.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
features
¶ Feature extraction layers.
- Type
mxnet.gluon.nn.HybridSequential
-
output
¶ A classes(1000)-way Fully-Connected Layer.
- Type
mxnet.gluon.nn.Dense
-
class
gluoncv.model_zoo.
DeepLabV3
(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, height=None, width=None, base_size=520, crop_size=480, **kwargs)[source]¶ - Parameters
nclass (int) – Number of categories for the training dataset.
backbone (string) – Pre-trained dilated backbone network type (default: ‘resnet50’; options are ‘resnet50’, ‘resnet101’ or ‘resnet152’).
norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU Batch Normalization).
aux (bool) – Auxiliary loss.
Reference:
Chen, Liang-Chieh, et al. “Rethinking atrous convolution for semantic image segmentation.” arXiv preprint arXiv:1706.05587 (2017).
-
class
gluoncv.model_zoo.
DeepLabV3Plus
(nclass, backbone='xception', aux=True, ctx=cpu(0), pretrained_base=True, height=None, width=None, base_size=576, crop_size=512, dilated=True, **kwargs)[source]¶ - Parameters
nclass (int) – Number of categories for the training dataset.
backbone (string) – Pre-trained dilated backbone network type (default: ‘xception’).
norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU Batch Normalization).
aux (bool) – Auxiliary loss.
Reference:
Chen, Liang-Chieh, et al. “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.”
-
class
gluoncv.model_zoo.
DeepLabWV3Plus
(nclass, backbone='wideresnet', aux=False, ctx=cpu(0), pretrained_base=True, height=None, width=None, base_size=520, crop_size=480, dilated=True, **kwargs)[source]¶ - Parameters
nclass (int) – Number of categories for the training dataset.
backbone (string) – Pre-trained dilated backbone network type (default: ‘wideresnet’).
norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU Batch Normalization).
aux (bool) – Auxiliary loss.
Reference – Chen, Liang-Chieh, et al. “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.”, https://arxiv.org/abs/1802.02611, ECCV 2018
-
class
gluoncv.model_zoo.
DenseNet
(num_init_features, growth_rate, block_config, bn_size=4, dropout=0, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Densenet-BC model from the “Densely Connected Convolutional Networks” paper.
- Parameters
num_init_features (int) – Number of filters to learn in the first convolution layer.
growth_rate (int) – Number of filters to add each layer (k in the paper).
block_config (list of int) – List of integers for numbers of layers in each pooling block.
bn_size (int, default 4) – Multiplicative factor for the number of bottleneck layers (i.e. bn_size * k features in the bottleneck layer).
dropout (float, default 0) – Rate of dropout after each dense layer.
classes (int, default 1000) – Number of classification classes.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
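To see how `num_init_features`, `growth_rate` and `block_config` determine the network width, the sketch below tracks the channel count through the dense blocks. It assumes that each transition between blocks halves the channel count (the usual DenseNet-BC compression of 0.5); that detail is not stated in the parameter list, and `densenet_feature_counts` is a hypothetical helper.

```python
def densenet_feature_counts(num_init_features, growth_rate, block_config):
    """Return the channel count at the end of each dense block."""
    counts = []
    num_features = num_init_features
    for i, num_layers in enumerate(block_config):
        # Every layer in a dense block adds growth_rate feature maps.
        num_features += num_layers * growth_rate
        counts.append(num_features)
        if i != len(block_config) - 1:
            # Assumed: the transition layer halves the number of channels.
            num_features //= 2
    return counts


# The DenseNet-121 configuration: 64 initial features, k = 32.
counts = densenet_feature_counts(64, 32, [6, 12, 24, 16])

# Width of the bottleneck layer for bn_size=4 and k=32.
bottleneck_width = 4 * 32
```

For this configuration, the final block ends with 1024 channels, which is the input width of the classifier.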
-
class
gluoncv.model_zoo.
DepthDecoder
(num_ch_enc, scales=range(0, 4), num_output_channels=1, use_skips=True)[source]¶ Decoder of Monodepth2
- Parameters
-
class
gluoncv.model_zoo.
DepthwiseRPN
(bz=1, is_train=False, ctx=cpu(0), anchor_num=5, out_channels=256)[source]¶ Computes cls and loc from z_f and x_f.
- Parameters
-
class
gluoncv.model_zoo.
DoubleHeadRCNN
(features, top_features, classes, box_features=None, short=600, max_size=1000, min_stage=4, max_stage=4, train_patterns=None, nms_thresh=0.3, nms_topk=400, post_nms=100, roi_mode='align', roi_size=(14, 14), strides=16, clip=None, rpn_channel=1024, base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2), alloc_size=(128, 128), rpn_nms_thresh=0.7, rpn_train_pre_nms=12000, rpn_train_post_nms=2000, rpn_test_pre_nms=6000, rpn_test_post_nms=300, rpn_min_size=16, per_device_batch_size=1, num_sample=128, pos_iou_thresh=0.5, pos_ratio=0.25, max_num_gt=300, additional_output=False, force_nms=False, minimal_opset=False, **kwargs)[source]¶ Double Head RCNN network.
- Parameters
features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
classes (iterable of str) – Names of categories, its length is num_class.
box_features (gluon.HybridBlock, default is None) – Feature head for transforming shared ROI output (top_features) for box prediction. If set to None, global average pooling will be used.
short (int, default is 600.) – Input image short side size.
max_size (int, default is 1000.) – Maximum size of input image long side.
min_stage (int, default is 4) – Minimum stage NO. for FPN stages.
max_stage (int, default is 4) – Maximum stage NO. for FPN stages.
train_patterns (str, default is None.) – Matching pattern for trainable parameters.
nms_thresh (float, default is 0.3.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
roi_mode (str, default is align) – ROI pooling mode. Currently support ‘pool’ and ‘align’.
roi_size (tuple of int, length 2, default is (14, 14)) – (height, width) of the ROI region.
strides (int/tuple of ints, default is 16) – Feature map stride with respect to original image. This is usually the ratio between original image size and feature map size. For FPN, use a tuple of ints.
clip (float, default is None) – Clip bounding box prediction to prevent exponentiation from overflowing.
rpn_channel (int, default is 1024) – Number of channels used in RPN convolutional layers.
base_size (int) – The width (and height) of the reference anchor box.
scales (iterable of float, default is (8, 16, 32)) –
The areas of anchor boxes. We use the following form to compute the shapes of anchors:
\[width_{anchor} = size_{base} \times scale \times \sqrt{1 / ratio}\]
\[height_{anchor} = size_{base} \times scale \times \sqrt{ratio}\]
ratios (iterable of float, default is (0.5, 1, 2)) – The aspect ratios of anchor boxes. We expect it to be a list or tuple.
alloc_size (tuple of int) – Allocate size for the anchor boxes as (H, W). Usually we generate enough anchors for large feature map, e.g. 128x128. Later in inference we can have variable input sizes, at which time we can crop corresponding anchors from this large anchor map so we can skip re-generating anchors for each input.
rpn_train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training of RPN.
rpn_train_post_nms (int, default is 2000) – Return top proposal results after NMS in training of RPN. Will be set to rpn_train_pre_nms if it is larger than rpn_train_pre_nms.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 300) – Return top proposal results after NMS in testing of RPN. Will be set to rpn_test_pre_nms if it is larger than rpn_test_pre_nms.
rpn_nms_thresh (float, default is 0.7) – IOU threshold for NMS. It is used to remove overlapping proposals.
rpn_num_sample (int, default is 256) – Number of samples for RPN targets.
rpn_pos_iou_thresh (float, default is 0.7) – Anchors with IOU larger than pos_iou_thresh are regarded as positive samples.
rpn_neg_iou_thresh (float, default is 0.3) – Anchors with IOU smaller than neg_iou_thresh are regarded as negative samples. Anchors with IOU in between pos_iou_thresh and neg_iou_thresh are ignored.
rpn_pos_ratio (float, default is 0.5) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.
rpn_box_norm (array-like of size 4, default is (1., 1., 1., 1.)) – Std value to be divided from encoded values.
rpn_min_size (int, default is 16) – Proposals whose size is smaller than min_size will be discarded.
per_device_batch_size (int, default is 1) – Batch size for each device during training.
num_sample (int, default is 128) – Number of samples for RCNN targets.
pos_iou_thresh (float, default is 0.5) – Proposals whose IOU is larger than pos_iou_thresh are regarded as positive samples.
pos_ratio (float, default is 0.25) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.
max_num_gt (int, default is 300) – Maximum ground-truth number for each example. This is only an upper bound, not necessarily very precise. However, using a very big number may impact the training speed.
additional_output (boolean, default is False) – additional_output is only used for Mask R-CNN to get internal outputs.
force_nms (bool, default is False) – Apply NMS to all categories, this is to avoid overlapping detection results from different categories.
minimal_opset (bool, default is False) – We sometimes add special operators to accelerate training/inference, however, for exporting to third party compilers we want to utilize most widely used operators. If minimal_opset is True, the network will use a minimal set of operators good for e.g., TVM.
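The anchor shape formula given for `scales` and `ratios` can be checked numerically. The sketch below evaluates it for the default `base_size`, `scales` and `ratios`. The helper name `anchor_shapes` is hypothetical, and the ordering of the returned list is an arbitrary choice, not necessarily the order in which the library generates anchors.

```python
import math


def anchor_shapes(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2)):
    """Return (ratio, scale, width, height) for every ratio/scale pair."""
    shapes = []
    for ratio in ratios:
        for scale in scales:
            width = base_size * scale * math.sqrt(1.0 / ratio)
            height = base_size * scale * math.sqrt(ratio)
            shapes.append((ratio, scale, width, height))
    return shapes


shapes = anchor_shapes()
# For ratio=1 and scale=8 the anchor is a 128 x 128 square.
square = [s for s in shapes if s[0] == 1 and s[1] == 8][0]
```

Two properties follow from the formula: height / width equals `ratio`, and width * height equals (base_size * scale) squared, so changing the ratio reshapes an anchor without changing its area.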
-
classes
¶ Names of categories, its length is num_class.
- Type
iterable of str
-
nms_thresh
¶ Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
- Type
-
nms_topk
¶ Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
- Type
-
force_nms
¶ Apply NMS to all categories, this is to avoid overlapping detection results from different categories.
- Type
-
rpn_target_generator
¶ Generate training targets with cls_target, box_target, and box_mask.
- Type
gluon.Block
-
target_generator
¶ Generate training targets with boxes, samples, matches, gt_label and gt_box.
- Type
gluon.Block
-
hybrid_forward
(F, x, gt_box=None, gt_label=None)[source]¶ Forward Double Head RCNN network.
The behavior during training and inference is different.
- Parameters
- Returns
During inference, returns final class id, confidence scores, bounding boxes.
- Return type
(ids, scores, bboxes)
-
reset_class
(classes, reuse_weights=None)[source]¶ Reset class categories and class predictors.
- Parameters
classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
reuse_weights (dict) – A {new_integer : old_integer} mapping dict or a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the specified previously trained weights.
Example
>>> net = gluoncv.model_zoo.get_model('faster_rcnn_resnet50_v1b_coco', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the 14th category in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
-
property
target_generator
¶ Returns stored target generator
- Returns
The RCNN target generator
- Return type
mxnet.gluon.HybridBlock
-
class
gluoncv.model_zoo.
DummyMotionEstimator
[source]¶ -
initialize
(first_frame, first_frame_motion_pred_data)[source]¶ Initialize the motion estimator by feeding the first frame
- Parameters
first_frame (data of the first frame) –
first_frame_motion_pred_data (additional data for motion prediction) –
- Returns
cache_information
-
predict_new_locations
(prev_frame_cache: numpy.ndarray, prev_bboxes: numpy.ndarray, new_frame: numpy.ndarray, skip: bool = False, **kwargs)[source]¶ The abstract method for predicting movement of bounding boxes given the two frames.
- Parameters
prev_frame_cache (cached image from motion estimation, numpy.ndarray) –
prev_bboxes (Nx4 numpy.ndarray, bounding boxes in (left, top, right, bottom) format) –
new_frame (BGR image, numpy.ndarray) –
new_frame_motion_pred_data (additional data for motion prediction) –
tracked_boxes_anchor_indices (anchor indices used to build the prev_bboxes) –
tracked_boxes_anchor_weights (voting weights of anchors used to build prev_bboxes) –
skip (whether to just skip motion estimation for this frame) –
kwargs (other information) –
- Returns
new_boxes (Nx4 numpy.ndarray) –
cache_information –
-
-
class
gluoncv.model_zoo.
FCN
(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]¶ Fully Convolutional Networks for Semantic Segmentation
- Parameters
nclass (int) – Number of categories for the training dataset.
backbone (string) – Pre-trained dilated backbone network type (default: ‘resnet50’; options are ‘resnet50’, ‘resnet101’ or ‘resnet152’).
norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm).
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
pretrained_base (bool or str) – Whether the FCN backbone or the encoder is pretrained. If True, the weights of a model trained on ImageNet are loaded.
Reference:
Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” CVPR, 2015
Examples
>>> model = FCN(nclass=21, backbone='resnet50')
>>> print(model)
-
class
gluoncv.model_zoo.
FarneBeckFlowMotionEstimator
(flow_scale=256)[source]¶ Use the Farneback algorithm for the flow-based motion estimator
-
compute_flow
(prev_frame_cache, prepared_new_frame)[source]¶ Compute dense optical flow.
- Parameters
prev_frame_cache –
prepared_new_frame –
- Returns
flow_map – An NxMx2 map. Each spatial location contains a 2-element vector specifying the delta in x and y directions. The unit of delta is a pixel in this flow_map’s coordinate space.
-
-
class
gluoncv.model_zoo.
FastSCNN
(nclass, aux=True, ctx=cpu(0), pretrained_base=False, height=None, width=None, base_size=2048, crop_size=1024, **kwargs)[source]¶ Fast-SCNN: Fast Semantic Segmentation Network
- Parameters
Reference:
Rudra P K Poudel, et al. https://bmvc2019.org/wp-content/uploads/papers/0959-paper.pdf “Fast-SCNN: Fast Semantic Segmentation Network.” BMVC, 2019
-
class
gluoncv.model_zoo.
FasterRCNN
(features, top_features, classes, box_features=None, short=600, max_size=1000, min_stage=4, max_stage=4, train_patterns=None, nms_thresh=0.3, nms_topk=400, post_nms=100, roi_mode='align', roi_size=(14, 14), strides=16, clip=None, rpn_channel=1024, base_size=16, scales=(8, 16, 32), ratios=(0.5, 1, 2), alloc_size=(128, 128), rpn_nms_thresh=0.7, rpn_train_pre_nms=12000, rpn_train_post_nms=2000, rpn_test_pre_nms=6000, rpn_test_post_nms=300, rpn_min_size=16, per_device_batch_size=1, num_sample=128, pos_iou_thresh=0.5, pos_ratio=0.25, max_num_gt=300, additional_output=False, force_nms=False, minimal_opset=False, **kwargs)[source]¶ Faster RCNN network.
- Parameters
features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
classes (iterable of str) – Names of categories, its length is num_class.
box_features (gluon.HybridBlock, default is None) – Feature head for transforming shared ROI output (top_features) for box prediction. If set to None, global average pooling will be used.
short (int, default is 600.) – Input image short side size.
max_size (int, default is 1000.) – Maximum size of input image long side.
min_stage (int, default is 4) – Minimum stage NO. for FPN stages.
max_stage (int, default is 4) – Maximum stage NO. for FPN stages.
train_patterns (str, default is None.) – Matching pattern for trainable parameters.
nms_thresh (float, default is 0.3.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
roi_mode (str, default is align) – ROI pooling mode. Currently support ‘pool’ and ‘align’.
roi_size (tuple of int, length 2, default is (14, 14)) – (height, width) of the ROI region.
strides (int/tuple of ints, default is 16) – Feature map stride with respect to original image. This is usually the ratio between original image size and feature map size. For FPN, use a tuple of ints.
clip (float, default is None) – Clip bounding box prediction to prevent exponentiation from overflowing.
rpn_channel (int, default is 1024) – Number of channels used in RPN convolutional layers.
base_size (int) – The width (and height) of the reference anchor box.
scales (iterable of float, default is (8, 16, 32)) –
The areas of anchor boxes. We use the following form to compute the shapes of anchors:
\[width_{anchor} = size_{base} \times scale \times \sqrt{1 / ratio}\]
\[height_{anchor} = size_{base} \times scale \times \sqrt{ratio}\]
ratios (iterable of float, default is (0.5, 1, 2)) – The aspect ratios of anchor boxes. We expect it to be a list or tuple.
alloc_size (tuple of int) – Allocate size for the anchor boxes as (H, W). Usually we generate enough anchors for large feature map, e.g. 128x128. Later in inference we can have variable input sizes, at which time we can crop corresponding anchors from this large anchor map so we can skip re-generating anchors for each input.
rpn_train_pre_nms (int, default is 12000) – Filter top proposals before NMS in training of RPN.
rpn_train_post_nms (int, default is 2000) – Return top proposal results after NMS in training of RPN. Will be set to rpn_train_pre_nms if it is larger than rpn_train_pre_nms.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 300) – Return top proposal results after NMS in testing of RPN. Will be set to rpn_test_pre_nms if it is larger than rpn_test_pre_nms.
rpn_nms_thresh (float, default is 0.7) – IOU threshold for NMS. It is used to remove overlapping proposals.
rpn_num_sample (int, default is 256) – Number of samples for RPN targets.
rpn_pos_iou_thresh (float, default is 0.7) – Anchors with IOU larger than pos_iou_thresh are regarded as positive samples.
rpn_neg_iou_thresh (float, default is 0.3) – Anchors with IOU smaller than neg_iou_thresh are regarded as negative samples. Anchors with IOU in between pos_iou_thresh and neg_iou_thresh are ignored.
rpn_pos_ratio (float, default is 0.5) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.
rpn_box_norm (array-like of size 4, default is (1., 1., 1., 1.)) – Std value to be divided from encoded values.
rpn_min_size (int, default is 16) – Proposals whose size is smaller than min_size will be discarded.
per_device_batch_size (int, default is 1) – Batch size for each device during training.
num_sample (int, default is 128) – Number of samples for RCNN targets.
pos_iou_thresh (float, default is 0.5) – Proposals whose IOU is larger than pos_iou_thresh are regarded as positive samples.
pos_ratio (float, default is 0.25) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.
max_num_gt (int, default is 300) – Maximum ground-truth number for each example. This is only an upper bound, not necessarily very precise. However, using a very big number may impact the training speed.
additional_output (boolean, default is False) – additional_output is only used for Mask R-CNN to get internal outputs.
force_nms (bool, default is False) – Apply NMS to all categories, this is to avoid overlapping detection results from different categories.
minimal_opset (bool, default is False) – We sometimes add special operators to accelerate training/inference, however, for exporting to third party compilers we want to utilize most widely used operators. If minimal_opset is True, the network will use a minimal set of operators good for e.g., TVM.
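Two pieces of arithmetic are implied by the parameter descriptions above: the post-NMS proposal count is capped by the pre-NMS count, and the number of positive samples is `pos_ratio * num_sample`. The sketch below spells them out with the documented defaults. Both helper names are hypothetical, and rounding the quota to the nearest integer is an assumption.

```python
def clamp_post_nms(pre_nms, post_nms):
    """post_nms is set to pre_nms if it is larger than pre_nms."""
    return min(pre_nms, post_nms)


def positive_quota(num_sample, pos_ratio):
    """Number of positive samples to draw (rounding is an assumption)."""
    return int(round(num_sample * pos_ratio))


# R-CNN head defaults: num_sample=128, pos_ratio=0.25.
rcnn_positives = positive_quota(128, 0.25)
# RPN defaults: rpn_num_sample=256, rpn_pos_ratio=0.5.
rpn_positives = positive_quota(256, 0.5)
# Training defaults: rpn_train_pre_nms=12000, rpn_train_post_nms=2000.
train_post = clamp_post_nms(12000, 2000)
# A post-NMS value above the pre-NMS value is reduced to the pre-NMS value.
oversized = clamp_post_nms(6000, 8000)
```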
-
classes
¶ Names of categories, its length is num_class.
- Type
iterable of str
-
nms_thresh
¶ Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
- Type
-
nms_topk
¶ Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
- Type
-
force_nms
¶ Apply NMS to all categories, this is to avoid overlapping detection results from different categories.
- Type
-
rpn_target_generator
¶ Generate training targets with cls_target, box_target, and box_mask.
- Type
gluon.Block
-
target_generator
¶ Generate training targets with boxes, samples, matches, gt_label and gt_box.
- Type
gluon.Block
-
hybrid_forward
(F, x, gt_box=None, gt_label=None)[source]¶ Forward Faster-RCNN network.
The behavior during training and inference is different.
- Parameters
- Returns
During inference, returns final class id, confidence scores, bounding boxes.
- Return type
(ids, scores, bboxes)
-
reset_class
(classes, reuse_weights=None)[source]¶ Reset class categories and class predictors.
- Parameters
classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
reuse_weights (dict) – A {new_integer : old_integer} mapping dict or a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the specified previously trained weights.
Example
>>> net = gluoncv.model_zoo.get_model('faster_rcnn_resnet50_v1b_coco', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the 14th category in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
-
property
target_generator
¶ Returns stored target generator
- Returns
The RCNN target generator
- Return type
mxnet.gluon.HybridBlock
-
class
gluoncv.model_zoo.
ForwardBackwardTask
(net, optimizer, rpn_cls_loss, rpn_box_loss, rcnn_cls_loss, rcnn_box_loss, rcnn_mask_loss, amp_enabled)[source]¶ Mask R-CNN training task that can be scheduled concurrently using Parallel.
- Parameters
net (gluon.HybridBlock) – Faster R-CNN network.
optimizer (gluon.Trainer) – Optimizer for the training.
rpn_cls_loss (gluon.loss) – RPN box classification loss.
rpn_box_loss (gluon.loss) – RPN box regression loss.
rcnn_cls_loss (gluon.loss) – R-CNN box head classification loss.
rcnn_box_loss (gluon.loss) – R-CNN box head regression loss.
rcnn_mask_loss (gluon.loss) – R-CNN mask head segmentation loss.
amp_enabled (bool) – Whether to enable Automatic Mixed Precision.
-
class
gluoncv.model_zoo.
GluonSSDMultiClassTracktor
(gpu_id=0, detector_thresh=0.5, model_name='', use_pretrained=False, param_path='', data_shape=512)[source]¶ Initiate a tracktor based on an object detector.
-
prepare_for_frame
(frame)[source]¶ This method should run anything that needs to happen before the motion prediction. It can prepare the detector or even run the backbone feature extractions. It can also provide data to motion prediction.
- Parameters
frame (the frame data, the same as in the detect_and_track method) –
- Returns
motion_predict_data
- Return type
optional data provided to motion prediction, if no data is provided, return None
-
-
class
gluoncv.model_zoo.
GoogLeNet
(classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, dropout_ratio=0.4, aux_logits=False, norm_kwargs=None, partial_bn=False, pretrained_base=True, ctx=None, **kwargs)[source]¶ GoogleNet model from the “Going Deeper with Convolutions” paper and the “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” paper.
- Parameters
classes (int, default 1000) – Number of classification classes.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
partial_bn (bool, default False) – Freeze all batch normalization layers during training except the first layer.
-
class
gluoncv.model_zoo.
HybridBlock
(prefix=None, params=None)[source]¶ HybridBlock supports forwarding with both Symbol and NDArray.
HybridBlock is similar to Block, with a few differences:
import mxnet as mx
from mxnet.gluon import HybridBlock, nn

class Model(HybridBlock):
    def __init__(self, **kwargs):
        super(Model, self).__init__(**kwargs)
        # use name_scope to give child Blocks appropriate names.
        with self.name_scope():
            self.dense0 = nn.Dense(20)
            self.dense1 = nn.Dense(20)

    def hybrid_forward(self, F, x):
        x = F.relu(self.dense0(x))
        return F.relu(self.dense1(x))

model = Model()
model.initialize(ctx=mx.cpu(0))
model.hybridize()
model(mx.nd.zeros((10, 10), ctx=mx.cpu(0)))
Forward computation in HybridBlock must be static to work with Symbols, i.e. you cannot call NDArray.asnumpy(), NDArray.shape, NDArray.dtype, NDArray indexing (x[i]) etc on tensors. Also, you cannot use branching or loop logic that bases on non-constant expressions like random numbers or intermediate results, since they change the graph structure for each iteration.
Before activating with hybridize(), HybridBlock works just like normal Block. After activation, HybridBlock will create a symbolic graph representing the forward computation and cache it. On subsequent forwards, the cached graph will be used instead of hybrid_forward().
Please see references for detailed tutorial.
References
Hybrid - Faster training and easy deployment
-
cast
(dtype)[source]¶ Cast this Block to use another data type.
- Parameters
dtype (str or numpy.dtype) – The new data type.
-
export
(path, epoch=0, remove_amp_cast=True)[source]¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports, mxnet.mod.Module or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
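The naming rule in the note can be written as a tiny helper. This is only an illustration of the rule as stated here; `exported_input_names` is a hypothetical function and not part of MXNet.

```python
def exported_input_names(num_inputs):
    """Input names as described in the note on export."""
    if num_inputs == 1:
        return ["data"]
    return ["data%d" % i for i in range(num_inputs)]


single = exported_input_names(1)
multiple = exported_input_names(3)
```

Code that later loads the exported symbol needs these names to bind its inputs.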
-
forward
(x, *args)[source]¶ Defines the forward computation. Arguments can be either NDArray or Symbol.
-
hybrid_forward
(F, x, *args, **kwargs)[source]¶ Overrides to construct symbolic graph for this Block.
- Parameters
x (Symbol or NDArray) – The first input tensor.
*args (list of Symbol or list of NDArray) – Additional input tensors.
-
hybridize
(active=True, backend=None, backend_opts=None, **kwargs)[source]¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
-
optimize_for
(x, *args, backend=None, backend_opts=None, **kwargs)[source]¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
-
-
class
gluoncv.model_zoo.
I3D_InceptionV1
(nclass=1000, pretrained=False, pretrained_base=True, num_segments=1, num_crop=1, feat_ext=False, dropout_ratio=0.5, init_std=0.01, partial_bn=False, ctx=None, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Inception v1 model from “Going Deeper with Convolutions” paper.
Inflated 3D model (I3D) from “Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset” paper. Slight differences between this implementation and the original implementation due to padding.
- Parameters
nclass (int) – Number of classes in the training dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
dropout_ratio (float, default is 0.5.) – The dropout rate of a dropout layer. Larger values regularize more strongly against overfitting.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
init_std (float, default is 0.01.) – Standard deviation used to initialize the dense layers.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
I3D_InceptionV3
(nclass=1000, pretrained=False, pretrained_base=True, num_segments=1, num_crop=1, feat_ext=False, dropout_ratio=0.5, init_std=0.01, partial_bn=False, ctx=None, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
Inflated 3D model (I3D) from “Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset” paper.
This model definition file is written by Brais and modified by Yi.
- Parameters
nclass (int) – Number of classes in the training dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
dropout_ratio (float, default is 0.5.) – The dropout rate of a dropout layer. Larger values regularize more strongly against overfitting.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
init_std (float, default is 0.01.) – Standard deviation used to initialize the dense layers.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
I3D_ResNetV1
(nclass, depth, num_stages=4, pretrained=False, pretrained_base=True, feat_ext=False, num_segments=1, num_crop=1, spatial_strides=(1, 2, 2, 2), temporal_strides=(1, 1, 1, 1), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3), conv1_kernel_t=5, conv1_stride_t=2, pool1_kernel_t=1, pool1_stride_t=2, inflate_freq=(1, 1, 1, 1), inflate_stride=(1, 1, 1, 1), inflate_style='3x1x1', nonlocal_stages=(-1, ), nonlocal_freq=(0, 1, 1, 0), nonlocal_cfg=None, bn_eval=True, bn_frozen=False, partial_bn=False, frozen_stages=-1, dropout_ratio=0.5, init_std=0.01, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, ctx=None, **kwargs)[source]¶ ResNet_I3D backbone. Inflated 3D model (I3D) from “Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset” paper.
- Parameters
nclass (int.) – Number of categories in the dataset.
depth (int, default is 50.) – Depth of ResNet, from {18, 34, 50, 101, 152}.
num_stages (int, default is 4.) – Number of stages in a ResNet.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
spatial_strides (tuple of int.) – Strides in the spatial dimension of the first block of each stage.
temporal_strides (tuple of int.) – Strides in the temporal dimension of the first block of each stage.
dilations (tuple of int.) – Dilation ratio of each stage.
out_indices (tuple of int.) – Collect features from the selected stages of ResNet, usually used for feature extraction or auxiliary loss.
conv1_kernel_t (int, default is 5.) – The kernel size of first convolutional layer in a ResNet.
conv1_stride_t (int, default is 2.) – The stride of first convolutional layer in a ResNet.
pool1_kernel_t (int, default is 1.) – The kernel size of first pooling layer in a ResNet.
pool1_stride_t (int, default is 2.) – The stride of first pooling layer in a ResNet.
inflate_freq (tuple of int.) – Select which 2D convolutional layers to be inflated to 3D convolutional layers in each stage.
inflate_stride (tuple of int.) – The stride for inflated layers in each stage.
inflate_style (str, default is '3x1x1'.) – How to inflate a 2D kernel, either ‘3x1x1’ or ‘1x3x3’.
nonlocal_stages (tuple of int.) – Select which stage we need non-local blocks.
nonlocal_freq (tuple of int.) – Select where to insert non-local blocks in each stage.
nonlocal_cfg (dict.) – Additional non-local arguments, for example nonlocal_type=’gaussian’.
bn_eval (bool.) – Whether to set BN layers to eval mode, namely, freeze running stats (mean and var).
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
frozen_stages (int.) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.
dropout_ratio (float, default is 0.5.) – The dropout rate of a dropout layer. Larger values regularize more strongly against overfitting.
init_std (float, default is 0.01.) – Standard deviation used to initialize the dense layers.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
-
class
gluoncv.model_zoo.
Inception3
(classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, partial_bn=False, **kwargs)[source]¶ Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
- Parameters
classes (int, default 1000) – Number of classification classes.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
MaskRCNN
(features, top_features, classes, mask_channels=256, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, target_roi_scale=1, num_fcn_convs=0, norm_layer=None, norm_kwargs=None, **kwargs)[source]¶ Mask RCNN network.
- Parameters
features (gluon.HybridBlock) – Base feature extractor before feature pooling layer.
top_features (gluon.HybridBlock) – Tail feature extractor after feature pooling layer.
classes (iterable of str) – Names of categories; its length is num_class.
mask_channels (int, default is 256) – Number of channels in mask prediction
rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN. Upper bounded by min of rpn_test_pre_nms and rpn_test_post_nms.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN. Will be set to rpn_test_pre_nms if it is larger than rpn_test_pre_nms.
target_roi_scale (int, default 1) – Ratio of mask output roi / input roi. For model with FPN, this is typically 2.
num_fcn_convs (int, default 0) – number of convolution blocks before deconv layer. For FPN network this is typically 4.
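The interaction between the three test-time limits described above can be sketched in plain Python. `effective_test_limits` is a hypothetical helper for illustration, not part of GluonCV; it only mirrors the clamping rules stated in the parameter descriptions.

```python
def effective_test_limits(rcnn_max_dets=1000,
                          rpn_test_pre_nms=6000,
                          rpn_test_post_nms=1000):
    """Return (post_nms, max_dets) after applying the documented bounds."""
    # rpn_test_post_nms is reduced to rpn_test_pre_nms if it is larger.
    post_nms = min(rpn_test_post_nms, rpn_test_pre_nms)
    # rcnn_max_dets is upper bounded by the min of the two RPN limits.
    max_dets = min(rcnn_max_dets, rpn_test_pre_nms, post_nms)
    return post_nms, max_dets


print(effective_test_limits())
print(effective_test_limits(rcnn_max_dets=2000, rpn_test_pre_nms=300))
```

With the default arguments neither limit is changed; lowering `rpn_test_pre_nms` below the other two values pulls both of them down.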
-
hybrid_forward
(F, x, gt_box=None, gt_label=None)[source]¶ Forward Mask RCNN network.
The behavior during training and inference is different.
- Parameters
- Returns
During inference, returns final class id, confidence scores, bounding boxes, segmentation masks.
- Return type
(ids, scores, bboxes, masks)
-
reset_class
(classes, reuse_weights=None)[source]¶ Reset class categories and class predictors.
- Parameters
classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
reuse_weights (dict) – A {new_integer : old_integer} mapping dict, a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.
Example
>>> net = gluoncv.model_zoo.get_model('mask_rcnn_resnet50_v1b_voc', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the first category in COCO
>>> net.reset_class(classes=['person'], reuse_weights={0:0})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':0})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
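To make the accepted forms of reuse_weights concrete, the sketch below resolves each form into a {new_index : old_index} dictionary. This is an illustration in plain Python under stated assumptions, not GluonCV's implementation: `resolve_reuse_weights` and the class lists are hypothetical.

```python
def resolve_reuse_weights(new_classes, old_classes, reuse_weights):
    """Map new class indices to the old class indices whose weights are reused."""
    if isinstance(reuse_weights, (list, tuple)):
        # A list of names means the names are unchanged between old and new.
        reuse_weights = {name: name for name in reuse_weights}
    resolved = {}
    for new_key, old_key in reuse_weights.items():
        # Names are looked up in the class lists; integers are used directly.
        new_idx = new_classes.index(new_key) if isinstance(new_key, str) else new_key
        old_idx = old_classes.index(old_key) if isinstance(old_key, str) else old_key
        resolved[new_idx] = old_idx
    return resolved


old = ["person", "bicycle", "car"]   # hypothetical original categories
new = ["car", "person"]              # hypothetical new categories
print(resolve_reuse_weights(new, old, {"person": "person", "car": 2}))
print(resolve_reuse_weights(new, old, ["car", "person"]))
```

Normalizing every form to integer pairs first keeps the later weight copy independent of how the caller expressed the mapping.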
-
class
gluoncv.model_zoo.
MobileNet
(multiplier=1.0, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.
- Parameters
multiplier (float, default 1.0) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
classes (int, default 1000) – Number of classes for the output layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
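The effect of the width multiplier can be sketched as follows. The base channel list is an illustrative example, and truncating with `int()` is an assumption about the rounding; `scaled_channels` is a hypothetical helper, not a GluonCV function.

```python
def scaled_channels(base_channels, multiplier=1.0):
    """Scale each layer width by the multiplier, as described for MobileNet."""
    if multiplier < 0.25:
        # The documentation states that only multipliers >= 0.25 are supported.
        raise ValueError("multiplier must be no less than 0.25")
    # Assumption: the scaled width is truncated to an integer.
    return [int(c * multiplier) for c in base_channels]


base = [32, 64, 128, 256, 512, 1024]  # illustrative widths only
print(scaled_channels(base, 1.0))
print(scaled_channels(base, 0.5))
```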
-
class
gluoncv.model_zoo.
MobileNetV2
(multiplier=1.0, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper. https://arxiv.org/abs/1801.04381
- Parameters
multiplier (float, default 1.0) – The width multiplier for controlling the model size. The actual number of channels is equal to the original channel size multiplied by this multiplier.
classes (int, default 1000) – Number of classes for the output layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
MobilePose
(base_name, base_attrs=('features'), num_joints=17, pretrained_base=False, pretrained_ctx=cpu(0), **kwargs)[source]¶ Pose Estimation for Mobile Device
-
class
gluoncv.model_zoo.
MonoDepth2
(backbone, pretrained_base, num_input_images=1, scales=range(0, 4), num_output_channels=1, use_skips=True, ctx=cpu(0), **kwargs)[source]¶ Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’ or ‘resnet152’).
pretrained_base (bool or str) – Whether the backbone is pretrained. If True, the weights of a model trained on ImageNet are loaded.
num_input_images (int) – The number of input sequences. 1 for depth encoder, larger than 1 for pose encoder. (Default: 1)
scales (list) – The scales used in the loss. (Default: range(4))
num_output_channels (int) – The number of output channels. (Default: 1)
use_skips (bool) – This will use skip architecture in the network. (Default: True)
Reference – Clement Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow. “Digging Into Self-Supervised Monocular Depth Estimation.” ICCV, 2019
Examples
>>> model = MonoDepth2(backbone='resnet18', pretrained_base=True)
>>> print(model)
-
class
gluoncv.model_zoo.
MonoDepth2PoseNet
(backbone, pretrained_base, num_input_images=2, num_input_features=1, num_frames_to_predict_for=2, stride=1, ctx=cpu(0), **kwargs)[source]¶ Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’ or ‘resnet152’).
pretrained_base (bool or str) – Whether the backbone is pretrained. If True, the weights of a model trained on ImageNet are loaded.
num_input_images (int) – The number of input sequences. 1 for depth encoder, larger than 1 for pose encoder. (Default: 2)
num_input_features (int) – The number of input feature maps from posenet encoder. (Default: 1)
num_frames_to_predict_for (int) – The number of output pose between frames; If None, it equals num_input_features - 1. (Default: 2)
stride (int) – The stride number for Conv in pose decoder. (Default: 1)
Reference – Clement Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow. “Digging Into Self-Supervised Monocular Depth Estimation.” ICCV, 2019
Examples
>>> model = MonoDepth2PoseNet(backbone='resnet18', pretrained_base=True)
>>> print(model)
-
class
gluoncv.model_zoo.
P3D
(nclass, block, layers, shortcut_type='B', block_design=('A', 'B', 'C'), dropout_ratio=0.5, num_segments=1, num_crop=1, feat_ext=False, init_std=0.001, ctx=None, partial_bn=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ The Pseudo 3D network (P3D). Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks. ICCV, 2017. https://arxiv.org/abs/1711.10305
- Parameters
nclass (int) – Number of classes in the training dataset.
block (Block, default is Bottleneck.) – Class for the residual block.
layers (list of int) – Numbers of layers in each block
block_design (tuple of str.) – Different designs for each block, from ‘A’, ‘B’ or ‘C’.
dropout_ratio (float, default is 0.5.) – The dropout rate of a dropout layer. Larger values regularize more strongly against overfitting.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
init_std (float, default is 0.001.) – Standard deviation used to initialize the dense layers.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
PSPNet
(nclass, backbone='resnet50', aux=True, ctx=cpu(0), pretrained_base=True, base_size=520, crop_size=480, **kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
nclass (int) – Number of categories for the training dataset.
backbone (string) – Pre-trained dilated backbone network type (default: ‘resnet50’; options are ‘resnet50’, ‘resnet101’ or ‘resnet152’).
norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
aux (bool) – Auxiliary loss.
Reference:
Zhao, Hengshuang, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. “Pyramid scene parsing network.” CVPR, 2017
-
class
gluoncv.model_zoo.
PoseDecoder
(num_ch_enc, num_input_features, num_frames_to_predict_for=2, stride=1)[source]¶ Decoder of Monodepth2 PoseNet
- Parameters
num_ch_enc (list) – The channels number of encoder.
num_input_features (int) – The number of input sequences. 1 for depth encoder, larger than 1 for pose encoder. (Default: 2)
num_frames_to_predict_for (int) – The number of output pose between frames; If None, it equals num_input_features - 1. (Default: 2)
stride (int) – The stride number for Conv in pose decoder. (Default: 1)
-
class
gluoncv.model_zoo.
R2Plus1D
(nclass, block, layers, dropout_ratio=0.5, num_segments=1, num_crop=1, feat_ext=False, init_std=0.001, ctx=None, partial_bn=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ The R2+1D network. A Closer Look at Spatiotemporal Convolutions for Action Recognition. CVPR, 2018. https://arxiv.org/abs/1711.11248
- Parameters
nclass (int) – Number of classes in the training dataset.
block (Block, default is Bottleneck.) – Class for the residual block.
layers (list of int) – Numbers of layers in each block
dropout_ratio (float, default is 0.5.) – The dropout rate of a dropout layer. Larger values regularize more strongly against overfitting.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
init_std (float, default is 0.001.) – Standard deviation used to initialize the dense layers.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
RCNNTargetGenerator
(num_class, max_pos=128, per_device_batch_size=1, means=(0.0, 0.0, 0.0, 0.0), stds=(0.1, 0.1, 0.2, 0.2))[source]¶ RCNN target encoder to generate matching target and regression target values.
- Parameters
num_class (int) – Total number of positive classes.
max_pos (int, default is 128) – Upper bound on the number of positive samples.
per_device_batch_size (int, default is 1) – Per device batch size
means (iterable of float, default is (0., 0., 0., 0.)) – Mean values to be subtracted from regression targets.
stds (iterable of float, default is (0.1, 0.1, 0.2, 0.2)) – Standard deviations by which regression targets are divided.
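How means and stds enter the regression targets can be sketched for a single proposal. The sketch assumes the common R-CNN center/size encoding of a corner-format box pair; that encoding and the helper name `encode_box_target` are assumptions for illustration, and only the final subtract-and-divide step is taken from the parameter descriptions. The default values follow the class signature.

```python
import math


def encode_box_target(roi, gt,
                      means=(0.0, 0.0, 0.0, 0.0),
                      stds=(0.1, 0.1, 0.2, 0.2)):
    """Encode a ground-truth box relative to a proposal, then normalize."""
    rx1, ry1, rx2, ry2 = roi
    gx1, gy1, gx2, gy2 = gt
    # Convert corner coordinates to width/height and center.
    rw, rh = rx2 - rx1, ry2 - ry1
    gw, gh = gx2 - gx1, gy2 - gy1
    rcx, rcy = rx1 + 0.5 * rw, ry1 + 0.5 * rh
    gcx, gcy = gx1 + 0.5 * gw, gy1 + 0.5 * gh
    # Assumed raw targets: relative center shift and log size ratio.
    raw = ((gcx - rcx) / rw, (gcy - rcy) / rh,
           math.log(gw / rw), math.log(gh / rh))
    # Documented step: subtract the means, then divide by the stds.
    return tuple((t - m) / s for t, m, s in zip(raw, means, stds))


print(encode_box_target((0, 0, 10, 10), (1, 0, 11, 10)))
```

Dividing by small stds such as 0.1 and 0.2 enlarges the targets, which is the usual motivation for this normalization.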
-
hybrid_forward
(F, roi, samples, matches, gt_label, gt_box)[source]¶ Components can handle batch images
- Parameters
roi ((B, N, 4), input proposals) –
samples ((B, N), value +1: positive / -1: negative.) –
matches ((B, N), value [0, M), index to gt_label and gt_box.) –
gt_label ((B, M), value [0, num_class), excluding background class.) –
gt_box ((B, M, 4), input ground truth box corner coordinates.) –
- Returns
cls_target ((B, N), value [0, num_class + 1), including background.)
box_target ((B, N, C, 4), only foreground class has nonzero target.)
box_weight ((B, N, C, 4), only foreground class has nonzero weight.)
-
class
gluoncv.model_zoo.
RCNNTargetSampler
(num_image, num_proposal, num_sample, pos_iou_thresh, pos_ratio, max_num_gt)[source]¶ A sampler to choose positive/negative samples from RCNN Proposals
- Parameters
num_image (int) – Number of input images.
num_proposal (int) – Number of input proposals.
num_sample (int) – Number of samples for RCNN targets.
pos_iou_thresh (float) – Proposals whose IOU is larger than pos_iou_thresh are regarded as positive samples. Proposals whose IOU is smaller than pos_iou_thresh are regarded as negative samples.
pos_ratio (float) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.
max_num_gt (int) – Maximum ground-truth number for each example. This is only an upper bound, not necessarily very precise. However, using a very big number may impact the training speed.
-
hybrid_forward
(F, rois, scores, gt_boxes)[source]¶ Handle B=self._num_image by a for loop.
- Parameters
rois ((B, self._num_proposal, 4) encoded in (x1, y1, x2, y2)) –
scores ((B, self._num_proposal, 1), value range [0, 1] with ignore value -1.) –
gt_boxes ((B, M, 4) encoded in (x1, y1, x2, y2), invalid box should have area of 0.) –
- Returns
rois ((B, self._num_sample, 4), randomly drawn from proposals)
samples ((B, self._num_sample), value +1: positive / 0: ignore / -1: negative.)
matches ((B, self._num_sample), value between [0, M))
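The sampling rule can be sketched for one image in plain Python. This is a simplified illustration, not the GluonCV implementation: `sample_rcnn_targets` is hypothetical, `ious` is assumed to hold each proposal's best IOU with any ground-truth box, ties at the threshold are treated as positive, and ignored samples (value 0) are left out.

```python
import random


def sample_rcnn_targets(ious, num_sample, pos_iou_thresh, pos_ratio, seed=0):
    """Return (proposal_index, label) pairs: +1 positive, -1 negative."""
    rng = random.Random(seed)
    pos = [i for i, v in enumerate(ious) if v >= pos_iou_thresh]
    neg = [i for i, v in enumerate(ious) if v < pos_iou_thresh]
    # At most pos_ratio * num_sample positives are drawn.
    max_pos = int(round(pos_ratio * num_sample))
    num_pos = min(max_pos, len(pos))
    # The remaining slots are filled with negatives when enough exist.
    num_neg = min(num_sample - num_pos, len(neg))
    chosen = [(i, 1) for i in rng.sample(pos, num_pos)]
    chosen += [(i, -1) for i in rng.sample(neg, num_neg)]
    return chosen


ious = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.3, 0.0]
print(sample_rcnn_targets(ious, num_sample=4, pos_iou_thresh=0.5, pos_ratio=0.25))
```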
-
class
gluoncv.model_zoo.
ResNeSt
(block, layers, cardinality=1, bottleneck_width=64, classes=1000, dilated=False, dilation=1, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, last_gamma=False, deep_stem=False, stem_width=32, avg_down=False, final_drop=0.0, use_global_stats=False, name_prefix='', dropblock_prob=0, input_size=224, use_splat=False, radix=2, avd=False, avd_first=False, split_drop_ratio=0)[source]¶ ResNeSt Model.
- Parameters
block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
layers (list of int) – Numbers of layers in each block
classes (int, default 1000) – Number of classification classes.
dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
deep_stem (bool, default False) – Whether to replace the 7x7 conv1 with 3 3x3 convolution layers.
avg_down (bool, default False) – Whether to use average pooling for projection skip connection between stages/downsample.
final_drop (float, default 0.0) – Dropout ratio before the final classification layer.
use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
Reference:
He, Kaiming, et al. “Deep residual learning for image recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
-
class
gluoncv.model_zoo.
ResNetV1
(block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
block (HybridBlock) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
layers (list of int) – Numbers of layers in each block
channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
classes (int, default 1000) – Number of classification classes.
thumbnail (bool, default False) – Enable thumbnail.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
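The relationship between layers and channels can be sketched with a small consistency check. It assumes channels[0] is the stem width and channels[i + 1] is the output width of stage i, and it uses the familiar ResNet-18 configuration as sample input; `check_resnet_spec` is a hypothetical helper, not part of GluonCV.

```python
def check_resnet_spec(layers, channels):
    """Validate the lengths and return (num_layers, in_ch, out_ch) per stage."""
    if len(channels) != len(layers) + 1:
        raise ValueError("channels must have exactly one more entry than layers")
    # Assumption: stage i maps channels[i] inputs to channels[i + 1] outputs.
    return list(zip(layers, channels[:-1], channels[1:]))


# ResNet-18 style configuration: four stages of two basic blocks each.
for stage in check_resnet_spec([2, 2, 2, 2], [64, 64, 128, 256, 512]):
    print(stage)
```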
-
class
gluoncv.model_zoo.
ResNetV1b
(block, layers, classes=1000, dilated=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, last_gamma=False, deep_stem=False, stem_width=32, avg_down=False, final_drop=0.0, use_global_stats=False, name_prefix='', **kwargs)[source]¶ Pre-trained ResNetV1b Model, which produces stride-8 feature maps at conv5.
- Parameters
block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
layers (list of int) – Numbers of layers in each block
classes (int, default 1000) – Number of classification classes.
dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
deep_stem (bool, default False) – Whether to replace the 7x7 conv1 with 3 3x3 convolution layers.
avg_down (bool, default False) – Whether to use average pooling for projection skip connection between stages/downsample.
final_drop (float, default 0.0) – Dropout ratio before the final classification layer.
use_global_stats (bool, default False) – Whether forcing BatchNorm to use global statistics instead of minibatch statistics; optionally set to True if finetuning using ImageNet classification pretrained models.
Reference:
He, Kaiming, et al. “Deep residual learning for image recognition.”
Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
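The stride-8 claim for the dilated variant can be checked with simple arithmetic. The sketch assumes the standard ResNet layout (a stem that downsamples by 4, then four stages with strides 1, 2, 2, 2) and that dilation replaces the stride of the last two stages with 1; `output_stride` is a hypothetical helper, not a GluonCV function.

```python
def output_stride(dilated=False, stem_stride=4, stage_strides=(1, 2, 2, 2)):
    """Total downsampling factor between the input and the conv5 feature map."""
    strides = list(stage_strides)
    if dilated:
        # Assumption: the last two stages keep resolution and dilate instead.
        strides[2] = 1
        strides[3] = 1
    total = stem_stride
    for s in strides:
        total *= s
    return total


print(output_stride(dilated=False))  # plain classification backbone
print(output_stride(dilated=True))   # dilated backbone for segmentation
print(480 // output_stride(dilated=True))  # feature map side for a 480-pixel crop
```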
-
class
gluoncv.model_zoo.
ResNetV2
(block, layers, channels, classes=1000, thumbnail=False, last_gamma=False, use_se=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
block (HybridBlock) – Class for the residual block. Options are BasicBlockV2, BottleneckV2.
layers (list of int) – Numbers of layers in each block
channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
classes (int, default 1000) – Number of classification classes.
thumbnail (bool, default False) – Enable thumbnail.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
ResNet_SlowFast
(num_classes, depth, pretrained=None, pretrained_base=True, feat_ext=False, num_segments=1, num_crop=1, num_stages=4, spatial_strides=(1, 2, 2, 2), temporal_strides=(1, 1, 1, 1), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3), conv1_kernel_t=1, conv1_stride_t=1, pool1_kernel_t=1, pool1_stride_t=1, frozen_stages=-1, inflate_freq=(0, 0, 1, 1), inflate_stride=(1, 1, 1, 1), inflate_style='3x1x1', nonlocal_stages=(-1, ), nonlocal_freq=(0, 0, 0, 0), nonlocal_cfg=None, bn_eval=False, bn_frozen=False, partial_bn=False, dropout_ratio=0.5, init_std=0.01, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, ctx=None, **kwargs)[source]¶ ResNe(x)t_SlowFast backbone.
- Parameters
depth (int) – Depth of resnet, from {50, 101, 152}.
num_stages (int) – Resnet stages, normally 4.
strides (Sequence[int]) – Strides of the first block of each stage.
dilations (Sequence[int]) – Dilation of each stage.
out_indices (Sequence[int]) – Output from which stages.
frozen_stages (int) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.
-
class
gluoncv.model_zoo.
ResNext
(layers, cardinality, bottleneck_width, classes=1000, last_gamma=False, use_se=False, deep_stem=False, avg_down=False, stem_width=64, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ ResNext model from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
layers (list of int) – Numbers of layers in each block
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
classes (int, default 1000) – Number of classification classes.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
deep_stem (bool, default False) – Whether to replace the 7x7 conv1 with 3 3x3 convolution layers.
stem_width (int, default 64) – Width of the stem intermediate layer.
avg_down (bool, default False) – Whether to use average pooling for projection skip connection between stages/downsample.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
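How cardinality and bottleneck_width combine can be sketched with the convention from the ResNeXt paper, where the width of the grouped convolution grows with the stage's base channel count. The exact formula and rounding used by the library are assumptions here, and `grouped_conv_width` is a hypothetical helper.

```python
import math


def grouped_conv_width(channels, cardinality, bottleneck_width):
    """Total width of the grouped 3x3 convolution in one bottleneck block."""
    # Assumed convention: per-group width scales with channels / 64.
    per_group = int(math.floor(channels * (bottleneck_width / 64)))
    return cardinality * per_group


# A "32x4d" configuration: 32 groups, bottleneck width 4.
print(grouped_conv_width(64, cardinality=32, bottleneck_width=4))
print(grouped_conv_width(128, cardinality=32, bottleneck_width=4))
```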
-
class
gluoncv.model_zoo.
ResidualAttentionModel
(scale, m, classes=1000, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ AttentionModel model from “Residual Attention Network for Image Classification” paper. Input size is 224 x 224.
- Parameters
scale (tuple) – Network scale p, t, r.
m (tuple) – Network scale m. Network scale is defined as 36m + 20. Normally m is a tuple of (m-1, m, m+1), except for m == 1, where it is (1, 1, 1).
classes (int, default 1000) – Number of classification classes.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
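The 36m + 20 rule and the (m-1, m, m+1) convention described above can be checked with a few lines of plain Python. This is only an illustrative sketch: the helper names `attention_scale` and `stage_tuple` are hypothetical and are not part of GluonCV.

```python
def attention_scale(m):
    """Network scale implied by m, using the 36m + 20 rule quoted above."""
    return 36 * m + 20


def stage_tuple(m):
    """Per-stage tuple (m-1, m, m+1), with the m == 1 special case (1, 1, 1)."""
    return (1, 1, 1) if m == 1 else (m - 1, m, m + 1)


# Print the scale and the tuple for the first few values of m.
for m in (1, 2, 3):
    print(m, attention_scale(m), stage_tuple(m))
```

Under this rule m = 1 gives a scale of 56 and m = 2 gives 92.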
-
class
gluoncv.model_zoo.
ResnetEncoder
(backbone, pretrained, num_input_images=1, root='/root/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Encoder of Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’ or ‘resnet152’).
pretrained (bool or str) – Whether the backbone is pretrained. If True, the weights of a model trained on ImageNet are loaded.
num_input_images (int) – The number of input sequences. 1 for depth encoder, larger than 1 for pose encoder. (Default: 1)
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
-
class
gluoncv.model_zoo.
SE_BasicBlockV1
(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 18, 34 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
SE_BasicBlockV2
(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 18, 34 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
SE_BottleneckV1
(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for SE_ResNet V1 for 50, 101, 152 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
SE_BottleneckV2
(channels, stride, downsample=False, in_channels=0, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for SE_ResNet V2 for 50, 101, 152 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
SE_ResNetV1
(block, layers, channels, classes=1000, thumbnail=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
block (HybridBlock) – Class for the residual block. Options are SE_BasicBlockV1, SE_BottleneckV1.
layers (list of int) – Numbers of layers in each block
channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
classes (int, default 1000) – Number of classification classes.
thumbnail (bool, default False) – Enable thumbnail.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
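The constraint that channels is one element longer than layers can be illustrated as follows. The sketch assumes that channels[0] is the stem width and that stage i maps channels[i] to channels[i+1]; the helper `check_resnet_spec` is hypothetical, and the ResNet-50-style values are used purely as an example.

```python
def check_resnet_spec(layers, channels):
    """Validate a (layers, channels) pair and list (in, out, num_blocks) per stage.

    Assumption for illustration: channels[0] is the stem output width and
    stage i goes from channels[i] to channels[i + 1] using layers[i] blocks.
    """
    if len(channels) != len(layers) + 1:
        raise ValueError("channels must have exactly one more entry than layers")
    return list(zip(channels[:-1], channels[1:], layers))


# A ResNet-50-style configuration: four stages, five channel entries.
stages = check_resnet_spec([3, 4, 6, 3], [64, 256, 512, 1024, 2048])
for in_ch, out_ch, num_blocks in stages:
    print(in_ch, "->", out_ch, "x", num_blocks)
```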
-
class
gluoncv.model_zoo.
SE_ResNetV2
(block, layers, channels, classes=1000, thumbnail=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
block (HybridBlock) – Class for the residual block. Options are SE_BasicBlockV2, SE_BottleneckV2.
layers (list of int) – Numbers of layers in each block
channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
classes (int, default 1000) – Number of classification classes.
thumbnail (bool, default False) – Enable thumbnail.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
class
gluoncv.model_zoo.
SMOTTracker
(motion_model='no', anchor_array=None, use_motion=True, tracking_classes=[], match_top_k=10, track_keep_alive_thresh=0.1, new_track_iou_thresh=0.3, track_nms_thresh=0.5, gpu_id=0, anchor_assignment_method='iou', joint_linking=False, tracktor=None)[source]¶ Implementation of the SMOT tracker. The steps to use the tracker are:
0. Set anchors from the SSD.
1. First call tracker.predict(new_frame).
2. Then get the tracking anchor information.
3. Run the detector with the tracking anchor information.
4. Run tracker.update(new_detection, track_info).
-
class
gluoncv.model_zoo.
SSD
(network, base_size, features, num_filters, sizes, ratios, steps, classes, use_1x1_transition=True, use_bn=True, reduce_ratio=1.0, min_depth=128, global_pool=False, pretrained=False, stds=(0.1, 0.1, 0.2, 0.2), nms_thresh=0.45, nms_topk=400, post_nms=100, anchor_alloc_size=128, ctx=cpu(0), norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, root='~/.mxnet/models', minimal_opset=False, predictors_kernel=(3, 3), predictors_pad=(1, 1), anchor_generator=<class 'gluoncv.model_zoo.ssd.anchor.SSDAnchorGenerator'>, **kwargs)[source]¶ Single-shot Object Detection Network: https://arxiv.org/abs/1512.02325.
- Parameters
network (string or None) – Name of the base network, if None is used, will instantiate the base network from features directly instead of composing.
base_size (int) – Base input size, it is specified so SSD can support dynamic input shapes.
features (list of str or mxnet.gluon.HybridBlock) – Intermediate features to be extracted or a network with multi-output. If network is None, features is expected to be a multi-output network.
num_filters (list of int) – Number of channels for the appended layers, ignored if network is None.
sizes (iterable of float) – Sizes of anchor boxes, this should be a list of floats, in incremental order. The length of sizes must be len(layers) + 1. For example, a two stage SSD model can have sizes = [30, 60, 90], and it converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to original paper.
ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must be equal to the number of SSD output layers.
steps (list of int) – Step size of anchor boxes in each output layer.
classes (iterable of str) – Names of all categories.
use_1x1_transition (bool) – Whether to use 1x1 convolution as transition layer between attached layers, which is effective at reducing model capacity.
use_bn (bool) – Whether to use BatchNorm layer after each attached convolutional layer.
reduce_ratio (float) – Channel reduce ratio (0, 1) of the transition layer.
min_depth (int) – Minimum channels for the transition layers.
global_pool (bool) – Whether to attach a global average pooling layer as the last output layer.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
stds (tuple of float, default is (0.1, 0.1, 0.2, 0.2)) – Std values to be divided/multiplied to box encoded values.
nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
anchor_alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define anchor_alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, we support arbitrary input images by cropping the corresponding area of the anchor map. This allows us to export to symbol so we can run it in C++, Scala, etc.
ctx (mx.Context) – Network context.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm. This only applies to base networks that have norm_layer specified; it is ignored if the base network (e.g. VGG) doesn’t accept this argument.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
root (str) – The root path for model storage, default is ‘~/.mxnet/models’
minimal_opset (bool) – We sometimes add special operators to accelerate training/inference, however, for exporting to third party compilers we want to utilize most widely used operators. If minimal_opset is True, the network will use a minimal set of operators good for e.g., TVM.
predictors_kernel (tuple of int, default is (3, 3)) – Dimension of predictor kernel.
predictors_pad (tuple of int, default is (1, 1)) – Padding of the predictor kernel conv.
anchor_generator (default is SSDAnchorGenerator) – Anchor generator to be used. The default is SSDAnchorGenerator, corresponding to the published SSD article. This argument can be used for other custom anchor generators, such as LiteAnchorGenerator.
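The way the sizes list is turned into one (min, max) pair per output layer, as in the [30, 60, 90] example above, can be sketched in plain Python. The helper `pair_sizes` is a hypothetical name used only for illustration and does not refer to a GluonCV function.

```python
def pair_sizes(sizes):
    """Turn an increasing list of anchor sizes into consecutive (min, max) pairs.

    A list of length len(layers) + 1 yields exactly len(layers) pairs.
    """
    if any(b <= a for a, b in zip(sizes, sizes[1:])):
        raise ValueError("sizes must be in incremental order")
    return list(zip(sizes[:-1], sizes[1:]))


# Two output layers need three sizes.
print(pair_sizes([30, 60, 90]))
```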
-
property
num_classes
¶ Return number of foreground classes.
- Returns
Number of foreground classes
- Return type
-
reset_class
(classes, reuse_weights=None)[source]¶ Reset class categories and class predictors.
- Parameters
classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
reuse_weights (dict) – A {new_integer : old_integer} mapping dict, a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.
Example
>>> net = gluoncv.model_zoo.get_model('ssd_512_resnet50_v1_voc', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the 14th category in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
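The four reuse_weights forms in the example are equivalent ways of stating which old class index feeds which new class index. The sketch below shows one way such a mapping could be normalized. It is not GluonCV's implementation: `resolve_reuse_weights` is a hypothetical helper, and the class list assumes the usual alphabetical VOC ordering.

```python
# Assumed alphabetical VOC class order; 'person' sits at index 14.
VOC_CLASSES = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
               'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train',
               'tvmonitor']


def resolve_reuse_weights(new_classes, old_classes, reuse_weights):
    """Normalize any accepted reuse_weights form to {new_index: old_index}."""
    if isinstance(reuse_weights, (list, tuple)):
        # A plain list of names means "same name in old and new class lists".
        reuse_weights = {name: name for name in reuse_weights}
    resolved = {}
    for new, old in reuse_weights.items():
        new_idx = new_classes.index(new) if isinstance(new, str) else new
        old_idx = old_classes.index(old) if isinstance(old, str) else old
        resolved[new_idx] = old_idx
    return resolved


new = ['person']
print(resolve_reuse_weights(new, VOC_CLASSES, {'person': 'person'}))
print(resolve_reuse_weights(new, VOC_CLASSES, {'person': 14}))
```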
-
set_nms
(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]¶ Set non-maximum suppression parameters.
- Parameters
nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
- Returns
- Return type
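How nms_thresh, nms_topk and post_nms interact can be illustrated with a simplified, class-agnostic greedy NMS in plain Python. This is a sketch under stated assumptions, not the operator GluonCV calls: the functions `iou` and `nms_sketch` are hypothetical, and boxes outside the top k are simply dropped here.

```python
def iou(a, b):
    """IOU of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(dets, nms_thresh=0.45, nms_topk=400, post_nms=100):
    """dets: list of (score, xmin, ymin, xmax, ymax) tuples."""
    dets = sorted(dets, key=lambda d: d[0], reverse=True)
    if nms_topk > 0:
        dets = dets[:nms_topk]          # only the top-k scores enter NMS
    if 0 <= nms_thresh <= 1:
        kept = []
        for d in dets:
            # keep a box only if it does not overlap a kept box too much
            if all(iou(d[1:], k[1:]) <= nms_thresh for k in kept):
                kept.append(d)
    else:
        kept = dets                     # threshold outside [0, 1] disables NMS
    return kept if post_nms < 0 else kept[:post_nms]


boxes = [(0.9, 0, 0, 10, 10), (0.8, 1, 1, 11, 11), (0.7, 20, 20, 30, 30)]
print(nms_sketch(boxes))
```

The second box overlaps the first with an IOU of 81/119, which is above 0.45, so it is suppressed at the default threshold.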
-
class
gluoncv.model_zoo.
SimplePoseResNet
(base_name='resnet50_v1b', pretrained_base=False, pretrained_ctx=cpu(0), num_joints=17, num_deconv_layers=3, num_deconv_filters=(256, 256, 256), num_deconv_kernels=(4, 4, 4), final_conv_kernel=1, deconv_with_bias=False, **kwargs)[source]¶
-
class
gluoncv.model_zoo.
SlowFast
(nclass, block=<class 'gluoncv.model_zoo.action_recognition.slowfast.Bottleneck'>, layers=None, num_block_temp_kernel_fast=None, num_block_temp_kernel_slow=None, pretrained=False, pretrained_base=False, feat_ext=False, num_segments=1, num_crop=1, bn_eval=True, bn_frozen=False, partial_bn=False, frozen_stages=-1, dropout_ratio=0.5, init_std=0.01, alpha=8, beta_inv=8, fusion_conv_channel_ratio=2, fusion_kernel_size=5, width_per_group=64, num_groups=1, slow_temporal_stride=16, fast_temporal_stride=2, slow_frames=4, fast_frames=32, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, ctx=None, **kwargs)[source]¶ SlowFast networks (SlowFast) from “SlowFast Networks for Video Recognition” paper.
- Parameters
nclass (int.) – Number of categories in the dataset.
block (a HybridBlock.) – Building block of a ResNet, could be Basic or Bottleneck.
layers (a list or tuple, default is None.) – Number of stages in a ResNet, e.g., [3, 4, 6, 3] in ResNet50.
num_block_temp_kernel_fast (int, default is None.) – If the current block has more than NUM_BLOCK_TEMP_KERNEL blocks, use temporal kernel of 1 for the rest of the blocks.
num_block_temp_kernel_slow (int, default is None.) – If the current block has more than NUM_BLOCK_TEMP_KERNEL blocks, use temporal kernel of 1 for the rest of the blocks.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is False.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
bn_eval (bool.) – Whether to set BN layers to eval mode, namely, freeze running stats (mean and var).
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
frozen_stages (int.) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.
dropout_ratio (float, default is 0.5.) – The dropout rate of a dropout layer. The larger the value, the more strength to prevent overfitting.
init_std (float, default is 0.01.) – Standard deviation value used when initializing the dense layers.
alpha (int, default is 8.) – Corresponds to the frame rate reduction ratio between the Slow and Fast pathways.
beta_inv (int, default is 8.) – Corresponds to the inverse of the channel reduction ratio between the Slow and Fast pathways.
fusion_conv_channel_ratio (int, default is 2.) – Ratio of channel dimensions between the Slow and Fast pathways.
fusion_kernel_size (int, default is 5.) – Kernel dimension used for fusing information from Fast pathway to Slow pathway.
width_per_group (int, default is 64.) – Width of each group (64 -> ResNet; 4 -> ResNeXt).
num_groups (int, default is 1.) – Number of groups for the convolution. Num_groups=1 is for standard ResNet like networks, and num_groups>1 is for ResNeXt like networks.
slow_temporal_stride (int, default 16.) – The temporal stride for sparse sampling of video frames in slow branch of a SlowFast network.
fast_temporal_stride (int, default 2.) – The temporal stride for sparse sampling of video frames in fast branch of a SlowFast network.
slow_frames (int, default 4.) – The number of frames used as input to a slow branch.
fast_frames (int, default 32.) – The number of frames used as input to a fast branch.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
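The default frame settings are mutually consistent: 4 slow frames at stride 16 and 32 fast frames at stride 2 both cover a 64-frame window, and their ratio matches alpha = 8. The sketch below only illustrates this arithmetic; `sample_indices` is a hypothetical helper, and how GluonCV actually slices a clip is not shown in this reference.

```python
def sample_indices(num_frames, stride, start=0):
    """Frame indices obtained by taking num_frames frames every `stride` frames."""
    return [start + i * stride for i in range(num_frames)]


# Defaults from the signature above.
slow = sample_indices(num_frames=4, stride=16)
fast = sample_indices(num_frames=32, stride=2)

# Frame rate ratio between the Fast and Slow pathways.
alpha = len(fast) // len(slow)
print(slow, fast[:4], alpha)
```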
-
class
gluoncv.model_zoo.
SqueezeNet
(version, classes=1000, **kwargs)[source]¶ SqueezeNet model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
- Parameters
-
class
gluoncv.model_zoo.
Track
(mean, track_id, source, keep_alive_thresh=0.1, max_missing=30, attributes=None, class_id=0, linked_id=None)[source]¶ This class represents a track/tracklet used in the SMOT Tracker. It has the following properties:
mean: 4-tuple representing the (x0, y0, x1, y1) as the current state (location) of the tracked object
track_id: the numerical id of the track
age: the number of timesteps since its first occurrence
time_since_update: number of timesteps since the last update of its location
state: the state of the track, can be one in TrackState
confidence_score: tracking confidence at the current timestep
source: a tuple of (anchor_indices, anchor_weights)
attributes: np.ndarray of additional attributes of the object
It also has these configs:
keep_alive_thresh: the minimal tracking/detection confidence to keep the track in Active state
max_missing: the maximal timesteps we will keep searching for this track when missing before we mark it as deleted
-
predict
(motion_model=None)[source]¶
- Parameters
motion_model – If not None, predict the motion of this track given its history.
-
update
(bbx, source=None, attributes=None)[source]¶ Update the state of the track. We override the predicted track position. Updating the track will keep or flip its state as Active. If the confidence of detection is below keep_alive_thresh, we will mark this track as missed.
- Parameters
bbx – New detection location of this object.
attributes – Some useful attributes of this object at this frame, e.g. landmarks.
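The interplay of keep_alive_thresh and max_missing described for this class can be sketched as a small state machine. Everything below is a hypothetical simplification: the class `TrackSketch`, its state names and the explicit `confidence` argument are assumptions for illustration and do not mirror the real Track signature.

```python
class TrackSketch:
    """Toy track with Active / Missing / Deleted states (names are assumed)."""

    ACTIVE, MISSING, DELETED = "active", "missing", "deleted"

    def __init__(self, mean, keep_alive_thresh=0.1, max_missing=30):
        self.mean = tuple(mean)              # (x0, y0, x1, y1)
        self.keep_alive_thresh = keep_alive_thresh
        self.max_missing = max_missing
        self.state = self.ACTIVE
        self.time_since_update = 0

    def update(self, bbx, confidence):
        """Override the location; low confidence counts as a miss."""
        self.mean = tuple(bbx)
        if confidence < self.keep_alive_thresh:
            self.mark_missed()
        else:
            self.state = self.ACTIVE
            self.time_since_update = 0

    def mark_missed(self):
        """Count a missed timestep; delete after more than max_missing misses."""
        self.time_since_update += 1
        if self.time_since_update > self.max_missing:
            self.state = self.DELETED
        else:
            self.state = self.MISSING


track = TrackSketch((0, 0, 1, 1), max_missing=2)
track.update((0, 0, 2, 2), confidence=0.9)
print(track.state)
```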
-
-
class
gluoncv.model_zoo.
VGG
(layers, filters, classes=1000, batch_norm=False, **kwargs)[source]¶ VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
-
class
gluoncv.model_zoo.
VGGAtrousExtractor
(layers, filters, extras, batch_norm=False, **kwargs)[source]¶ VGG Atrous multi layer feature extractor which produces multiple output feature maps.
- Parameters
layers (list of int) – Number of layer for vgg base network.
filters (list of int) – Number of convolution filters for each layer.
extras (list of list) – Extra layers configurations.
batch_norm (bool) – If True, will use BatchNorm layers.
-
class
gluoncv.model_zoo.
Xception65
(classes=1000, output_stride=32, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None)[source]¶ Modified Aligned Xception
-
class
gluoncv.model_zoo.
Xception71
(classes=1000, output_stride=32, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None)[source]¶ Modified Aligned Xception
-
class
gluoncv.model_zoo.
YOLOV3
(stages, channels, anchors, strides, classes, alloc_size=(128, 128), nms_thresh=0.45, nms_topk=400, post_nms=100, pos_iou_thresh=1.0, ignore_iou_thresh=0.7, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO V3 detection network. Reference: https://arxiv.org/pdf/1804.02767.pdf.
- Parameters
stages – Staged feature extraction blocks. For example, 3 stages and 3 YOLO output layers are used in the original paper.
channels (iterable) – Number of conv channels for each appended stage. len(channels) should match len(stages).
num_class (int) – Number of foreground objects.
anchors (iterable) – The anchor setting. len(anchors) should match len(stages).
strides (iterable) – Strides of feature map. len(strides) should match len(stages).
alloc_size (tuple of int, default is (128, 128)) – For advanced users. Define alloc_size to generate large enough anchor maps, which will later be saved in parameters. During inference, we support arbitrary input images by cropping the corresponding area of the anchor map. This allows us to export to symbol so we can run it in C++, Scala, etc.
nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk (int, default is 400) – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
pos_iou_thresh (float, default is 1.0) – IOU threshold for true anchors that match real objects. ‘pos_iou_thresh < 1’ is not implemented.
ignore_iou_thresh (float) – Anchors that have IOU in range(ignore_iou_thresh, pos_iou_thresh) are not penalized on objectness score.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
property
classes
¶ Return names of (non-background) categories.
- Returns
Names of (non-background) categories.
- Return type
iterable of str
-
hybrid_forward
(F, x, *args)[source]¶ YOLOV3 network hybrid forward.
- Parameters
F (mxnet.nd or mxnet.sym) – F is mxnet.sym if hybridized or mxnet.nd if not.
x (mxnet.nd.NDArray) – Input data.
*args – During training, extra inputs are required: (gt_boxes, obj_t, centers_t, scales_t, weights_t, clas_t). These are generated by YOLOV3PrefetchTargetGenerator in dataloader transform function.
- Returns
During inference, return detections in shape (B, N, 6) with format (cid, score, xmin, ymin, xmax, ymax). During training, return losses only: (obj_loss, center_loss, scale_loss, cls_loss).
- Return type
(tuple of) mxnet.nd.NDArray
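The (B, N, 6) inference output can be unpacked row by row as (cid, score, xmin, ymin, xmax, ymax). The sketch below uses nested Python lists in place of an NDArray. The helper `filter_detections` is hypothetical, and treating rows with a negative class id as padding is an assumption made for this illustration.

```python
def filter_detections(batch, score_thresh=0.5):
    """Keep confident rows from a (B, N, 6) nested list of detections.

    Each row is (cid, score, xmin, ymin, xmax, ymax). Rows with cid < 0
    are assumed to be padding and are skipped.
    """
    results = []
    for image_dets in batch:
        kept = []
        for cid, score, xmin, ymin, xmax, ymax in image_dets:
            if cid < 0 or score < score_thresh:
                continue
            kept.append({"class_id": int(cid), "score": score,
                         "box": (xmin, ymin, xmax, ymax)})
        results.append(kept)
    return results


# One image (B = 1) with three rows (N = 3).
dets = [[[14, 0.92, 10, 20, 110, 220],
         [6, 0.30, 5, 5, 50, 50],
         [-1, -1, -1, -1, -1, -1]]]
print(filter_detections(dets))
```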
-
property
num_class
¶ Number of (non-background) categories.
- Returns
Number of (non-background) categories.
- Return type
int
-
reset_class
(classes, reuse_weights=None)[source]¶ Reset class categories and class predictors.
- Parameters
classes (iterable of str) – The new categories. [‘apple’, ‘orange’] for example.
reuse_weights – A {new_integer : old_integer} mapping dict, a {new_name : old_name} mapping dict, or a list of [name0, name1,…] if class names don’t change. This allows the new predictor to reuse the previously trained weights specified.
Example
>>> net = gluoncv.model_zoo.get_model('yolo3_darknet53_voc', pretrained=True)
>>> # use direct name to name mapping to reuse weights
>>> net.reset_class(classes=['person'], reuse_weights={'person':'person'})
>>> # or use integer mapping, person is the 14th category in VOC
>>> net.reset_class(classes=['person'], reuse_weights={0:14})
>>> # you can even mix them
>>> net.reset_class(classes=['person'], reuse_weights={'person':14})
>>> # or use a list of strings if class names don't change
>>> net.reset_class(classes=['person'], reuse_weights=['person'])
-
set_nms
(nms_thresh=0.45, nms_topk=400, post_nms=100)[source]¶ Set non-maximum suppression parameters.
- Parameters
nms_thresh (float, default is 0.45.) – Non-maximum suppression threshold. You can specify < 0 or > 1 to disable NMS.
nms_topk – Apply NMS to top k detection results, use -1 to disable so that every Detection result is used in NMS.
post_nms (int, default is 100) – Only return top post_nms detection results, the rest is discarded. The number is based on COCO dataset which has maximum 100 objects per image. You can adjust this number if expecting more objects. You can use -1 to return all detections.
- Returns
- Return type
-
gluoncv.model_zoo.
abstractmethod
(funcobj)[source]¶ A decorator indicating abstract methods.
Requires that the metaclass is ABCMeta or derived from it. A class that has a metaclass derived from ABCMeta cannot be instantiated unless all of its abstract methods are overridden. The abstract methods can be called using any of the normal ‘super’ call mechanisms.
Usage:
class C(metaclass=ABCMeta):
    @abstractmethod
    def my_abstract_method(self, …):
        …
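A complete, runnable variant of the usage pattern above, using only the standard-library abc module. The class names `Shape` and `Square` are made up for this illustration.

```python
from abc import ABCMeta, abstractmethod


class Shape(metaclass=ABCMeta):
    @abstractmethod
    def area(self):
        """Subclasses must override this method."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# Instantiating the abstract base class raises TypeError.
try:
    Shape()
except TypeError as err:
    print("cannot instantiate:", err)

# A subclass that overrides every abstract method can be instantiated.
print(Square(3).area())
```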
-
gluoncv.model_zoo.
alexnet
(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ AlexNet model from the “One weird trick…” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
-
gluoncv.model_zoo.
bbox_iou
(bbox_a, bbox_b, offset=0)[source]¶ Calculate Intersection-Over-Union(IOU) of two bounding boxes.
- Parameters
bbox_a (numpy.ndarray) – An ndarray with shape \((N, 4)\).
bbox_b (numpy.ndarray) – An ndarray with shape \((M, 4)\).
offset (float or int, default is 0) – The offset is used to control whether the width (or height) is computed as (right - left + offset). Note that the offset must be 0 for normalized bboxes, whose ranges are in [0, 1].
- Returns
An ndarray with shape \((N, M)\) indicates IOU between each pairs of bounding boxes in bbox_a and bbox_b.
- Return type
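The role of offset and the (N, M) result shape can be shown with a pure-Python version that works on nested lists instead of numpy arrays. This is a sketch, not the library routine: `bbox_iou_sketch` is a hypothetical name, and boxes are assumed to be given as (xmin, ymin, xmax, ymax).

```python
def bbox_iou_sketch(bbox_a, bbox_b, offset=0):
    """Return an N x M nested list of IOU values for two lists of boxes."""
    result = []
    for a in bbox_a:
        area_a = (a[2] - a[0] + offset) * (a[3] - a[1] + offset)
        row = []
        for b in bbox_b:
            left, top = max(a[0], b[0]), max(a[1], b[1])
            right, bottom = min(a[2], b[2]), min(a[3], b[3])
            if left < right and top < bottom:
                inter = (right - left + offset) * (bottom - top + offset)
            else:
                inter = 0.0              # boxes do not overlap
            area_b = (b[2] - b[0] + offset) * (b[3] - b[1] + offset)
            union = area_a + area_b - inter
            row.append(inter / union if union > 0 else 0.0)
        result.append(row)
    return result


# N = 1 box against M = 2 boxes gives a 1 x 2 result.
ious = bbox_iou_sketch([[0, 0, 2, 2]], [[1, 1, 3, 3], [5, 5, 6, 6]])
print(ious)
```

With offset=1, as is common for integer pixel coordinates, a box spanning pixels 0 to 1 has width 2 rather than 1.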
-
gluoncv.model_zoo.
c3d_kinetics400
(nclass=400, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, **kwargs)[source]¶ The Convolutional 3D network (C3D) trained on Kinetics400 dataset. Learning Spatiotemporal Features with 3D Convolutional Networks. ICCV, 2015. https://arxiv.org/abs/1412.0767
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
center_net_dla34_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with dla34 base network on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_dla34_dcnv2_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with dla34 base network with deformable v2 conv layers on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_dla34_dcnv2_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with dla34 base network with deformable v2 conv layers on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_dla34_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with dla34 base network on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_mobilenetv3_large_duc_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with mobilenetv3_large base network with DUC layers on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_mobilenetv3_large_duc_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with mobilenetv3_large base network with DUC layers on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_mobilenetv3_small_duc_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with mobilenetv3_small base network with DUC layers on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_mobilenetv3_small_duc_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with mobilenetv3_small base network with DUC layers on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet101_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet101_v1b base network on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet101_v1b_dcnv2_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet101_v1b base network with deformable v2 conv layers on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet101_v1b_dcnv2_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet101_v1b base network with deformable v2 conv layers on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet101_v1b_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet101_v1b base network on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet18_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet18_v1b base network on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet18_v1b_dcnv2_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet18_v1b base network with deformable v2 conv layers on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet18_v1b_dcnv2_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet18_v1b base network with deformable v2 conv layers on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet18_v1b_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet18_v1b base network on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet50_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet50_v1b base network on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet50_v1b_dcnv2_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet50_v1b base network with deformable v2 conv layers on coco dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet50_v1b_dcnv2_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet50_v1b base network with deformable v2 conv layers on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
-
gluoncv.model_zoo.
center_net_resnet50_v1b_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Center net with resnet50_v1b base network on voc dataset.
- Parameters
- Returns
A CenterNet detection network.
- Return type
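The CenterNet entries above follow a regular naming pattern (backbone, optional dcnv2 or duc marker, dataset), so a name can be assembled and passed to gluoncv.model_zoo.get_model. The sketch below is illustrative only: center_net_name and load_center_net are hypothetical helpers, not part of GluonCV, and not every combination of arguments corresponds to a model that actually exists in the zoo.

```python
def center_net_name(backbone, dataset, dcnv2=False, duc=False):
    """Assemble a CenterNet model name following the pattern of the entries above."""
    parts = ["center_net", backbone]
    if dcnv2:
        parts.append("dcnv2")  # deformable v2 conv layers
    if duc:
        parts.append("duc")    # DUC layers
    parts.append(dataset)
    return "_".join(parts)


def load_center_net(backbone, dataset, pretrained=False, **flags):
    """Look the assembled name up in the model zoo (requires gluoncv and mxnet)."""
    from gluoncv import model_zoo  # imported lazily so the name helper works without gluoncv

    return model_zoo.get_model(center_net_name(backbone, dataset, **flags),
                               pretrained=pretrained)


print(center_net_name("resnet18_v1b", "voc"))
print(center_net_name("resnet101_v1b", "coco", dcnv2=True))
```

Check the assembled name against the entries listed in this reference before calling load_center_net, since the helper does not verify that the model is defined.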
-
class
gluoncv.model_zoo.
cifar_ResidualAttentionModel
(scale, m, classes=10, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ AttentionModel model from “Residual Attention Network for Image Classification” paper. Input size is 32 x 32.
- Parameters
scale (tuple) – Network scale p, t, r.
m (tuple) – Network scale m. The network depth is defined as 36m + 20. Normally m is given as the tuple (m-1, m, m+1), except for m == 1, where it is (1, 1, 1).
classes (int, default 10) – Number of classification classes.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
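As a worked example of the depth formula for the m parameter, the sketch below evaluates 36m + 20 and applies the tuple rule stated there. The helper names attention_depth and stage_tuple are hypothetical and exist only for illustration; how GluonCV consumes these values internally is not shown here.

```python
def attention_depth(m):
    """Network depth for scale m, using the documented formula 36m + 20."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 36 * m + 20


def stage_tuple(m):
    """Tuple form described in the docs: (m-1, m, m+1), except (1, 1, 1) for m == 1."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return (1, 1, 1) if m == 1 else (m - 1, m, m + 1)


# m = 1 gives depth 56, m = 2 gives 92, and m = 12 gives 452.
for m in (1, 2, 12):
    print(m, attention_depth(m), stage_tuple(m))
```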
-
gluoncv.model_zoo.
cifar_residualattentionnet452
(**kwargs)[source]¶ AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of the network. Options are 32 and 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_residualattentionnet56
(**kwargs)[source]¶ AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of the network. Options are 32 and 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_residualattentionnet92
(**kwargs)[source]¶ AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of the network. Options are 32 and 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_resnet110_v1
(**kwargs)[source]¶ ResNet-110 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
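The norm_layer and norm_kwargs parameters are passed together: the first selects the normalization class, the second supplies its constructor arguments. The following is a minimal sketch of that pairing, assuming MXNet and GluonCV are installed for the last step. The helpers norm_config, resolve, and build_cifar_resnet110_v1 are hypothetical names introduced here, not GluonCV APIs.

```python
import importlib


def norm_config(sync=False, num_devices=4):
    """Return the dotted class path and kwargs for the chosen normalization layer.

    The two dotted paths are the options named in the parameter description.
    """
    if sync:
        return {
            "norm_layer": "mxnet.gluon.contrib.nn.SyncBatchNorm",
            "norm_kwargs": {"num_devices": num_devices},
        }
    return {"norm_layer": "mxnet.gluon.nn.BatchNorm", "norm_kwargs": None}


def resolve(dotted_path):
    """Import and return the object behind a dotted path."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def build_cifar_resnet110_v1(sync=False, num_devices=4):
    """Construct cifar_resnet110_v1 with the selected normalization layer."""
    from gluoncv import model_zoo  # lazy import; requires gluoncv and mxnet

    cfg = norm_config(sync, num_devices)
    return model_zoo.cifar_resnet110_v1(
        norm_layer=resolve(cfg["norm_layer"]), norm_kwargs=cfg["norm_kwargs"]
    )
```

Keeping the configuration as plain data makes it easy to log or serialize the choice before any MXNet class is imported.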
-
gluoncv.model_zoo.
cifar_resnet110_v2
(**kwargs)[source]¶ ResNet-110 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_resnet20_v1
(**kwargs)[source]¶ ResNet-20 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_resnet20_v2
(**kwargs)[source]¶ ResNet-20 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_resnet56_v1
(**kwargs)[source]¶ ResNet-56 V1 model for CIFAR10 from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_resnet56_v2
(**kwargs)[source]¶ ResNet-56 V2 model for CIFAR10 from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_wideresnet16_10
(**kwargs)[source]¶ WideResNet-16-10 model for CIFAR10 from “Wide Residual Networks” paper.
- Parameters
drop_rate (float) – The rate of dropout.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_wideresnet28_10
(**kwargs)[source]¶ WideResNet-28-10 model for CIFAR10 from “Wide Residual Networks” paper.
- Parameters
drop_rate (float) – The rate of dropout.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cifar_wideresnet40_8
(**kwargs)[source]¶ WideResNet-40-8 model for CIFAR10 from “Wide Residual Networks” paper.
- Parameters
drop_rate (float) – The rate of dropout.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
cpu
(device_id=0)[source]¶ Returns a CPU context.
This function is a shortcut for Context('cpu', device_id). For most operations, when no context is specified, the default context is cpu().
Examples
>>> with mx.cpu():
...     cpu_array = mx.nd.ones((2, 3))
>>> cpu_array.context
cpu(0)
>>> cpu_array = mx.nd.ones((2, 3), ctx=mx.cpu())
>>> cpu_array.context
cpu(0)
- Parameters
device_id (int, optional) – The device id of the device. device_id is not needed for CPU. This is included to make interface compatible with GPU.
- Returns
context – The corresponding CPU context.
- Return type
Context
-
gluoncv.model_zoo.
custom_faster_rcnn_fpn
(classes, transfer=None, dataset='custom', pretrained_base=True, base_network_name='resnet18_v1b', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, sym_norm_layer=None, sym_norm_kwargs=None, num_fpn_filters=256, num_box_head_conv=4, num_box_head_conv_filters=256, num_box_head_dense_filters=1024, **kwargs)[source]¶ Faster RCNN model with resnet base network and FPN on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer (str or None) – Dataset from which to transfer. If not None, will try to reuse pre-trained weights from Faster RCNN networks trained on the dataset specified by this parameter.
dataset (str, default 'custom') – Dataset name attached to the network name
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
base_network_name (str, default 'resnet18_v1b') – Base network for Faster RCNN. Currently supported: ‘resnet18_v1b’, ‘resnet50_v1b’, and ‘resnet101_v1d’.
norm_layer (nn.HybridBlock, default nn.BatchNorm) – Gluon normalization layer to use. Default is frozen batch normalization layer.
norm_kwargs (dict) – Keyword arguments for gluon normalization layer
sym_norm_layer (nn.SymbolBlock, default None) – Symbol normalization layer to use in FPN. This is due to FPN being implemented using SymbolBlock. Default is None, meaning no normalization layer will be used in FPN.
sym_norm_kwargs (dict) – Keyword arguments for symbol normalization layer used in FPN.
num_fpn_filters (int, default 256) – Number of filters for FPN output layers.
num_box_head_conv (int, default 4) – Number of convolution layers to use in box head if batch normalization is not frozen.
num_box_head_conv_filters (int, default 256) – Number of filters for convolution layers in box head. Only applicable if batch normalization is not frozen.
num_box_head_dense_filters (int, default 1024) – Number of hidden units for the last fully connected layer in box head.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
- Returns
Hybrid faster RCNN network.
- Return type
mxnet.gluon.HybridBlock
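A short sketch of preparing arguments for custom_faster_rcnn_fpn, checking base_network_name against the three supported values listed above before the network is built. The helpers custom_fpn_kwargs and build_custom_detector and the class names used in the demonstration are hypothetical; only the final call into gluoncv.model_zoo uses the documented function, and it requires GluonCV and MXNet to be installed.

```python
SUPPORTED_BASES = ("resnet18_v1b", "resnet50_v1b", "resnet101_v1d")


def custom_fpn_kwargs(classes, base_network_name="resnet18_v1b", transfer=None):
    """Validate and collect keyword arguments for custom_faster_rcnn_fpn."""
    classes = list(classes)
    if not classes:
        raise ValueError("at least one foreground class is required")
    if base_network_name not in SUPPORTED_BASES:
        raise ValueError("unsupported base network: %r" % (base_network_name,))
    return {
        "classes": classes,  # len(classes) is the number of foreground classes
        "base_network_name": base_network_name,
        "transfer": transfer,
    }


def build_custom_detector(classes, **options):
    """Build the detector (lazy import; needs gluoncv and mxnet)."""
    from gluoncv import model_zoo

    return model_zoo.custom_faster_rcnn_fpn(**custom_fpn_kwargs(classes, **options))


# Illustrative class names only.
print(custom_fpn_kwargs(["cat", "dog"], base_network_name="resnet50_v1b"))
```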
-
gluoncv.model_zoo.
custom_mask_rcnn_fpn
(classes, transfer=None, dataset='custom', pretrained_base=True, base_network_name='resnet18_v1b', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, sym_norm_layer=None, sym_norm_kwargs=None, num_fpn_filters=256, num_box_head_conv=4, num_box_head_conv_filters=256, num_box_head_dense_filters=1024, **kwargs)[source]¶ Mask RCNN model with resnet base network and FPN on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer (str or None) – Dataset from which to transfer. If not None, will try to reuse pre-trained weights from Mask RCNN networks trained on the dataset specified by this parameter.
dataset (str, default 'custom') – Dataset name attached to the network name
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
base_network_name (str, default 'resnet18_v1b') – Base network for Mask RCNN. Currently supported: ‘resnet18_v1b’, ‘resnet50_v1b’, and ‘resnet101_v1d’.
norm_layer (nn.HybridBlock, default nn.BatchNorm) – Gluon normalization layer to use. Default is frozen batch normalization layer.
norm_kwargs (dict) – Keyword arguments for gluon normalization layer
sym_norm_layer (nn.SymbolBlock, default None) – Symbol normalization layer to use in FPN. This is due to FPN being implemented using SymbolBlock. Default is None, meaning no normalization layer will be used in FPN.
sym_norm_kwargs (dict) – Keyword arguments for symbol normalization layer used in FPN.
num_fpn_filters (int, default 256) – Number of filters for FPN output layers.
num_box_head_conv (int, default 4) – Number of convolution layers to use in box head if batch normalization is not frozen.
num_box_head_conv_filters (int, default 256) – Number of filters for convolution layers in box head. Only applicable if batch normalization is not frozen.
num_box_head_dense_filters (int, default 1024) – Number of hidden units for the last fully connected layer in box head.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
- Returns
Hybrid Mask RCNN network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
custom_ssd
(base_network_name, base_size, filters, sizes, ratios, steps, classes, dataset, pretrained_base, **kwargs)[source]¶ Custom SSD models.
-
gluoncv.model_zoo.
custom_yolov3
(base_network_name, filters, anchors, strides, classes, dataset, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ Custom YOLO models.
-
gluoncv.model_zoo.
darknet53
(**kwargs)[source]¶ Darknet v3 53 layer network. Reference: https://arxiv.org/pdf/1804.02767.pdf.
- Parameters
- Returns
Darknet network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
densenet121
(**kwargs)[source]¶ Densenet-BC 121-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
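The root parameter is documented here with the default '$MXNET_HOME/models', while other entries show '~/.mxnet/models'. The sketch below shows how such a path might be expanded into a concrete directory. It assumes that MXNET_HOME falls back to ~/.mxnet when the variable is unset; resolve_model_root is a hypothetical helper, not a GluonCV or MXNet function.

```python
import os
from string import Template


def resolve_model_root(root="$MXNET_HOME/models", env=None):
    """Expand $VARS and '~' in a model root directory string."""
    env = dict(os.environ if env is None else env)
    # Assumption: MXNET_HOME defaults to ~/.mxnet when it is not set.
    env.setdefault("MXNET_HOME", os.path.join("~", ".mxnet"))
    return os.path.expanduser(Template(root).safe_substitute(env))


print(resolve_model_root())
print(resolve_model_root("~/.mxnet/models"))
```

Under the stated assumption, both calls print the same directory whenever MXNET_HOME is not set in the environment.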
-
gluoncv.model_zoo.
densenet161
(**kwargs)[source]¶ Densenet-BC 161-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
densenet169
(**kwargs)[source]¶ Densenet-BC 169-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
densenet201
(**kwargs)[source]¶ Densenet-BC 201-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
doublehead_rcnn_resnet50_v1b_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Double Head Faster RCNN model from the paper “(2019). Rethinking Classification and Localization for Object Detection.”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = doublehead_rcnn_resnet50_v1b_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_fpn_resnet101_v1d_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_fpn_resnet50_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_fpn_syncbn_resnest101_coco
(pretrained=False, pretrained_base=True, num_devices=0, **kwargs)[source]¶ Faster R-CNN with ResNeSt, from the paper “ResNeSt: Split-Attention Networks”.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_syncbn_resnest101_coco(pretrained=True)
>>> print(model)
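The num_devices rule (values below 1 mean "all available devices") can be illustrated with a small sketch. The helper effective_sync_devices is hypothetical and only restates the documented rule; it is not how GluonCV necessarily implements it. The loader at the end requires GluonCV and MXNet.

```python
def effective_sync_devices(num_devices, available):
    """Number of devices the sync batch norm layer would span under the documented rule."""
    if available < 1:
        raise ValueError("available must be at least 1")
    # Values below 1 (including the default 0) select every available device.
    return num_devices if num_devices >= 1 else available


def load_resnest101_detector(num_devices=0):
    """Load the pretrained detector (lazy import; needs gluoncv and mxnet)."""
    from gluoncv import model_zoo

    return model_zoo.faster_rcnn_fpn_syncbn_resnest101_coco(
        pretrained=True, num_devices=num_devices
    )


print(effective_sync_devices(0, available=8))
print(effective_sync_devices(4, available=8))
```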
-
gluoncv.model_zoo.
faster_rcnn_fpn_syncbn_resnest269_coco
(pretrained=False, pretrained_base=True, num_devices=0, **kwargs)[source]¶ Faster R-CNN with ResNeSt, from the paper “ResNeSt: Split-Attention Networks”.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_syncbn_resnest269_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_fpn_syncbn_resnest50_coco
(pretrained=False, pretrained_base=True, num_devices=0, **kwargs)[source]¶ Faster R-CNN with ResNeSt, from the paper “ResNeSt: Split-Attention Networks”.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_syncbn_resnest50_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_fpn_syncbn_resnet101_v1d_coco
(pretrained=False, pretrained_base=True, num_devices=0, **kwargs)[source]¶ Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_syncbn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_fpn_syncbn_resnet50_v1b_coco
(pretrained=False, pretrained_base=True, num_devices=0, **kwargs)[source]¶ Faster RCNN model with FPN from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks” “Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S. (2016). Feature Pyramid Networks for Object Detection”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_fpn_syncbn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_resnet101_v1d_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”
- Parameters
pretrained (bool, optional, default is False) – Load pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
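The interaction between pretrained and pretrained_base ("if pretrained is True, this has no effect") can be read as follows: once full-model weights are requested, separately loading base-network weights is redundant. The sketch below encodes that reading. effective_pretrained_base is a hypothetical helper for illustration, and the hash string in the demonstration is made up.

```python
def effective_pretrained_base(pretrained, pretrained_base=True):
    """Base-network flag that still matters for a given combination of arguments.

    When full-model weights are requested, the base flag is reported as False
    because loading the base weights separately would have no effect.
    """
    return False if pretrained else pretrained_base


# Full-model weights requested: the base flag is irrelevant.
print(effective_pretrained_base(True, True))
# No full-model weights: the base flag (bool or version hash) is honored.
print(effective_pretrained_base(False, True))
print(effective_pretrained_base(False, "abc123"))  # made-up hash string
```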
-
gluoncv.model_zoo.
faster_rcnn_resnet101_v1d_custom
(classes, transfer=None, pretrained_base=True, pretrained=False, **kwargs)[source]¶ Faster RCNN model with resnet101_v1d base network on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer (str or None) – If not None, will try to reuse pre-trained weights from faster RCNN networks trained on other datasets.
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
- Returns
Hybrid faster RCNN network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
faster_rcnn_resnet101_v1d_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”
- Parameters
pretrained (bool, optional, default is False) – Load pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_resnet101_v1d_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_resnet50_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
faster_rcnn_resnet50_v1b_custom
(classes, transfer=None, pretrained_base=True, pretrained=False, **kwargs)[source]¶ Faster RCNN model with resnet50_v1b base network on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer (str or None) – If not None, will try to reuse pre-trained weights from faster RCNN networks trained on other datasets.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
- Returns
Hybrid faster RCNN network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
faster_rcnn_resnet50_v1b_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Faster RCNN model from the paper “Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = faster_rcnn_resnet50_v1b_voc(pretrained=True)
>>> print(model)
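The note on pretrained_base above can be read as a precedence rule: once full-model weights are requested, loading base-network weights separately is redundant. A minimal sketch of that rule, assuming this reading of the note; effective_pretrained_base is a hypothetical helper and not part of gluoncv:

```python
def effective_pretrained_base(pretrained, pretrained_base=True):
    """Return the pretrained_base value that actually takes effect."""
    # A truthy `pretrained` (True or a version tag string) means the whole
    # model is loaded, so the separate base-network setting is ignored.
    if pretrained:
        return False
    return pretrained_base


# Only the base network is pretrained; the extra layers stay randomized.
print(effective_pretrained_base(pretrained=False, pretrained_base=True))
# Full-model weights requested; pretrained_base has no effect.
print(effective_pretrained_base(pretrained=True, pretrained_base=True))
```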
-
gluoncv.model_zoo.
get_Siam_RPN
(base_name, bz=1, is_train=False, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Get a Siam_RPN network and, if pretrained is set, load its pretrained weights.
- Parameters
base_name (str) – Backbone model name
bz (int) – Batch size for training; use bz = 1 for testing.
is_train (bool) – True for training, False for testing.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
- Returns
A SiamRPN Tracking network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
get_center_net
(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Get a center net instance.
- Parameters
name (str or None) – Model name. If None is used, you must specify features to be a HybridBlock.
dataset (str) – Name of dataset. This is used to identify model name because models trained on different datasets are going to be very different.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
- Returns
A CenterNet detection network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
get_cifar_resnet
(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
version (int) – Version of ResNet. Options are 1, 2.
num_layers (int) – Number of layers. Must be an integer of the form 6*n+2, e.g. 20, 56, 110, 164.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
get_cifar_wide_resnet
(num_layers, width_factor=1, drop_rate=0.0, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Wide ResNet model from “Wide Residual Networks” paper.
- Parameters
num_layers (int) – Number of layers. Must be an integer of the form 6*n+4, e.g. 16, 28, 40.
width_factor (int) – The width factor to apply to the number of channels from the original resnet.
drop_rate (float) – The rate of dropout.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
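To illustrate what width_factor does, the sketch below computes per-group channel counts. It assumes the layout from the “Wide Residual Networks” paper: an initial 16-channel convolution followed by three groups of 16k, 32k, and 64k channels, where k is the width factor. wide_resnet_channels is a hypothetical helper, and the library's actual channel list may differ:

```python
def wide_resnet_channels(width_factor):
    """Channel counts for the stem and the three residual groups."""
    if width_factor < 1:
        raise ValueError("width_factor must be a positive integer")
    # The stem keeps 16 channels; only the residual groups are widened.
    return [16, 16 * width_factor, 32 * width_factor, 64 * width_factor]


print(wide_resnet_channels(1))
print(wide_resnet_channels(10))
```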
-
gluoncv.model_zoo.
get_darknet
(darknet_version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Get darknet by version and num_layers info.
- Parameters
darknet_version (str) – Darknet version, choices are [‘v3’].
num_layers (int) – Number of layers.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Darknet network.
- Return type
mxnet.gluon.HybridBlock
Examples
>>> model = get_darknet('v3', 53, pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab
(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ DeepLabV3
- Parameters
dataset (str, default pascal_voc) – The dataset that the model was pretrained on. (pascal_voc, pascal_aug, ade20k, coco, citys)
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
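The convenience constructors documented in this section (for example get_deeplab_resnet101_voc and get_deeplab_resnest50_ade) appear to follow a get_deeplab_<backbone>_<dataset> pattern with shortened dataset names. The sketch below encodes that pattern; it is an assumption inferred from the function names listed here, not taken from the library source, and both DATASET_SHORT_NAMES and convenience_name are hypothetical:

```python
# Short dataset suffixes as they appear in the documented function names.
DATASET_SHORT_NAMES = {
    "pascal_voc": "voc",
    "ade20k": "ade",
    "coco": "coco",
    "citys": "citys",
}


def convenience_name(backbone, dataset):
    """Guess the name of the dataset-specific DeepLabV3 constructor."""
    if dataset not in DATASET_SHORT_NAMES:
        raise ValueError("unknown dataset: %r" % dataset)
    return "get_deeplab_%s_%s" % (backbone, DATASET_SHORT_NAMES[dataset])


print(convenience_name("resnet101", "pascal_voc"))
```

Not every backbone and dataset combination has a matching constructor, so check the resulting name against this reference before relying on it.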
-
gluoncv.model_zoo.
get_deeplab_plus
(dataset='pascal_voc', backbone='xception', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ DeepLabV3Plus
- Parameters
dataset (str, default pascal_voc) – The dataset that the model was pretrained on. (pascal_voc, ade20k)
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_plus(dataset='pascal_voc', backbone='xception', pretrained=False)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_plus_xception_coco
(**kwargs)[source]¶ DeepLabV3Plus
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_plus_xception_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnest101_ade
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnest101_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnest200_ade
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnest200_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnest269_ade
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnest269_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnest50_ade
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnest50_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet101_ade
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet101_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet101_citys
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet101_citys(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet101_coco
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet101_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet101_voc
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet101_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet152_coco
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet152_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet152_voc
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet152_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet50_ade
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet50_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_resnet50_citys
(**kwargs)[source]¶ DeepLabV3
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_resnet50_citys(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplab_v3b_plus_wideresnet_citys
(**kwargs)[source]¶ DeepLabWV3Plus
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplab_v3b_plus_wideresnet_citys(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_deeplabv3b_plus
(dataset='citys', backbone='wideresnet', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ DeepLabWV3Plus
- Parameters
dataset (str, default citys) – The dataset that the model was pretrained on. (pascal_voc, ade20k, citys)
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_deeplabv3b_plus(dataset='citys', backbone='wideresnet', pretrained=False)
>>> print(model)
-
gluoncv.model_zoo.
get_doublehead_rcnn
(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Utility function to return Double Head RCNN networks.
- Parameters
name (str) – Model name.
dataset (str) – The name of dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
- Returns
The Double Head RCNN network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
get_faster_rcnn
(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Utility function to return faster rcnn networks.
- Parameters
name (str) – Model name.
dataset (str) – The name of dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
- Returns
The Faster-RCNN network.
- Return type
mxnet.gluon.HybridBlock
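The pretrained parameter described above accepts either a boolean or a string. One way to read that contract is sketched below; interpret_pretrained is a hypothetical helper, and the tag value in the usage lines is a made-up placeholder rather than a real weight version:

```python
from typing import Optional, Tuple, Union


def interpret_pretrained(
    pretrained: Union[bool, str],
) -> Tuple[bool, Optional[str]]:
    """Split `pretrained` into (load_weights, version_tag)."""
    # A non-empty string both enables loading and pins a specific version.
    if isinstance(pretrained, str) and pretrained:
        return True, pretrained
    # A boolean only switches loading of the default weights on or off.
    return bool(pretrained), None


print(interpret_pretrained(False))
print(interpret_pretrained(True))
print(interpret_pretrained("abcd1234"))  # hypothetical version tag
```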
-
gluoncv.model_zoo.
get_fastscnn
(dataset='citys', ctx=cpu(0), pretrained=False, root='~/.mxnet/models', **kwargs)[source]¶ Fast-SCNN: Fast Semantic Segmentation Network
- Parameters
dataset (str, default citys) – The dataset that the model was pretrained on.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_fastscnn(dataset='citys')
>>> print(model)
-
gluoncv.model_zoo.
get_fastscnn_citys
(**kwargs)[source]¶ Fast-SCNN: Fast Semantic Segmentation Network
- Parameters
dataset (str, default citys) – The dataset that the model was pretrained on.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
Examples
>>> model = get_fastscnn_citys()
>>> print(model)
-
gluoncv.model_zoo.
get_fcn
(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), pretrained_base=True, **kwargs)[source]¶ FCN model from the paper “Fully Convolutional Network for semantic segmentation”
- Parameters
dataset (str, default pascal_voc) – The dataset that the model was pretrained on. (pascal_voc, ade20k)
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
pretrained_base (bool or str, default True) – Whether to load a pretrained backbone network that was trained on ImageNet.
Examples
>>> model = get_fcn(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
-
gluoncv.model_zoo.
get_fcn_resnet101_ade
(**kwargs)[source]¶ FCN model with base network ResNet-101 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_fcn_resnet101_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_fcn_resnet101_coco
(**kwargs)[source]¶ FCN model with base network ResNet-101 pre-trained on COCO dataset from the paper “Fully Convolutional Network for semantic segmentation”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_fcn_resnet101_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_fcn_resnet101_voc
(**kwargs)[source]¶ FCN model with base network ResNet-101 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_fcn_resnet101_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_fcn_resnet50_ade
(**kwargs)[source]¶ FCN model with base network ResNet-50 pre-trained on ADE20K dataset from the paper “Fully Convolutional Network for semantic segmentation”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_fcn_resnet50_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_fcn_resnet50_voc
(**kwargs)[source]¶ FCN model with base network ResNet-50 pre-trained on Pascal VOC dataset from the paper “Fully Convolutional Network for semantic segmentation”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_fcn_resnet50_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_hrnet
(model_name, stage_interp_type='nearest', purpose='cls', pretrained=False, ctx=cpu(0), root='~/.mxnet/models', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, num_classes=1000, **kwargs)[source]¶ HRNet model from the “Deep High-Resolution Representation Learning for Visual Recognition” paper.
- Parameters
model_name (string) – The name of hrnet models: w18_small_v1/w18_small_v2/w30/w32/w40/w42/w48.
stage_interp_type (string) – The interpolation type for upsample in each stage, nearest, bilinear and bilinear_like are supported.
purpose (string) – The purpose of model, cls and seg are supported.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
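Because get_hrnet takes several string options, it can help to validate them before constructing a model. The sketch below checks arguments against the option lists copied from the parameter descriptions above; check_hrnet_args is a hypothetical helper, and the library may accept other values or report errors differently:

```python
# Option lists copied from the parameter descriptions in this entry.
HRNET_MODEL_NAMES = (
    "w18_small_v1", "w18_small_v2", "w30", "w32", "w40", "w42", "w48",
)
HRNET_INTERP_TYPES = ("nearest", "bilinear", "bilinear_like")
HRNET_PURPOSES = ("cls", "seg")


def check_hrnet_args(model_name, stage_interp_type="nearest", purpose="cls"):
    """Raise ValueError if an argument is outside the documented options."""
    checks = (
        ("model_name", model_name, HRNET_MODEL_NAMES),
        ("stage_interp_type", stage_interp_type, HRNET_INTERP_TYPES),
        ("purpose", purpose, HRNET_PURPOSES),
    )
    for label, value, allowed in checks:
        if value not in allowed:
            raise ValueError(
                "%s=%r is not one of %s" % (label, value, ", ".join(allowed))
            )
    return True


check_hrnet_args("w32", stage_interp_type="bilinear", purpose="seg")
```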
-
gluoncv.model_zoo.
get_mask_rcnn
(name, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Utility function to return mask rcnn networks.
- Parameters
name (str) – Model name.
dataset (str) – The name of dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
- Returns
The Mask RCNN network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
get_mobilenet
(multiplier, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.
- Parameters
multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
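The multiplier description above amounts to scaling every layer's channel count by the same factor. A rough sketch of that arithmetic follows; the base channel list is illustrative, truncation with int() is an assumption about rounding, and scaled_channels is a hypothetical helper:

```python
# Illustrative base widths; the real network has more layers than this.
BASE_CHANNELS = [32, 64, 128, 256, 512, 1024]


def scaled_channels(multiplier, base_channels=BASE_CHANNELS):
    """Scale each base channel count by the width multiplier."""
    if multiplier < 0.25:
        raise ValueError("multipliers below 0.25 are not supported")
    # Truncate to an integer channel count (assumed rounding behaviour).
    return [int(channels * multiplier) for channels in base_channels]


print(scaled_channels(1.0))
print(scaled_channels(0.25))
```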
-
gluoncv.model_zoo.
get_mobilenet_v2
(multiplier, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper (https://arxiv.org/abs/1801.04381).
- Parameters
multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
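The root default is written as $MXNET_HOME/models here and as '~/.mxnet/models' in the signature. The sketch below shows one way the two can be reconciled, assuming MXNET_HOME falls back to ~/.mxnet when the environment variable is unset; default_model_root is a hypothetical helper:

```python
import os


def default_model_root(env=None):
    """Resolve the directory where model parameters would be stored."""
    env = os.environ if env is None else env
    # Assumed fallback: use ~/.mxnet when MXNET_HOME is not set.
    home = env.get("MXNET_HOME") or os.path.join("~", ".mxnet")
    return os.path.expanduser(os.path.join(home, "models"))


print(default_model_root())
```

Passing a plain dict as env makes the behaviour easy to check without touching the real environment.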
-
gluoncv.model_zoo.
get_model
(name, **kwargs)[source]¶ Returns a pre-defined model by name
- Parameters
name (str) – Name of the model.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
classes (int) – Number of classes for the output layer.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
- Returns
The model.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
get_model_list
()[source]¶ Get the entire list of model names in model_zoo.
- Returns
Entire list of model names in model_zoo.
- Return type
list of str
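Looking up a model by name, as get_model and get_model_list do, is commonly built on a name-to-constructor registry. The toy sketch below illustrates the idea only; it is not the library's implementation, and every name in it (_TOY_MODELS, register_toy_model, toy_get_model, toy_get_model_list, toy_net_a) is hypothetical:

```python
_TOY_MODELS = {}


def register_toy_model(name):
    """Decorator that records a constructor under a lowercase name."""
    def decorator(constructor):
        _TOY_MODELS[name.lower()] = constructor
        return constructor
    return decorator


def toy_get_model(name, **kwargs):
    """Build a registered model by name, forwarding keyword arguments."""
    key = name.lower()
    if key not in _TOY_MODELS:
        raise ValueError(
            "Model %r is not supported. Available options: %s"
            % (name, ", ".join(sorted(_TOY_MODELS)))
        )
    return _TOY_MODELS[key](**kwargs)


def toy_get_model_list():
    """Return all registered model names."""
    return sorted(_TOY_MODELS)


@register_toy_model("toy_net_a")
def toy_net_a(pretrained=False, classes=10):
    # Stand-in for a real network object.
    return {"name": "toy_net_a", "pretrained": pretrained, "classes": classes}


print(toy_get_model_list())
print(toy_get_model("toy_net_a", classes=5))
```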
-
gluoncv.model_zoo.
get_monodepth2
(backbone='resnet18', pretrained_base=True, scales=range(0, 4), num_output_channels=1, use_skips=True, root='~/.mxnet/models', ctx=cpu(0), pretrained=False, pretrained_model='kitti_stereo_640x192', **kwargs)[source]¶ MonoDepth2
- Parameters
backbone (string, default:'resnet18') – Pre-trained dilated backbone network type (‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’ or ‘resnet152’).
pretrained_base (bool or str, default: True) – Whether to load a pretrained backbone network that was trained on ImageNet.
scales (list, default: range(4)) – The scales used in the loss.
num_output_channels (int, default: 1) – The number of output channels.
use_skips (bool, default: True) – This will use skip architecture in the network.
ctx (Context, default: CPU) – The context in which to load the pretrained weights.
root (str, default: '~/.mxnet/models') – Location for keeping the model parameters.
pretrained (bool or str, default: False) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_model (string, default: kitti_stereo_640x192) – The dataset that the model was pretrained on.
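The scales parameter selects the output resolutions used in the loss. As an illustration, the sketch below assumes, following the Monodepth2 paper, that scale s corresponds to the input downsampled by a factor of 2**s, and takes 640x192 from the default pretrained_model name. scale_resolutions is a hypothetical helper:

```python
def scale_resolutions(width=640, height=192, scales=range(0, 4)):
    """Return the (width, height) assumed for each loss scale."""
    # Each scale halves both dimensions once more than the previous one.
    return [(width // 2 ** s, height // 2 ** s) for s in scales]


for s, size in enumerate(scale_resolutions()):
    print(s, size)
```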
-
gluoncv.model_zoo.
get_monodepth2_resnet18_kitti_mono_640x192
(**kwargs)[source]¶ Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (default:’resnet18’).
-
gluoncv.model_zoo.
get_monodepth2_resnet18_kitti_mono_stereo_640x192
(**kwargs)[source]¶ Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (default:’resnet18’).
-
gluoncv.model_zoo.
get_monodepth2_resnet18_kitti_stereo_640x192
(**kwargs)[source]¶ Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (default:’resnet18’).
-
gluoncv.model_zoo.
get_monodepth2_resnet18_posenet_kitti_mono_640x192
(**kwargs)[source]¶ Monodepth2 PoseNet
- Parameters
backbone (string) – Pre-trained dilated backbone network type (default:’resnet18’).
-
gluoncv.model_zoo.
get_monodepth2_resnet18_posenet_kitti_mono_stereo_640x192
(**kwargs)[source]¶ Monodepth2 PoseNet
- Parameters
backbone (string) – Pre-trained dilated backbone network type (default:’resnet18’).
-
gluoncv.model_zoo.
get_monodepth2posenet
(backbone='resnet18', pretrained_base=True, num_input_images=2, num_input_features=1, num_frames_to_predict_for=2, stride=1, root='~/.mxnet/models', ctx=cpu(0), pretrained=False, pretrained_model='kitti_stereo_640x192', **kwargs)[source]¶ Monodepth2
- Parameters
backbone (string) – Pre-trained dilated backbone network type (‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’ or ‘resnet152’).
pretrained_base (bool or str) – Whether the backbone is pretrained. If True, weights from a model trained on ImageNet are loaded.
num_input_images (int) – The number of input sequences. 1 for depth encoder, larger than 1 for pose encoder. (Default: 2)
num_input_features (int) – The number of input feature maps from posenet encoder. (Default: 1)
num_frames_to_predict_for (int) – The number of output poses between frames. If None, it equals num_input_features - 1. (Default: 2)
stride (int) – The stride number for Conv in pose decoder. (Default: 1)
ctx (Context, default: CPU) – The context in which to load the pretrained weights.
root (str, default: '~/.mxnet/models') – Location for keeping the model parameters.
pretrained (bool or str, default: False) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_model (string, default: kitti_stereo_640x192) – The dataset that the model was pretrained on.
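The fallback rule for num_frames_to_predict_for can be written out directly. This is a minimal sketch of the rule as documented above; resolve_num_frames is a hypothetical helper, not a gluoncv function:

```python
def resolve_num_frames(num_input_features=1, num_frames_to_predict_for=None):
    """Apply the documented default for num_frames_to_predict_for."""
    # When left as None, fall back to num_input_features - 1.
    if num_frames_to_predict_for is None:
        return num_input_features - 1
    return num_frames_to_predict_for


print(resolve_num_frames(num_input_features=3))
print(resolve_num_frames(num_input_features=1, num_frames_to_predict_for=2))
```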
-
gluoncv.model_zoo.
get_nasnet
(repeat=6, penultimate_filters=4032, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper
- Parameters
repeat (int) – Number of cell repeats
penultimate_filters (int) – Number of filters in the penultimate layer of the network
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
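The pretrained argument accepts either a boolean or a string throughout this module. The sketch below restates that convention in plain Python; interpret_pretrained is a hypothetical helper written for illustration, not a gluoncv function.

```python
from typing import Optional, Tuple, Union


def interpret_pretrained(pretrained: Union[bool, str]) -> Tuple[bool, Optional[str]]:
    """Return (load_weights, version_tag) for a `pretrained` argument.

    Hypothetical helper: it only mirrors the documented convention.
    """
    if isinstance(pretrained, str):
        # A string selects a specific version of the pretrained weights.
        return True, pretrained
    # A boolean switches the default pretrained weights on or off.
    return bool(pretrained), None
```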
-
gluoncv.model_zoo.
get_psp
(dataset='pascal_voc', backbone='resnet50', pretrained=False, root='~/.mxnet/models', ctx=cpu(0), pretrained_base=True, **kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
dataset (str, default pascal_voc) – The dataset that model pretrained on. (pascal_voc, ade20k)
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
Examples
>>> model = get_psp(dataset='pascal_voc', backbone='resnet50', pretrained=False)
>>> print(model)
-
gluoncv.model_zoo.
get_psp_resnet101_ade
(**kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_psp_resnet101_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_psp_resnet101_citys
(**kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_psp_resnet101_citys(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_psp_resnet101_coco
(**kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_psp_resnet101_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_psp_resnet101_voc
(**kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_psp_resnet101_voc(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_psp_resnet50_ade
(**kwargs)[source]¶ Pyramid Scene Parsing Network
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = get_psp_resnet50_ade(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
get_resnet
(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', use_se=False, **kwargs)[source]¶ ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
version (int) – Version of ResNet. Options are 1, 2.
num_layers (int) – Numbers of layers. Options are 18, 34, 50, 101, 152.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
use_se (bool, default False) – Whether to use Squeeze-and-Excitation module
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
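A minimal sketch of how the version and num_layers options listed above could be validated and turned into a name for gluoncv.model_zoo.get_model. The helper resnet_model_name is hypothetical, and the 'resnet<num_layers>_v<version>' pattern is an assumption based on model names such as resnet50_v1; it is not stated in this entry.

```python
VALID_VERSIONS = (1, 2)
VALID_NUM_LAYERS = (18, 34, 50, 101, 152)


def resnet_model_name(version: int, num_layers: int) -> str:
    """Hypothetical helper: check the documented options and build a model name."""
    if version not in VALID_VERSIONS:
        raise ValueError(f"version must be one of {VALID_VERSIONS}, got {version}")
    if num_layers not in VALID_NUM_LAYERS:
        raise ValueError(f"num_layers must be one of {VALID_NUM_LAYERS}, got {num_layers}")
    # Assumed naming pattern, e.g. 'resnet50_v1'.
    return f"resnet{num_layers}_v{version}"
```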
-
gluoncv.model_zoo.
get_resnext
(num_layers, cardinality=32, bottleneck_width=4, use_se=False, deep_stem=False, avg_down=False, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ ResNext model from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
num_layers (int) – Numbers of layers. Options are 50, 101.
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
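To illustrate how cardinality and bottleneck_width interact, the sketch below computes the width of the grouped convolution in a ResNeXt bottleneck. It assumes the width rule of the reference ResNeXt implementation (per-group width scaled by channels / 64); whether gluoncv uses exactly this rule is an assumption, and grouped_conv_width is a hypothetical helper.

```python
import math


def grouped_conv_width(channels: int, cardinality: int = 32,
                       bottleneck_width: int = 4) -> int:
    """Hypothetical helper: total channels of the grouped 3x3 convolution."""
    # Per-group width grows with the stage's base channel count (64 in the first stage).
    per_group = int(math.floor(channels * (bottleneck_width / 64)))
    # Every one of the `cardinality` groups gets `per_group` channels.
    return cardinality * per_group
```

With the defaults (the 32x4d configuration), the first stage has 32 groups of 4 channels each, and the width doubles with each later stage.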
-
gluoncv.model_zoo.
get_se_resnet
(version, num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ SE_ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. SE_ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
version (int) – Version of ResNet. Options are 1, 2.
num_layers (int) – Numbers of layers. Options are 18, 34, 50, 101, 152.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
get_ssd
(name, base_size, features, filters, sizes, ratios, steps, classes, dataset, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', anchor_generator=<class 'gluoncv.model_zoo.ssd.anchor.SSDAnchorGenerator'>, **kwargs)[source]¶ Get SSD models.
- Parameters
name (str or None) – Model name, if None is used, you must specify features to be a HybridBlock.
base_size (int) – Base image size for training, this is fixed once training is assigned. A fixed base size still allows you to have variable input size during test.
features (iterable of str or HybridBlock) – List of network internal output names, in order to specify which layers are used for predicting bbox values. If name is None, features must be a HybridBlock that generates multiple outputs for prediction.
filters (iterable of float or None) – List of convolution layer channels which is going to be appended to the base network feature extractor. If name is None, this is ignored.
sizes (iterable of float) – Sizes of anchor boxes, given as a list of floats in increasing order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.
ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must equal the number of SSD output layers.
steps (list of int) – Step size of anchor boxes in each output layer.
classes (iterable of str) – Names of categories.
dataset (str) – Name of dataset. This is used to identify model name because models trained on different datasets are going to be very different.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
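The conversion of sizes described in the parameter list (N + 1 values becoming N consecutive pairs) can be sketched as follows. This is an illustration only: pair_anchor_sizes is a hypothetical helper, and treating the required order as strictly increasing is an assumption.

```python
def pair_anchor_sizes(sizes):
    """Turn N + 1 increasing sizes into N (smaller, larger) pairs, one per output layer."""
    if len(sizes) < 2:
        raise ValueError("need at least two sizes to form one pair")
    if any(a >= b for a, b in zip(sizes, sizes[1:])):
        raise ValueError("sizes must be in increasing order")
    # Each output layer gets the size at its own index and the next one.
    return list(zip(sizes[:-1], sizes[1:]))


pairs = pair_anchor_sizes([30, 60, 90])
print(pairs)
```

For the documented example sizes = [30, 60, 90], the two pairs correspond to the two stages.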
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
get_vgg
(num_layers, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
num_layers (int) – Number of layers for the variant of VGG. Options are 11, 13, 16, 19.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
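Signatures in this module show root='~/.mxnet/models' while some descriptions give the default as $MXNET_HOME/models. One way to reconcile the two is sketched below, assuming that MXNET_HOME falls back to ~/.mxnet when the environment variable is unset. default_model_root is a hypothetical helper, not a gluoncv function.

```python
import os


def default_model_root() -> str:
    """Hypothetical helper: resolve the directory used to keep model parameters."""
    # Assumption: MXNET_HOME overrides the ~/.mxnet base directory when set.
    base = os.environ.get("MXNET_HOME", os.path.join("~", ".mxnet"))
    # Expand '~' so the result is a usable filesystem path.
    return os.path.expanduser(os.path.join(base, "models"))
```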
-
gluoncv.model_zoo.
get_vgg_atrous_extractor
(num_layers, im_size, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Get VGG atrous feature extractor networks.
- Parameters
num_layers (int) – VGG types, can be 11,13,16,19.
im_size (int) – VGG detection input size, can be 300, 512.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (mx.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
- Returns
The returned network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
get_xcetption
(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Xception model from
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
get_xcetption_71
(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Xception model from
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
get_yolov3
(name, stages, filters, anchors, strides, classes, dataset, pretrained=False, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ Get YOLOV3 models.
- Parameters
name (str or None) – Model name, if None is used, you must specify features to be a HybridBlock.
stages – List of network internal output names, in order to specify which layers are used for predicting bbox values. If name is None, features must be a HybridBlock that generates multiple outputs for prediction.
filters (iterable of float or None) – List of convolution layer channels which is going to be appended to the base network feature extractor. If name is None, this is ignored.
sizes (iterable of float) – Sizes of anchor boxes, given as a list of floats in increasing order. The length of sizes must be len(layers) + 1. For example, a two-stage SSD model can have sizes = [30, 60, 90], which converts to [30, 60] and [60, 90] for the two stages, respectively. For more details, please refer to the original paper.
ratios (iterable of list) – Aspect ratios of anchors in each output layer. Its length must equal the number of SSD output layers.
steps (list of int) – Step size of anchor boxes in each output layer.
classes (iterable of str) – Names of categories.
dataset (str) – Name of dataset. This is used to identify model name because models trained on different datasets are going to be very different.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (mxnet.Context) – Context such as mx.cpu(), mx.gpu(0).
root (str) – Model weights storing path.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
A YOLOV3 detection network.
- Return type
-
gluoncv.model_zoo.
googlenet
(classes=1000, pretrained=False, pretrained_base=True, ctx=cpu(0), dropout_ratio=0.4, aux_logits=False, root='~/.mxnet/models', partial_bn=False, **kwargs)[source]¶ GoogleNet model from “Going Deeper with Convolutions” paper. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
partial_bn (bool, default False) – Freeze all batch normalization layers during training except the first layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
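The partial_bn rule ("freeze all batch normalization layers except the first") can be illustrated on a plain list of layer names. This sketch makes no claim about how gluoncv selects or freezes layers; the helper and the layer names are hypothetical.

```python
def batchnorm_layers_to_freeze(layer_names):
    """Hypothetical helper: return the BatchNorm layer names to freeze, all but the first."""
    # Keep the network order, picking only layers that look like BatchNorm by name.
    bn_names = [name for name in layer_names if "batchnorm" in name.lower()]
    # The first BatchNorm layer stays trainable; the rest are frozen.
    return bn_names[1:]
```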
-
gluoncv.model_zoo.
gpu_iou
(bbox_a_tensor, bbox_b_tensor)[source]¶
- Parameters
bbox_a_tensor –
bbox_b_tensor –
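This entry carries no description. Going by the name alone, the function presumably computes intersection over union (IoU) between two sets of bounding boxes. The plain-Python sketch below shows pairwise IoU under the assumption that boxes are given as (xmin, ymin, xmax, ymax); it is not the gluoncv implementation, and the box format and return layout are assumptions.

```python
def pairwise_iou(boxes_a, boxes_b):
    """IoU for every pair of boxes, each box given as (xmin, ymin, xmax, ymax)."""
    result = []
    for ax1, ay1, ax2, ay2 in boxes_a:
        area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
        row = []
        for bx1, by1, bx2, by2 in boxes_b:
            area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
            # Width and height of the overlap, clamped at zero for disjoint boxes.
            inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
            inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
            inter = inter_w * inter_h
            union = area_a + area_b - inter
            row.append(inter / union if union > 0 else 0.0)
        result.append(row)
    return result
```

The result has one row per box in boxes_a and one column per box in boxes_b.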
-
gluoncv.model_zoo.
hrnet_w18_small_v1_c
(**kwargs)[source]¶ hrnet_w18_small_v1 for ImageNet classification
-
gluoncv.model_zoo.
hrnet_w18_small_v1_s
(**kwargs)[source]¶ hrnet_w18_small_v1 for Cityscapes segmentation
-
gluoncv.model_zoo.
hrnet_w18_small_v2_c
(**kwargs)[source]¶ hrnet_w18_small_v2 for ImageNet classification
-
gluoncv.model_zoo.
hrnet_w18_small_v2_s
(**kwargs)[source]¶ hrnet_w18_small_v2 for Cityscapes segmentation
-
gluoncv.model_zoo.
i3d_inceptionv1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inception v1 model trained on Kinetics400 dataset from “Going Deeper with Convolutions” paper.
Inflated 3D model (I3D) from “Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset” paper.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
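As an illustration of num_segments, the sketch below divides the frame indices of a video into near-equal ranges. It shows only the even-division idea from the parameter description; how gluoncv samples frames within each segment is not covered here, and segment_bounds is a hypothetical helper.

```python
def segment_bounds(num_frames: int, num_segments: int = 1):
    """Split frame indices [0, num_frames) into num_segments near-equal (start, end) ranges."""
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    bounds = []
    for i in range(num_segments):
        # Integer arithmetic keeps the ranges contiguous and covers every frame once.
        start = i * num_frames // num_segments
        end = (i + 1) * num_frames // num_segments
        bounds.append((start, end))
    return bounds
```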
-
gluoncv.model_zoo.
i3d_inceptionv3_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inception v3 model trained on Kinetics400 dataset from “Rethinking the Inception Architecture for Computer Vision” paper.
Inflated 3D model (I3D) from “Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset” paper.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
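The interplay between pretrained and pretrained_base noted above can be sketched as a small decision function. This is a minimal illustration of the documented rule (pretrained_base has no effect once pretrained is set); effective_pretrained_base is a hypothetical helper and not part of gluoncv.

```python
def effective_pretrained_base(pretrained, pretrained_base=True):
    """Hypothetical helper: decide whether base-network weights are loaded separately."""
    if pretrained:
        # Full pretrained weights already cover the base network,
        # so the pretrained_base flag is ignored.
        return False
    return bool(pretrained_base)
```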
-
gluoncv.model_zoo.
i3d_nl10_resnet101_v1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet101 backbone and 10 non-local blocks trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_nl10_resnet50_v1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone and 10 non-local blocks trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_nl5_resnet101_v1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet101 backbone and 5 non-local blocks trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_nl5_resnet50_v1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone and 5 non-local blocks trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_resnet101_v1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet101 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_resnet50_v1_custom
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, use_kinetics_pretrain=True, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone. Customized for users' own datasets.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
use_kinetics_pretrain (bool.) – Whether to load Kinetics-400 pre-trained model weights.
-
gluoncv.model_zoo.
i3d_resnet50_v1_hmdb51
(nclass=51, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, use_kinetics_pretrain=True, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone trained on HMDB51 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_resnet50_v1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, bn_frozen=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_resnet50_v1_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
i3d_resnet50_v1_ucf101
(nclass=101, pretrained=False, pretrained_base=True, ctx=cpu(0), root='~/.mxnet/models', use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, use_kinetics_pretrain=True, feat_ext=False, **kwargs)[source]¶ Inflated 3D model (I3D) with ResNet50 backbone trained on UCF101 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
bn_frozen (bool.) – Whether to freeze weight and bias of BN layers.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
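The signatures show root='~/.mxnet/models' while the parameter text says $MXNET_HOME/models. A minimal sketch of how the two can be reconciled, assuming MXNET_HOME falls back to ~/.mxnet when unset (the helper name is hypothetical and not part of gluoncv):

```python
import os


def default_model_root(env=None):
    """Resolve the directory used to store model parameters.

    Assumption: MXNET_HOME overrides the ~/.mxnet base directory.
    """
    env = os.environ if env is None else env
    base = env.get("MXNET_HOME", os.path.join("~", ".mxnet"))
    return os.path.expanduser(os.path.join(base, "models"))


print(default_model_root())
```

Passing an explicit root to a model constructor bypasses this default.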
-
gluoncv.model_zoo.
inception_v3
(pretrained=False, ctx=cpu(0), root='~/.mxnet/models', partial_bn=False, **kwargs)[source]¶ Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
partial_bn (bool, default False) – Freeze all batch normalization layers during training except the first layer.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
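A hedged usage sketch for norm_layer and norm_kwargs: it builds inception_v3 with synchronized batch normalization. The two helper functions are hypothetical wrappers written for this example, and the MXNet and GluonCV imports are deferred so the snippet can be loaded without either package installed.

```python
def build_norm_kwargs(num_devices):
    """Return the norm_kwargs dict for SyncBatchNorm over num_devices devices."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return {"num_devices": num_devices}


def build_inception_v3_syncbn(num_devices=4):
    """Construct inception_v3 with SyncBatchNorm (requires mxnet and gluoncv)."""
    # Deferred imports: only needed when the model is actually built.
    import mxnet as mx
    from gluoncv import model_zoo

    return model_zoo.inception_v3(
        pretrained=False,
        norm_layer=mx.gluon.contrib.nn.SyncBatchNorm,
        norm_kwargs=build_norm_kwargs(num_devices),
    )
```

Calling build_inception_v3_syncbn() returns an untrained network; set pretrained=True in the inner call to download weights instead.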
-
gluoncv.model_zoo.
inceptionv1_hmdb51
(nclass=51, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV1 model trained on HMDB51 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv1_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV1 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv1_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV1 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv1_ucf101
(nclass=101, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV1 model trained on UCF101 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv3_hmdb51
(nclass=51, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV3 model trained on HMDB51 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv3_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV3 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv3_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV3 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
inceptionv3_ucf101
(nclass=101, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=True, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ InceptionV3 model trained on UCF101 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default True.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
mask_rcnn_fpn_resnet101_v1d_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_fpn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mask_rcnn_fpn_resnet18_v1b_coco
(pretrained=False, pretrained_base=True, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_fpn_resnet18_v1b_coco(pretrained=True)
>>> print(model)
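The rpn_test_pre_nms and rpn_test_post_nms parameters cap the number of region proposals before and after NMS. The sketch below shows that two-stage cut in plain Python; select_proposals and the nms_keep callback are hypothetical names for illustration, not the library's RPN code.

```python
def select_proposals(scores, pre_nms, post_nms, nms_keep):
    """Keep the pre_nms best-scoring proposals, run NMS, then keep post_nms.

    nms_keep is a callable that receives candidate indices sorted by
    descending score and returns the survivors in the same order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    candidates = order[:pre_nms]      # top proposals before NMS
    survivors = nms_keep(candidates)  # suppression step
    return survivors[:post_nms]       # top proposals after NMS


scores = [0.1, 0.9, 0.5, 0.7]
print(select_proposals(scores, pre_nms=3, post_nms=2, nms_keep=lambda idx: idx))
```

Lower values trade recall for speed, since fewer boxes reach the later RCNN stages.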
-
gluoncv.model_zoo.
mask_rcnn_fpn_resnet50_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_fpn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mask_rcnn_fpn_syncbn_mobilenet1_0_coco
(pretrained=False, pretrained_base=True, num_devices=0, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_fpn_syncbn_mobilenet1_0_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mask_rcnn_fpn_syncbn_resnet18_v1b_coco
(pretrained=False, pretrained_base=True, num_devices=0, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
num_devices (int, default is 0) – Number of devices for sync batch norm layer. If less than 1, use all devices available.
rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_fpn_syncbn_resnet18_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mask_rcnn_resnet101_v1d_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_resnet101_v1d_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mask_rcnn_resnet18_v1b_coco
(pretrained=False, pretrained_base=True, rcnn_max_dets=1000, rpn_test_pre_nms=6000, rpn_test_post_nms=1000, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
rcnn_max_dets (int, default is 1000) – Number of rois to retain in RCNN.
rpn_test_pre_nms (int, default is 6000) – Filter top proposals before NMS in testing of RPN.
rpn_test_post_nms (int, default is 1000) – Return top proposal results after NMS in testing of RPN.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_resnet18_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mask_rcnn_resnet50_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ Mask RCNN model from the paper “He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN”
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
Examples
>>> model = mask_rcnn_resnet50_v1b_coco(pretrained=True)
>>> print(model)
-
gluoncv.model_zoo.
mobilenet0_25
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.25.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
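To make the width multiplier concrete, here is a minimal sketch of how a multiplier such as 0.25 thins a network by scaling each layer's channel count. The channel list is an illustrative subset and the truncating int() rounding is an assumption; some implementations round to a hardware-friendly multiple instead.

```python
def scale_channels(base_channels, multiplier):
    """Scale every channel count by the width multiplier (truncating)."""
    return [int(c * multiplier) for c in base_channels]


# Illustrative channel widths only, not the full layer list of any model.
base = [32, 64, 128, 256, 512, 1024]
for m in (0.25, 0.5, 0.75, 1.0):
    print(m, scale_channels(base, m))
```

Because most of the cost sits in pointwise convolutions whose cost depends on input channels times output channels, halving the width cuts that cost to roughly a quarter.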
-
gluoncv.model_zoo.
mobilenet0_5
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.5.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
mobilenet0_75
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.75.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
mobilenet1_0
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 1.0.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
mobilenet_v2_0_25
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper (https://arxiv.org/abs/1801.04381).
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
mobilenet_v2_0_5
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper (https://arxiv.org/abs/1801.04381).
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
mobilenet_v2_0_75
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper (https://arxiv.org/abs/1801.04381).
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
mobilenet_v2_1_0
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper (https://arxiv.org/abs/1801.04381).
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
nasnet_4_1056
(**kwargs)[source]¶ NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper
- Parameters
repeat (int) – Number of cell repeats
penultimate_filters (int) – Number of filters in the penultimate layer of the network
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
nasnet_5_1538
(**kwargs)[source]¶ NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper
- Parameters
repeat (int) – Number of cell repeats
penultimate_filters (int) – Number of filters in the penultimate layer of the network
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
nasnet_6_4032
(**kwargs)[source]¶ NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper
- Parameters
repeat (int) – Number of cell repeats
penultimate_filters (int) – Number of filters in the penultimate layer of the network
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
nasnet_7_1920
(**kwargs)[source]¶ NASNet A model from “Learning Transferable Architectures for Scalable Image Recognition” paper
- Parameters
repeat (int) – Number of cell repeats
penultimate_filters (int) – Number of filters in the penultimate layer of the network
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
nms_fallback
(boxes, thresh)[source]¶ Perform non-maximum suppression and return the indices of the kept boxes.
- Parameters
boxes ([[x, y, xmax, ymax, score]]) – Input boxes.
- Returns
Kept box indices.
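For readers unfamiliar with the technique, the following is a minimal pure-Python sketch of greedy non-maximum suppression over boxes in the [x, y, xmax, ymax, score] layout. It is not the library's implementation: greedy_nms and iou are names chosen for this example, and treating thresh as an IoU threshold is an assumption.

```python
def iou(a, b):
    """Intersection over union of two [x, y, xmax, ymax, ...] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, thresh):
    """Return indices of boxes kept after greedy NMS, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][4], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep


boxes = [[0, 0, 10, 10, 0.9], [1, 1, 11, 11, 0.8], [20, 20, 30, 30, 0.7]]
print(greedy_nms(boxes, 0.5))
```

The first two boxes overlap with an IoU of about 0.68, so with a threshold of 0.5 the lower-scoring one is suppressed.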
-
gluoncv.model_zoo.
p3d_resnet101_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)[source]¶ The Pseudo 3D network (P3D) with ResNet101 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
p3d_resnet50_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)[source]¶ The Pseudo 3D network (P3D) with ResNet50 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.pretrained_model_list()
Get the list of models that have pretrained weights available.
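One possible way to use this function is to narrow the returned names down to a model family before choosing one. The sketch below is only an illustration: names_with_prefix and pretrained_names are hypothetical helpers, and the sample list is a stand-in, not the real output of pretrained_model_list().

```python
def names_with_prefix(names, prefix):
    """Keep the model names that start with `prefix`, sorted alphabetically."""
    return sorted(n for n in names if n.startswith(prefix))


def pretrained_names(prefix):
    """Query GluonCV for pretrained models whose name starts with `prefix`.

    Needs gluoncv and mxnet installed; the import is done lazily so that
    names_with_prefix stays usable without them.
    """
    from gluoncv.model_zoo import pretrained_model_list
    return names_with_prefix(pretrained_model_list(), prefix)


# Stand-in list for illustration only.
sample = ["resnet18_v1", "p3d_resnet50_kinetics400", "resnet101_v1b"]
print(names_with_prefix(sample, "resnet"))  # ['resnet101_v1b', 'resnet18_v1']
```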
-
gluoncv.model_zoo.r2plus1d_resnet101_kinetics400(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)
R2Plus1D with ResNet101 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.r2plus1d_resnet152_kinetics400(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)
R2Plus1D with ResNet152 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.r2plus1d_resnet18_kinetics400(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)
R2Plus1D with ResNet18 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.r2plus1d_resnet34_kinetics400(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)
R2Plus1D with ResNet34 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.r2plus1d_resnet50_kinetics400(nclass=400, pretrained=False, pretrained_base=True, root='~/.mxnet/models', num_segments=1, num_crop=1, feat_ext=False, ctx=cpu(0), **kwargs)
R2Plus1D with ResNet50 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.residualattentionnet128(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
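The norm_layer and norm_kwargs parameters are meant to be passed together. The following is a minimal sketch of that pairing, using only the names mentioned in the parameter list above; norm_options and build_sync_bn_attention_net are hypothetical helpers, and the second one needs mxnet and gluoncv installed (it is defined here but not called).

```python
def norm_options(norm_layer, **norm_kwargs):
    """Bundle a normalization layer class and its extra arguments as model kwargs."""
    return {"norm_layer": norm_layer, "norm_kwargs": dict(norm_kwargs)}


def build_sync_bn_attention_net(num_devices):
    """Construct residualattentionnet128 with SyncBatchNorm across `num_devices`.

    Imports are lazy because they require mxnet and gluoncv.
    """
    from mxnet.gluon.contrib.nn import SyncBatchNorm
    from gluoncv.model_zoo import residualattentionnet128
    return residualattentionnet128(**norm_options(SyncBatchNorm, num_devices=num_devices))


# The keyword arguments produced for num_devices=4, shown with a placeholder class.
print(norm_options(object, num_devices=4)["norm_kwargs"])  # {'num_devices': 4}
```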
-
gluoncv.model_zoo.residualattentionnet164(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.residualattentionnet200(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.residualattentionnet236(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.residualattentionnet452(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.residualattentionnet56(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.residualattentionnet92(**kwargs)
AttentionModel model from “Residual Attention Network for Image Classification” paper.
- Parameters
input_size (int) – Input size of net. Options are 32, 224.
num_layers (int) – Number of layers. Options are 56, 92, 128, 164, 200, 236, 452.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnest101(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNeSt-101 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNeSt, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
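The pretrained argument accepts either a boolean or a string. The sketch below shows one way to read that convention; resolve_pretrained and the "default" placeholder tag are hypothetical and only mirror the wording of the parameter description, not the library's internal code.

```python
def resolve_pretrained(pretrained, default_tag="default"):
    """Map the `pretrained` argument to a weight tag, or None for random init."""
    if isinstance(pretrained, str):
        return pretrained   # explicit version tag requested
    if pretrained:
        return default_tag  # True: use the default pretrained weights
    return None             # False: do not load pretrained weights


for value in (False, True, "a1b2c3d4"):
    print(repr(value), "->", resolve_pretrained(value))
```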
-
gluoncv.model_zoo.resnest14(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNeSt-14 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNeSt, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnest200(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNeSt-200 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNeSt, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnest26(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNeSt-26 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNeSt, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnest269(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNeSt-269 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNeSt, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnest50(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNeSt-50 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNeSt, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet101_v1(**kwargs)
ResNet-101 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
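The root default is written as '$MXNET_HOME/models' here and as '~/.mxnet/models' in other signatures on this page. A plausible reading is that both name the same directory when MXNET_HOME is unset. The sketch below works under that assumption; resolve_model_root is a hypothetical helper and the fallback to ~/.mxnet is assumed, not taken from this page.

```python
import os


def resolve_model_root(root=None):
    """Return the directory where downloaded parameters would be kept."""
    if root is None:
        # Assumption: $MXNET_HOME falls back to ~/.mxnet when it is unset.
        base = os.environ.get("MXNET_HOME", os.path.join("~", ".mxnet"))
        root = os.path.join(base, "models")
    # Expand a leading ~ to the current user's home directory.
    return os.path.expanduser(root)


print(resolve_model_root())
```

Passing an explicit root, for example a shared cache directory, bypasses the environment variable in this sketch.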
-
gluoncv.model_zoo.resnet101_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1b-101 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from ImageNet classification pretrained models.
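To make "a stride 8 model" concrete, the arithmetic below compares the final feature-map size with and without dilated=True. It assumes the usual ResNet output stride of 32 for the non-dilated network and standard padding in every downsampling layer; feature_map_size is a hypothetical helper, not part of GluonCV.

```python
def feature_map_size(input_size, dilated=False):
    """Spatial size of the final feature map for a square input."""
    stride = 8 if dilated else 32  # overall output stride of the backbone
    # Ceiling division: with standard padding, each stride-2 stage rounds up.
    return -(-input_size // stride)


print(feature_map_size(224))                # 7
print(feature_map_size(224, dilated=True))  # 28
```

The larger feature map is why dilated backbones are commonly used for dense prediction tasks such as segmentation.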
-
gluoncv.model_zoo.resnet101_v1b_gn(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1b-101 GroupNorm model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from ImageNet classification pretrained models.
-
gluoncv.model_zoo.resnet101_v1b_kinetics400(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
ResNet101 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.resnet101_v1b_sthsthv2(nclass=174, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
ResNet101 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.resnet101_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1c-101 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet101_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1d-101 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet101_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1e-101 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet101_v1s(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1s-101 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet101_v2(**kwargs)
ResNet-101 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet152_v1(**kwargs)
ResNet-152 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet152_v1b(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1b-152 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from ImageNet classification pretrained models.
-
gluoncv.model_zoo.resnet152_v1b_kinetics400(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
ResNet152 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.resnet152_v1b_sthsthv2(nclass=174, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
ResNet152 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.resnet152_v1c(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1c-152 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet152_v1d(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1d-152 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.resnet152_v1e(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)
Constructs a ResNetV1e-152 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet152_v1s
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1s-152 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet152_v2
(**kwargs)[source]¶ ResNet-152 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
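The default root is written as '$MXNET_HOME/models' here and as '~/.mxnet/models' for other constructors on this page. A minimal sketch of how such a default could be resolved, assuming MXNET_HOME falls back to ~/.mxnet when the environment variable is unset. resolve_model_root is a hypothetical helper, not a GluonCV function.

```python
import os


def resolve_model_root(environ=None):
    """Return the directory that would hold downloaded model parameters.

    Assumption: MXNET_HOME defaults to ~/.mxnet when it is not set.
    """
    environ = os.environ if environ is None else environ
    home = environ.get("MXNET_HOME", os.path.join("~", ".mxnet"))
    return os.path.expanduser(os.path.join(home, "models"))


# With MXNET_HOME set, parameters go under that directory.
print(resolve_model_root({"MXNET_HOME": os.path.join(os.sep, "data", "mx")}))
# Without it, the path falls back to the user's home directory.
print(resolve_model_root({}))
```

Passing an explicit root to a constructor sidesteps this lookup entirely, which can be convenient on machines where the home directory is small or read-only.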
-
gluoncv.model_zoo.
resnet18_v1
(**kwargs)[source]¶ ResNet-18 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
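The norm_layer and norm_kwargs arguments work as a pair: the first names the class to instantiate and the second supplies its constructor arguments. The sketch below shows that pattern with a stand-in class so that it runs without MXNet. StandInSyncBatchNorm and make_norm are hypothetical names; the real layer would be mxnet.gluon.contrib.nn.SyncBatchNorm.

```python
class StandInSyncBatchNorm:
    """Stand-in for a normalization layer class (not the MXNet class)."""

    def __init__(self, num_devices=None):
        self.num_devices = num_devices


def make_norm(norm_layer, norm_kwargs=None):
    """Instantiate norm_layer, forwarding norm_kwargs when given."""
    kwargs = {} if norm_kwargs is None else norm_kwargs
    return norm_layer(**kwargs)


layer = make_norm(StandInSyncBatchNorm, norm_kwargs={"num_devices": 4})
print(layer.num_devices)  # 4
```

Keeping the class and its arguments separate lets the same model definition be built with either normalization layer without changing the network code.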
-
gluoncv.model_zoo.
resnet18_v1b
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1b-18 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from models pretrained on ImageNet classification.
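Setting last_gamma zeroes the scale of the final normalization layer in each residual branch, so every block initially passes its input through unchanged (apart from the activation). A small numeric sketch under that reading, with the hypothetical helper residual_block standing in for a real block and the normalization bias omitted:

```python
def residual_block(x, branch, gamma):
    """Compute relu(x + gamma * branch) element-wise.

    `branch` stands for the normalized output of the residual branch and
    `gamma` for the scale of its last normalization layer.
    """
    return [max(0.0, xi + gamma * bi) for xi, bi in zip(x, branch)]


x = [1.0, 2.0, 3.0]
branch = [0.5, -4.0, 2.0]

print(residual_block(x, branch, gamma=1.0))  # [1.5, 0.0, 5.0]
print(residual_block(x, branch, gamma=0.0))  # [1.0, 2.0, 3.0]
```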
-
gluoncv.model_zoo.
resnet18_v1b_custom
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, use_kinetics_pretrain=True, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet18 model customized for any dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
use_kinetics_pretrain (bool, default True.) – Whether to load pretrained weights on Kinetics400 dataset as model initialization.
-
gluoncv.model_zoo.
resnet18_v1b_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet18 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
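The num_segments parameter describes how many equal parts a video is divided into. One common way to pick a frame from each part is to take its center; the sketch below assumes that convention, which is not necessarily the sampling used by GluonCV's data loaders. segment_centers is a hypothetical helper.

```python
def segment_centers(num_frames, num_segments=1):
    """Split a video into equal segments and return each segment's center frame."""
    if num_segments < 1 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    seg_len = num_frames / num_segments
    return [int(seg_len * i + seg_len / 2) for i in range(num_segments)]


print(segment_centers(30, num_segments=3))  # [5, 15, 25]
print(segment_centers(30, num_segments=1))  # [15]
```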
-
gluoncv.model_zoo.
resnet18_v1b_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet18 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
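The partial_bn rule (freeze every batch normalization layer except the first) can be sketched as a simple selection over layer names. The layer names and the helper frozen_batchnorm_layers below are hypothetical; a real network would expose its layers differently.

```python
def frozen_batchnorm_layers(layer_names, partial_bn=False):
    """Return the batch-norm layers to freeze, keeping the first one trainable."""
    bn_layers = [name for name in layer_names if "batchnorm" in name]
    return bn_layers[1:] if partial_bn else []


layers = ["conv0", "batchnorm0", "conv1", "batchnorm1", "conv2", "batchnorm2"]
print(frozen_batchnorm_layers(layers, partial_bn=True))   # ['batchnorm1', 'batchnorm2']
print(frozen_batchnorm_layers(layers, partial_bn=False))  # []
```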
-
gluoncv.model_zoo.
resnet18_v2
(**kwargs)[source]¶ ResNet-18 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet34_v1
(**kwargs)[source]¶ ResNet-34 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet34_v1b
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1b-34 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from models pretrained on ImageNet classification.
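The use_global_stats flag switches the normalization statistics from the current minibatch to stored running values. The difference can be sketched in plain Python; normalize is a hypothetical helper, and the running mean and variance below are made-up numbers.

```python
def normalize(batch, running_mean, running_var, use_global_stats=False, eps=1e-5):
    """Normalize a 1-D batch with minibatch or stored (global) statistics."""
    if use_global_stats:
        mean, var = running_mean, running_var
    else:
        mean = sum(batch) / len(batch)
        var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [(v - mean) / (var + eps) ** 0.5 for v in batch]


batch = [2.0, 4.0, 6.0]
# Minibatch statistics: the output is centered on the batch itself.
print(normalize(batch, running_mean=0.0, running_var=1.0))
# Global statistics: the output depends only on the stored values.
print(normalize(batch, running_mean=0.0, running_var=1.0, use_global_stats=True))
```

With small fine-tuning batches the minibatch estimates can be noisy, which is one reason the stored statistics may be preferred in that setting.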
-
gluoncv.model_zoo.
resnet34_v1b_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet34 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
resnet34_v1b_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet34 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
resnet34_v2
(**kwargs)[source]¶ ResNet-34 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet50_v1
(**kwargs)[source]¶ ResNet-50 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet50_v1b
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1b-50 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from models pretrained on ImageNet classification.
-
gluoncv.model_zoo.
resnet50_v1b_custom
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), use_kinetics_pretrain=True, **kwargs)[source]¶ ResNet50 model customized for any dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
use_kinetics_pretrain (bool, default True.) – Whether to load pretrained weights on Kinetics400 dataset as model initialization.
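The three weight-related flags interact: pretrained overrides pretrained_base, and use_kinetics_pretrain adds a Kinetics400 initialization. The sketch below spells out one plausible precedence. The ordering between use_kinetics_pretrain and pretrained_base is an assumption, and weight_sources is a hypothetical helper rather than part of GluonCV.

```python
def weight_sources(pretrained=False, pretrained_base=True, use_kinetics_pretrain=True):
    """Describe which weights would be used, in an assumed order of precedence."""
    if pretrained:
        # A string is treated as a version tag, True as the default version.
        tag = pretrained if isinstance(pretrained, str) else "default"
        return "full model (%s)" % tag
    if use_kinetics_pretrain:
        # Assumption: Kinetics400 initialization wins over the base network.
        return "Kinetics400 weights as initialization"
    if pretrained_base:
        return "pretrained base network, random extra layers"
    return "random initialization"


print(weight_sources(pretrained=True))
print(weight_sources(use_kinetics_pretrain=False))
print(weight_sources(pretrained_base=False, use_kinetics_pretrain=False))
```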
-
gluoncv.model_zoo.
resnet50_v1b_gn
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1b-50 GroupNorm model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
last_gamma (bool, default False) – Whether to initialize the gamma of the last BatchNorm layer in each bottleneck to zero.
use_global_stats (bool, default False) – Whether to force BatchNorm to use global statistics instead of minibatch statistics; optionally set to True when fine-tuning from models pretrained on ImageNet classification.
-
gluoncv.model_zoo.
resnet50_v1b_hmdb51
(nclass=51, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet50 model trained on HMDB51 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
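During evaluation each video contributes num_segments * num_crop views. The sketch below assumes the per-view class scores are averaged into a single prediction, which is a common evaluation convention. average_views and ALLOWED_CROPS are hypothetical names, and the scores are made-up numbers.

```python
ALLOWED_CROPS = (1, 3, 10)


def average_views(view_scores, num_segments=1, num_crop=1):
    """Average per-view class scores into one score vector per video.

    view_scores holds num_segments * num_crop score lists for one video.
    """
    if num_crop not in ALLOWED_CROPS:
        raise ValueError("num_crop must be 1, 3 or 10")
    expected = num_segments * num_crop
    if len(view_scores) != expected:
        raise ValueError("expected %d views, got %d" % (expected, len(view_scores)))
    num_classes = len(view_scores[0])
    return [sum(v[c] for v in view_scores) / expected for c in range(num_classes)]


# One segment, three crops, two classes.
scores = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]
print(average_views(scores, num_segments=1, num_crop=3))
```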
-
gluoncv.model_zoo.
resnet50_v1b_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet50 model trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
resnet50_v1b_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet50 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
resnet50_v1b_ucf101
(nclass=101, pretrained=False, pretrained_base=True, use_tsn=False, partial_bn=False, num_segments=1, num_crop=1, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ ResNet50 model trained on UCF101 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
-
gluoncv.model_zoo.
resnet50_v1c
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1c-50 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet50_v1d
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1d-50 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet50_v1e
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1e-50 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet50_v1s
(pretrained=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ Constructs a ResNetV1s-50 model.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
dilated (bool, default False) – Whether to apply dilation strategy to ResNetV1b, yielding a stride 8 model.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnet50_v2
(**kwargs)[source]¶ ResNet-50 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnext101_32x4d
(**kwargs)[source]¶ ResNext101 32x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
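The suffix in names such as resnext101_32x4d encodes the cardinality (32 groups) and the bottleneck width (4). The sketch below parses such a name and computes the channel count of the grouped convolution, assuming the per-group width scales as channels * bottleneck_width / 64. parse_resnext_name and group_conv_channels are hypothetical helpers.

```python
import math
import re


def parse_resnext_name(name):
    """Split a name such as 'resnext101_32x4d' into (depth, cardinality, width)."""
    match = re.fullmatch(r"resnext(\d+)e?_(\d+)x(\d+)d", name)
    if match is None:
        raise ValueError("unrecognized ResNext name: %r" % name)
    depth, cardinality, width = (int(g) for g in match.groups())
    return depth, cardinality, width


def group_conv_channels(channels, cardinality, bottleneck_width):
    """Channels of the grouped convolution for a stage with `channels` base channels.

    Assumption: per-group width scales as channels * bottleneck_width / 64.
    """
    per_group = int(math.floor(channels * (bottleneck_width / 64)))
    return cardinality * per_group


print(parse_resnext_name("resnext101_32x4d"))  # (101, 32, 4)
print(group_conv_channels(64, 32, 4))          # 128
```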
-
gluoncv.model_zoo.
resnext101_64x4d
(**kwargs)[source]¶ ResNext101 64x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnext101e_64x4d
(**kwargs)[source]¶ ResNext101e 64x4d model modified from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
resnext50_32x4d
(**kwargs)[source]¶ ResNext50 32x4d model from “Aggregated Residual Transformations for Deep Neural Network” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet101_v1
(**kwargs)[source]¶ SE-ResNet-101 V1 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
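The SE prefix refers to the squeeze-and-excitation mechanism: each channel is rescaled by a gate computed from global channel averages. A minimal pure-Python sketch of that gating, with biases omitted and made-up weights; squeeze_excite is a hypothetical helper and does not mirror GluonCV's layer code.

```python
import math


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Rescale each channel by a gate computed from global channel averages.

    feature_maps: list of channels, each a flat list of activations.
    w_reduce / w_expand: weight matrices (lists of rows) of the two dense layers.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: one average per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: dense -> ReLU -> dense -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every activation in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]


maps = [[1.0, 3.0], [2.0, 2.0]]
# With all-zero weights every gate is sigmoid(0) = 0.5.
print(squeeze_excite(maps, w_reduce=[[0.0, 0.0]], w_expand=[[0.0], [0.0]]))
# [[0.5, 1.5], [1.0, 1.0]]
```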
-
gluoncv.model_zoo.
se_resnet101_v2
(**kwargs)[source]¶ SE-ResNet-101 V2 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet152_v1
(**kwargs)[source]¶ SE-ResNet-152 V1 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet152_v2
(**kwargs)[source]¶ SE-ResNet-152 V2 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet18_v1
(**kwargs)[source]¶ SE-ResNet-18 V1 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet18_v2
(**kwargs)[source]¶ SE-ResNet-18 V2 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet34_v1
(**kwargs)[source]¶ SE-ResNet-34 V1 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet34_v2
(**kwargs)[source]¶ SE-ResNet-34 V2 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet50_v1
(**kwargs)[source]¶ SE-ResNet-50 V1 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnet50_v2
(**kwargs)[source]¶ SE-ResNet-50 V2 model from “Squeeze-and-Excitation Networks” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnext101_32x4d
(**kwargs)[source]¶ SE-ResNext101 32x4d model from “Aggregated Residual Transformations for Deep Neural Networks” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
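The root defaults in this reference appear both as '$MXNET_HOME/models' and as '~/.mxnet/models'. The sketch below shows how the two can coincide, assuming MXNet's convention that the MXNET_HOME environment variable falls back to ~/.mxnet when unset. The helper name default_model_root is hypothetical, and platform-specific fallbacks are ignored.

```python
import os


def default_model_root(environ=None):
    """Return the directory where model parameters would be kept.

    `environ` is an optional mapping used instead of os.environ, which
    makes the lookup easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    # Assumption: MXNET_HOME, when set and non-empty, overrides ~/.mxnet.
    home = env.get("MXNET_HOME") or os.path.join(os.path.expanduser("~"), ".mxnet")
    return os.path.join(home, "models")
```

Passing root explicitly to any of the constructors in this module avoids relying on either default.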
-
gluoncv.model_zoo.
se_resnext101_64x4d
(**kwargs)[source]¶ SE-ResNext101 64x4d model from “Aggregated Residual Transformations for Deep Neural Networks” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnext101e_64x4d
(**kwargs)[source]¶ SE-ResNext101e 64x4d model modified from “Aggregated Residual Transformations for Deep Neural Networks” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
se_resnext50_32x4d
(**kwargs)[source]¶ SE-ResNext50 32x4d model from “Aggregated Residual Transformations for Deep Neural Networks” paper.
- Parameters
cardinality (int) – Number of groups
bottleneck_width (int) – Width of bottleneck block
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '~/.mxnet/models') – Location for keeping the model parameters.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
-
gluoncv.model_zoo.
siamrpn_alexnet_v2_otb15
(**kwargs)[source]¶ AlexNet backbone model for object tracking from the “High Performance Visual Tracking with Siamese Region Proposal Network” paper (http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_High_Performance_Visual_CVPR_2018_paper.pdf).
-
gluoncv.model_zoo.
simple_pose_resnet101_v1b
(**kwargs)[source]¶ ResNet-101 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
simple_pose_resnet101_v1d
(**kwargs)[source]¶ ResNet-101-d backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
simple_pose_resnet152_v1b
(**kwargs)[source]¶ ResNet-152 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
simple_pose_resnet152_v1d
(**kwargs)[source]¶ ResNet-152-d backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
simple_pose_resnet18_v1b
(**kwargs)[source]¶ ResNet-18 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
simple_pose_resnet50_v1b
(**kwargs)[source]¶ ResNet-50 backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
simple_pose_resnet50_v1d
(**kwargs)[source]¶ ResNet-50-d backbone model from “Simple Baselines for Human Pose Estimation and Tracking” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
slowfast_16x8_resnet101_50_50_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 16x8 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset, but the temporal head is initialized with ResNet50 structure (3, 4, 6, 3).
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
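To make the num_segments description concrete, here is a small sketch of dividing a video's frame indices into evenly sized segments. It is an illustration only: the function segment_bounds is hypothetical and is not GluonCV's actual frame sampler, and giving any leftover frames to the earliest segments is an assumption.

```python
def segment_bounds(num_frames, num_segments=1):
    """Split frame indices [0, num_frames) into num_segments contiguous ranges.

    Returns a list of (start, end) pairs with `end` exclusive. When the
    frame count is not divisible by num_segments, the first segments
    receive one extra frame each.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    base, extra = divmod(num_frames, num_segments)
    bounds = []
    start = 0
    for i in range(num_segments):
        length = base + (1 if i < extra else 0)
        bounds.append((start, start + length))
        start += length
    return bounds
```

With the default num_segments=1 the whole video forms a single segment, which matches the default shown in the signatures here.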
-
gluoncv.model_zoo.
slowfast_16x8_resnet101_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 16x8 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
slowfast_4x16_resnet101_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 4x16 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
slowfast_4x16_resnet50_custom
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, use_kinetics_pretrain=True, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 4x16 networks (SlowFast) with ResNet50 backbone. Customized for a user's own dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
use_kinetics_pretrain (bool.) – Whether to load Kinetics-400 pre-trained model weights.
-
gluoncv.model_zoo.
slowfast_4x16_resnet50_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 4x16 networks (SlowFast) with ResNet50 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
slowfast_8x8_resnet101_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 8x8 networks (SlowFast) with ResNet101 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
slowfast_8x8_resnet50_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, partial_bn=False, feat_ext=False, root='~/.mxnet/models', ctx=cpu(0), **kwargs)[source]¶ SlowFast 8x8 networks (SlowFast) with ResNet50 backbone trained on Kinetics400 dataset.
- Parameters
nclass (int.) – Number of categories in the dataset.
pretrained (bool or str.) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True.) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU.) – The context in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
num_segments (int, default is 1.) – Number of segments used to evenly divide a video.
num_crop (int, default is 1.) – Number of crops used during evaluation, choices are 1, 3 or 10.
partial_bn (bool, default False.) – Freeze all batch normalization layers during training except the first layer.
feat_ext (bool.) – Whether to extract features before dense classification layer or do a complete forward pass.
-
gluoncv.model_zoo.
squeezenet1_0
(**kwargs)[source]¶ SqueezeNet 1.0 model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
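The pretrained argument accepts either a boolean or a string throughout this module. The sketch below spells out the three cases from the parameter description. describe_pretrained is a hypothetical helper for illustration and does not reproduce how GluonCV resolves weight files internally.

```python
def describe_pretrained(pretrained):
    """Summarize what a `pretrained` value asks for.

    False      -> no pretrained weights are loaded
    True       -> the default pretrained weights are loaded
    str (tag)  -> a specific version of the pretrained weights is requested
    """
    # Check bool first so that True/False are never treated as anything else.
    if isinstance(pretrained, bool):
        return "default weights" if pretrained else "no pretrained weights"
    if isinstance(pretrained, str) and pretrained:
        return "weights version " + pretrained
    raise TypeError("pretrained must be a bool or a non-empty str")
```

Any tag string passed here is only a placeholder; the valid tags depend on the weight files published for each model.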
-
gluoncv.model_zoo.
squeezenet1_1
(**kwargs)[source]¶ SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
ssd_300_mobilenet0_25_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with mobilenet0.25 base networks for COCO.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_300_mobilenet0_25_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with mobilenet0.25 300 base network for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_300_mobilenet0_25_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_300_mobilenet0_25_custom(classes=['foo', 'bar'], transfer='voc')
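As the classes parameter notes, len(classes) gives the number of foreground classes. A brief sketch of deriving that count and a name-to-index mapping from a class list follows. The helper foreground_class_index is hypothetical, the zero-based numbering is an assumption, and rejecting duplicate names is a design choice here rather than documented behavior.

```python
def foreground_class_index(classes):
    """Map each custom foreground class name to a zero-based index."""
    names = list(classes)  # accept any iterable of str
    if len(set(names)) != len(names):
        raise ValueError("class names must be unique")
    return {name: idx for idx, name in enumerate(names)}
```

For the first call in the example above, the mapping would have three entries, one per foreground class.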
-
gluoncv.model_zoo.
ssd_300_mobilenet0_25_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with mobilenet0.25 base networks.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_300_resnet34_v1b_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v1b 34 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_300_resnet34_v1b_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with ResNet v1b 34 layers for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_300_resnet34_v1b_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_300_resnet34_v1b_custom(classes=['foo', 'bar'], transfer='coco')
-
gluoncv.model_zoo.
ssd_300_resnet34_v1b_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v1b 34 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_300_vgg16_atrous_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with VGG16 atrous 300x300 base network for COCO.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_300_vgg16_atrous_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with VGG16 atrous 300x300 base network for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_300_vgg16_atrous_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_300_vgg16_atrous_custom(classes=['foo', 'bar'], transfer='coco')
-
gluoncv.model_zoo.
ssd_300_vgg16_atrous_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with VGG16 atrous 300x300 base network for Pascal VOC.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_mobilenet1_0_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with mobilenet1.0 base networks for COCO.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_mobilenet1_0_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with mobilenet1.0 512 base network for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_512_mobilenet1_0_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_mobilenet1_0_custom(classes=['foo', 'bar'], transfer='voc')
-
gluoncv.model_zoo.
ssd_512_mobilenet1_0_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with mobilenet1.0 base networks.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_resnet101_v2_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v2 101 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_resnet152_v2_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v2 152 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_resnet18_v1_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v1 18 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_resnet18_v1_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with ResNet18 v1 512 base network for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_512_resnet18_v1_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_resnet18_v1_custom(classes=['foo', 'bar'], transfer='voc')
-
gluoncv.model_zoo.
ssd_512_resnet18_v1_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v1 18 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_resnet50_v1_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v1 50 layers for COCO.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_resnet50_v1_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with ResNet50 v1 512 base network for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_512_resnet50_v1_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_resnet50_v1_custom(classes=['foo', 'bar'], transfer='voc')
-
gluoncv.model_zoo.
ssd_512_resnet50_v1_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with ResNet v1 50 layers.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_vgg16_atrous_coco
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with VGG16 atrous layers for COCO.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
ssd_512_vgg16_atrous_custom
(classes, pretrained_base=True, pretrained=False, transfer=None, **kwargs)[source]¶ SSD architecture with VGG16 atrous 512x512 base network for custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
transfer (str or None) – If not None, will try to reuse pre-trained weights from SSD networks trained on other datasets.
- Returns
An SSD detection network.
- Return type
Example
>>> net = ssd_512_vgg16_atrous_custom(classes=['a', 'b', 'c'], pretrained_base=True)
>>> net = ssd_512_vgg16_atrous_custom(classes=['foo', 'bar'], transfer='coco')
-
gluoncv.model_zoo.
ssd_512_vgg16_atrous_voc
(pretrained=False, pretrained_base=True, **kwargs)[source]¶ SSD architecture with VGG16 atrous 512x512 base network.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized.
- Returns
An SSD detection network.
- Return type
-
gluoncv.model_zoo.
vgg11
(**kwargs)[source]¶ VGG-11 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
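The root default is written in terms of the MXNET_HOME environment variable. The sketch below shows one way to resolve the directory before constructing a model. It assumes that MXNet falls back to ~/.mxnet when MXNET_HOME is unset, which matches the '~/.mxnet/models' default shown in the vgg16_hmdb51 signature; the names default_model_root and load_vgg11 are hypothetical.

```python
import os


def default_model_root():
    """Return the directory that '$MXNET_HOME/models' resolves to.

    Assumption: when MXNET_HOME is unset, the fallback is '~/.mxnet'.
    """
    mxnet_home = os.environ.get("MXNET_HOME", os.path.join("~", ".mxnet"))
    return os.path.expanduser(os.path.join(mxnet_home, "models"))


def load_vgg11(pretrained=False, root=None):
    # Lazy imports: MXNet and GluonCV are only needed when a model is built.
    import mxnet as mx
    from gluoncv import model_zoo
    return model_zoo.vgg11(pretrained=pretrained, ctx=mx.cpu(),
                           root=root or default_model_root())


print(default_model_root())
```

Passing root explicitly is useful on machines where the home directory is small or read-only and the parameter files should live elsewhere.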
-
gluoncv.model_zoo.
vgg11_bn
(**kwargs)[source]¶ VGG-11 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
vgg13
(**kwargs)[source]¶ VGG-13 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
vgg13_bn
(**kwargs)[source]¶ VGG-13 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
vgg16
(**kwargs)[source]¶ VGG-16 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
vgg16_atrous_300
(**kwargs)[source]¶ Get VGG atrous 16 layer 300 in_size feature extractor networks.
-
gluoncv.model_zoo.
vgg16_atrous_512
(**kwargs)[source]¶ Get VGG atrous 16 layer 512 in_size feature extractor networks.
-
gluoncv.model_zoo.
vgg16_bn
(**kwargs)[source]¶ VGG-16 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
vgg16_hmdb51
(nclass=51, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ VGG16 model trained on HMDB51 dataset.
- Parameters
nclass (int) – Number of categories in the dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
num_segments (int, default is 1) – Number of segments used to evenly divide a video.
num_crop (int, default is 1) – Number of crops used during evaluation, choices are 1, 3 or 10.
-
gluoncv.model_zoo.
vgg16_kinetics400
(nclass=400, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ VGG16 model trained on Kinetics400 dataset.
- Parameters
nclass (int) – Number of categories in the dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
num_segments (int, default is 1) – Number of segments used to evenly divide a video.
num_crop (int, default is 1) – Number of crops used during evaluation, choices are 1, 3 or 10.
-
gluoncv.model_zoo.
vgg16_sthsthv2
(nclass=174, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ VGG16 model trained on Something-Something-V2 dataset.
- Parameters
nclass (int) – Number of categories in the dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
num_segments (int, default is 1) – Number of segments used to evenly divide a video.
num_crop (int, default is 1) – Number of crops used during evaluation, choices are 1, 3 or 10.
-
gluoncv.model_zoo.
vgg16_ucf101
(nclass=101, pretrained=False, pretrained_base=True, use_tsn=False, num_segments=1, num_crop=1, ctx=cpu(0), root='~/.mxnet/models', **kwargs)[source]¶ VGG16 model trained on UCF101 dataset.
- Parameters
nclass (int) – Number of categories in the dataset.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
pretrained_base (bool or str, optional, default is True) – Load pretrained base network, the extra layers are randomized. Note that if pretrained is True, this has no effect.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
num_segments (int, default is 1) – Number of segments used to evenly divide a video.
num_crop (int, default is 1) – Number of crops used during evaluation, choices are 1, 3 or 10.
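Because num_crop only accepts 1, 3 or 10, it can help to check the evaluation settings before building one of the vgg16_* action recognition models. The following is a minimal sketch; video_model_kwargs and build_vgg16_ucf101 are hypothetical helpers, not GluonCV functions, and the second one assumes GluonCV is installed when it is called.

```python
VALID_NUM_CROP = (1, 3, 10)  # choices listed in the num_crop description


def video_model_kwargs(num_segments=1, num_crop=1):
    """Validate evaluation settings for the vgg16_* video models.

    Hypothetical helper, not part of GluonCV.
    """
    if not isinstance(num_segments, int) or num_segments < 1:
        raise ValueError("num_segments must be a positive integer")
    if num_crop not in VALID_NUM_CROP:
        raise ValueError("num_crop must be one of %s" % (VALID_NUM_CROP,))
    return {"num_segments": num_segments, "num_crop": num_crop}


def build_vgg16_ucf101(num_segments=1, num_crop=1):
    # Lazy import so the validation above works without GluonCV installed.
    from gluoncv import model_zoo
    return model_zoo.vgg16_ucf101(nclass=101, pretrained=False,
                                  **video_model_kwargs(num_segments, num_crop))


print(video_model_kwargs(num_segments=3, num_crop=3))
```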
-
gluoncv.model_zoo.
vgg19
(**kwargs)[source]¶ VGG-19 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
vgg19_bn
(**kwargs)[source]¶ VGG-19 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
gluoncv.model_zoo.
yolo3_darknet53_coco
(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with darknet53 base network on COCO dataset.
- Parameters
pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
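The YOLO3 constructors in this section follow a regular naming pattern: yolo3, then the base network, then the dataset. A small sketch of composing such a name and handing it to get_model is given below. The helper names are hypothetical, and the sketch assumes get_model accepts the constructor name as the model name.

```python
# Names taken from the YOLO3 entries in this section.
YOLO3_BASES = ("darknet53", "mobilenet0_25", "mobilenet1_0")
YOLO3_DATASETS = ("coco", "voc", "custom")


def yolo3_model_name(base, dataset):
    """Compose a YOLO3 constructor name such as 'yolo3_darknet53_coco'."""
    if base not in YOLO3_BASES:
        raise ValueError("unknown base network: %r" % (base,))
    if dataset not in YOLO3_DATASETS:
        raise ValueError("unknown dataset: %r" % (dataset,))
    return "yolo3_%s_%s" % (base, dataset)


def build_yolo3(base, dataset, **kwargs):
    # Lazy import; assumes get_model accepts the constructor name as model name.
    from gluoncv import model_zoo
    return model_zoo.get_model(yolo3_model_name(base, dataset), **kwargs)


print(yolo3_model_name("mobilenet1_0", "voc"))  # yolo3_mobilenet1_0_voc
```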
-
gluoncv.model_zoo.
yolo3_darknet53_custom
(classes, transfer=None, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with darknet53 base network on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer – If not None, will try to reuse pre-trained weights from yolo networks trained on other datasets.
pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
yolo3_darknet53_voc
(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with darknet53 base network on VOC dataset.
- Parameters
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for the base network. String value represents the hashtag for a certain version of pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
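The norm_layer and norm_kwargs parameters work as a pair: SyncBatchNorm needs num_devices, while plain BatchNorm needs no extra arguments. A minimal sketch of choosing the pair from the number of GPUs follows. The helpers norm_settings, resolve_norm_layer and build_yolo3_voc are hypothetical, and the last two assume MXNet and GluonCV are installed when they are called.

```python
import importlib


def norm_settings(num_gpus):
    """Pick norm_layer/norm_kwargs for a given number of GPUs.

    The layer is returned as a dotted name so this function runs without
    MXNet; resolve_norm_layer() turns the name into the actual class.
    Hypothetical helper, not part of GluonCV.
    """
    if num_gpus > 1:
        # SyncBatchNorm needs to know how many devices share statistics.
        return "mxnet.gluon.contrib.nn.SyncBatchNorm", {"num_devices": num_gpus}
    return "mxnet.gluon.nn.BatchNorm", None


def resolve_norm_layer(dotted_name):
    # Split 'package.module.Class' into module path and class name.
    module_name, _, class_name = dotted_name.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def build_yolo3_voc(num_gpus):
    from gluoncv import model_zoo  # lazy import
    name, norm_kwargs = norm_settings(num_gpus)
    return model_zoo.yolo3_darknet53_voc(
        pretrained_base=False,
        norm_layer=resolve_norm_layer(name),
        norm_kwargs=norm_kwargs,
    )


print(norm_settings(4))
```

Returning None for norm_kwargs in the single-device case mirrors the default shown in the signature.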
-
gluoncv.model_zoo.
yolo3_mobilenet0_25_coco
(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with mobilenet0.25 base network on COCO dataset.
- Parameters
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for the base network. String value represents the hashtag for a certain version of pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
yolo3_mobilenet0_25_custom
(classes, transfer=None, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with mobilenet0.25 base network on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer – If not None, will try to reuse pre-trained weights from yolo networks trained on other datasets.
pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
yolo3_mobilenet0_25_voc
(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with mobilenet0.25 base network on VOC dataset.
- Parameters
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for the base network. String value represents the hashtag for a certain version of pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
yolo3_mobilenet1_0_coco
(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with mobilenet1.0 base network on COCO dataset.
- Parameters
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for the base network. String value represents the hashtag for a certain version of pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
yolo3_mobilenet1_0_custom
(classes, transfer=None, pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with mobilenet1.0 base network on custom dataset.
- Parameters
classes (iterable of str) – Names of custom foreground classes. len(classes) is the number of foreground classes.
transfer – If not None, will try to reuse pre-trained weights from yolo networks trained on other datasets.
pretrained_base (boolean) – Whether to fetch and load pretrained weights for the base network.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock
-
gluoncv.model_zoo.
yolo3_mobilenet1_0_voc
(pretrained_base=True, pretrained=False, norm_layer=<class 'mxnet.gluon.nn.basic_layers.BatchNorm'>, norm_kwargs=None, **kwargs)[source]¶ YOLO3 multi-scale with mobilenet1.0 base network on VOC dataset.
- Parameters
pretrained_base (bool or str) – Boolean value controls whether to load the default pretrained weights for the base network. String value represents the hashtag for a certain version of pretrained weights.
pretrained (bool or str) – Boolean value controls whether to load the default pretrained weights for model. String value represents the hashtag for a certain version of pretrained weights.
norm_layer (object) – Normalization layer used (default: mxnet.gluon.nn.BatchNorm). Can be mxnet.gluon.nn.BatchNorm or mxnet.gluon.contrib.nn.SyncBatchNorm.
norm_kwargs (dict) – Additional norm_layer arguments, for example num_devices=4 for mxnet.gluon.contrib.nn.SyncBatchNorm.
- Returns
Fully hybrid yolo3 network.
- Return type
mxnet.gluon.HybridBlock