gluoncv.nn

Neural Network Components.

Hint

Not every component listed here is a HybridBlock, which means some of them are not hybridizable. However, we try to make sure that components required during inference are hybridizable, so the entire network can be exported and run in other languages.

For example, encoders are usually non-hybridizable but are only required during training. In contrast, decoders are mostly `HybridBlock`s.

Bounding Box

Blocks that apply bounding box related functions.

BBoxCornerToCenter

Convert corner boxes to center boxes.

BBoxCenterToCorner

Convert center boxes to corner boxes.

BBoxSplit

Split bounding boxes into 4 columns.

BBoxArea

Calculate the area of bounding boxes.

Coders

Encoders are used to encode training targets before we apply loss functions. Decoders are used to restore predicted values by inverting the operations done in encoders. They often come as a pair in order to make the results consistent.

NormalizedBoxCenterEncoder

Encode bounding boxes training target with normalized center offsets.

NormalizedBoxCenterDecoder

Decode bounding boxes training target with normalized center offsets.

MultiClassEncoder

Encode classification training target given matching results.

MultiClassDecoder

Decode classification results.

MultiPerClassDecoder

Decode classification results.

SigmoidClassEncoder

Encode class prediction labels for SigmoidCrossEntropy Loss.

Feature

Feature layers are components that either extract partial networks as feature extractor or extend them with new layers.

FeatureExtractor

Feature extractor.

FeatureExpander

Feature extractor with additional layers to append.

Matchers

Matchers are often used in object detection tasks to find matches between anchor boxes (very popular in object detection) and ground truths.

CompositeMatcher

A Matcher that combines multiple strategies.

BipartiteMatcher

A Matcher implementing bipartite matching strategy.

MaximumMatcher

A Matcher implementing maximum matching strategy.

Predictors

Predictors are common neural network components used specifically to predict values. Depending on the purpose, a predictor may be a convolutional or a fully connected layer.

ConvPredictor

Convolutional predictor.

FCPredictor

Fully connected predictor.

Samplers

Samplers are often used after matching layers to determine positive/negative/ignored samples.

For example, a NaiveSampler simply returns all matched samples as positive, and all un-matched samples as negative.

This behavior can cause problems because the training objective is not balanced. See OHEMSampler and QuotaSampler for more advanced sampling strategies.

NaiveSampler

A naive sampler that takes all existing matching results.

OHEMSampler

A sampler implementing Online Hard-negative mining.

QuotaSampler

Sampler that handles limited quota for positive and negative samples.

API Reference

Bounding boxes operators

class gluoncv.nn.bbox.BBoxArea(axis=-1, fmt='corner', **kwargs)[source]

Calculate the area of bounding boxes.

Parameters
  • fmt (str, default is corner) – Bounding box format, can be {‘center’, ‘corner’}. ‘center’: {x, y, width, height} ‘corner’: {xmin, ymin, xmax, ymax}

  • axis (int, default is -1) – Effective axis of the bounding box. Default is -1(the last dimension).

Returns

A BxNx1 NDArray

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
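The area computation for the two formats can be sketched in NumPy as follows. This is only an illustration of the arithmetic, not the gluoncv implementation: the helper name `bbox_area` is hypothetical, and treating degenerate boxes as zero-area is an assumption.

```python
import numpy as np

def bbox_area(boxes, fmt="corner", axis=-1):
    """Hypothetical helper: area of boxes whose 4 coordinates lie on `axis`."""
    boxes = np.asarray(boxes, dtype=np.float64)
    a, b, c, d = np.split(boxes, 4, axis=axis)
    if fmt == "corner":        # (xmin, ymin, xmax, ymax)
        width, height = c - a, d - b
    elif fmt == "center":      # (x, y, width, height)
        width, height = c, d
    else:
        raise ValueError("fmt must be 'corner' or 'center'")
    # Assumption: boxes with negative width or height count as zero area.
    return np.maximum(width, 0) * np.maximum(height, 0)

boxes = np.array([[[0.0, 0.0, 2.0, 3.0], [1.0, 1.0, 4.0, 5.0]]])  # (B=1, N=2, 4)
area = bbox_area(boxes)  # one area per box, box axis kept with length 1
```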

class gluoncv.nn.bbox.BBoxBatchIOU(axis=-1, fmt='corner', offset=0, eps=1e-15, **kwargs)[source]

Batch Bounding Box IOU.

Parameters
  • axis (int) – On which axis is the length-4 bounding box dimension.

  • fmt (str) – BBox encoding format, can be ‘corner’ or ‘center’. ‘corner’: (xmin, ymin, xmax, ymax) ‘center’: (center_x, center_y, width, height)

  • offset (float, default is 0) – Offset is used if +1 is desired for computing width and height, otherwise use 0.

  • eps (float, default is 1e-15) – Very small number to avoid division by 0.

hybrid_forward(F, a, b)[source]

Compute IOU for each batch

Parameters
  • a (mxnet.nd.NDArray or mxnet.sym.Symbol) – (B, N, 4) first input.

  • b (mxnet.nd.NDArray or mxnet.sym.Symbol) – (B, M, 4) second input.

Returns

(B, N, M) array of IOUs.

Return type

mxnet.nd.NDArray or mxnet.sym.Symbol
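To make the roles of offset and eps concrete, here is a minimal NumPy sketch of a batched IOU for corner-format boxes. It is an illustration under stated assumptions rather than the library code: `batch_iou` is a hypothetical name, and the box dimension is assumed to be the last axis.

```python
import numpy as np

def batch_iou(a, b, offset=0.0, eps=1e-15):
    """Hypothetical helper: a is (B, N, 4), b is (B, M, 4), corner format."""
    a = np.asarray(a, dtype=np.float64)[:, :, None, :]  # (B, N, 1, 4)
    b = np.asarray(b, dtype=np.float64)[:, None, :, :]  # (B, 1, M, 4)
    # Intersection width/height, clamped so disjoint boxes contribute zero.
    iw = np.clip(np.minimum(a[..., 2], b[..., 2])
                 - np.maximum(a[..., 0], b[..., 0]) + offset, 0, None)
    ih = np.clip(np.minimum(a[..., 3], b[..., 3])
                 - np.maximum(a[..., 1], b[..., 1]) + offset, 0, None)
    inter = iw * ih
    area_a = (a[..., 2] - a[..., 0] + offset) * (a[..., 3] - a[..., 1] + offset)
    area_b = (b[..., 2] - b[..., 0] + offset) * (b[..., 3] - b[..., 1] + offset)
    # eps keeps the division defined when both boxes are empty.
    return inter / (area_a + area_b - inter + eps)  # (B, N, M)

a = np.array([[[0, 0, 2, 2], [0, 0, 1, 1]]])  # (1, 2, 4)
b = np.array([[[1, 1, 3, 3]]])                # (1, 1, 4)
ious = batch_iou(a, b)
```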

class gluoncv.nn.bbox.BBoxCenterToCorner(axis=-1, split=False)[source]

Convert center boxes to corner boxes. Corner boxes are encoded as (xmin, ymin, xmax, ymax). Center boxes are encoded as (center_x, center_y, width, height).

Parameters
  • split (bool) – Whether to split boxes into individual elements after processing.

  • axis (int, default is -1) – Effective axis of the bounding box. Default is -1(the last dimension).

Returns

A BxNx4 NDArray if split is False, or four BxNx1 NDArrays if split is True.

hybrid_forward(F, x)[source]

Hybrid forward

class gluoncv.nn.bbox.BBoxClipToImage(**kwargs)[source]

Clip bounding box coordinates to image boundaries. If multiple images are supplied and padded, the accurate image shapes must be supplied as additional inputs.

hybrid_forward(F, x, img)[source]

If images are padded, additional inputs are required for clipping.

Parameters
  • x – (B, N, 4) bounding box coordinates.

  • img – (B, C, H, W) image tensor.

Returns

(B, N, 4) bounding box coordinates.
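The clipping step can be sketched in NumPy as below. This is an illustrative sketch, not the library code: `clip_to_image` is a hypothetical name, boxes are assumed to be in corner format, and limiting x to [0, W] and y to [0, H] is an assumption about the exact bounds.

```python
import numpy as np

def clip_to_image(boxes, img):
    """Hypothetical helper: boxes is (B, N, 4) corner format, img is (B, C, H, W)."""
    height, width = img.shape[2], img.shape[3]
    # Assumption: x coordinates are limited to [0, W], y coordinates to [0, H].
    upper = np.array([width, height, width, height], dtype=np.float64)
    return np.clip(np.asarray(boxes, dtype=np.float64), 0.0, upper)

img = np.zeros((1, 3, 100, 200))                  # H=100, W=200
boxes = np.array([[[-5.0, 10.0, 250.0, 120.0]]])  # partly outside the image
clipped = clip_to_image(boxes, img)
```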

class gluoncv.nn.bbox.BBoxCornerToCenter(axis=-1, split=False)[source]

Convert corner boxes to center boxes. Corner boxes are encoded as (xmin, ymin, xmax, ymax). Center boxes are encoded as (center_x, center_y, width, height).

Parameters
  • split (bool) – Whether to split boxes into individual elements after processing.

  • axis (int, default is -1) – Effective axis of the bounding box. Default is -1(the last dimension).

Returns

A BxNx4 NDArray if split is False, or four BxNx1 NDArrays if split is True.

hybrid_forward(F, x)[source]

Hybrid forward
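The two encodings used by BBoxCornerToCenter and BBoxCenterToCorner are inverses of each other. The NumPy sketch below shows the arithmetic only; the function names are hypothetical, and the box dimension is assumed to be the last axis (axis=-1, split=False).

```python
import numpy as np

def corner_to_center(boxes):
    """(xmin, ymin, xmax, ymax) -> (center_x, center_y, width, height)."""
    xmin, ymin, xmax, ymax = np.split(np.asarray(boxes, dtype=np.float64), 4, axis=-1)
    width, height = xmax - xmin, ymax - ymin
    return np.concatenate([xmin + width / 2, ymin + height / 2, width, height], axis=-1)

def center_to_corner(boxes):
    """(center_x, center_y, width, height) -> (xmin, ymin, xmax, ymax)."""
    x, y, w, h = np.split(np.asarray(boxes, dtype=np.float64), 4, axis=-1)
    return np.concatenate([x - w / 2, y - h / 2, x + w / 2, y + h / 2], axis=-1)

corner = np.array([[[0.0, 0.0, 4.0, 2.0]]])  # (B=1, N=1, 4)
center = corner_to_center(corner)            # center (2, 1), size 4 x 2
restored = center_to_corner(center)          # round trip back to corner format
```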

class gluoncv.nn.bbox.BBoxSplit(axis, squeeze_axis=False, **kwargs)[source]

Split bounding boxes into 4 columns.

Parameters
  • axis (int, default is -1) – On which axis to split the bounding box. Default is -1(the last dimension).

  • squeeze_axis (boolean, default is False) – If true, Removes the axis with length 1 from the shapes of the output arrays. Note that setting squeeze_axis to true removes axis with length 1 only along the axis which it is split. Also squeeze_axis can be set to true only if input.shape[axis] == num_outputs.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
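The effect of squeeze_axis on output shapes can be illustrated with plain NumPy. This sketch only mirrors the documented shape behavior; it does not call BBoxSplit.

```python
import numpy as np

boxes = np.arange(24, dtype=np.float64).reshape(2, 3, 4)  # (B=2, N=3, 4)

# Comparable to squeeze_axis=False: four arrays that keep the split axis.
cols = np.split(boxes, 4, axis=-1)

# Comparable to squeeze_axis=True: the length-1 split axis is removed.
cols_squeezed = [c.squeeze(axis=-1) for c in cols]
```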

class gluoncv.nn.bbox.NumPyBBoxCornerToCenter(axis=-1, split=False)[source]

Convert corner boxes to center boxes using numpy. Corner boxes are encoded as (xmin, ymin, xmax, ymax). Center boxes are encoded as (center_x, center_y, width, height).

Parameters
  • split (bool) – Whether to split boxes into individual elements after processing.

  • axis (int, default is -1) – Effective axis of the bounding box. Default is -1(the last dimension).

Returns

A BxNx4 NDArray if split is False, or four BxNx1 NDArrays if split is True.

Encoder and Decoder functions. Encoders are used during training to assign training targets. Decoders are used during testing/validation to convert predictions back to normal boxes, etc.

class gluoncv.nn.coder.CenterNetDecoder(topk=100, scale=4.0)[source]

Decoder for CenterNet.

Parameters
  • topk (int) – Only keep topk results.

  • scale (float, default is 4.0) – Downsampling scale for the network.

hybrid_forward(F, x, wh, reg)[source]

Forward of decoder

class gluoncv.nn.coder.MultiClassDecoder(axis=-1, thresh=0.01)[source]

Decode classification results.

This decoder must work with MultiClassEncoder to reconstruct valid labels. The decoder expects its input to be post-activation scores (e.g. the output of Softmax) rather than raw logits.

Parameters
  • axis (int) – Axis of class-wise results.

  • thresh (float) – Confidence threshold for the post-softmax scores. Scores less than thresh are marked with 0, corresponding cls_id is marked with invalid class id -1.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
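A minimal NumPy sketch of the decoding rule described above: drop the background column, keep the best foreground class, and invalidate low scores. The name `multiclass_decode` is hypothetical, background is assumed to sit at index 0 of the class axis, and a score exactly equal to thresh is assumed to be kept.

```python
import numpy as np

def multiclass_decode(scores, thresh=0.01):
    """Hypothetical helper: scores is (..., num_fg_class + 1), background first."""
    fg = np.asarray(scores, dtype=np.float64)[..., 1:]  # foreground scores only
    cls_id = fg.argmax(axis=-1)                         # 0-based foreground id
    score = fg.max(axis=-1)
    keep = score >= thresh
    # Scores below thresh become 0 and their class id becomes -1.
    return np.where(keep, cls_id, -1), np.where(keep, score, 0.0)

scores = np.array([[0.5, 0.1, 0.2, 0.1, 0.05, 0.05],           # best fg: index 1
                   [0.995, 0.001, 0.001, 0.001, 0.001, 0.001]])  # all fg below thresh
ids, confs = multiclass_decode(scores)
```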

class gluoncv.nn.coder.MultiClassEncoder(ignore_label=-1)[source]

Encode classification training target given matching results.

This encoder assigns matched bounding boxes the training target ground-truth label + 1, and negative samples the label 0. Ignored samples are assigned ignore_label, whose default is -1.

Parameters

ignore_label (float) – Assigned to ignored samples, which are neither positive nor negative during training and should be excluded from the loss function. Default is -1.

hybrid_forward(F, samples, matches, refs)[source]

HybridBlock, handles batched input correctly.

Parameters
  • samples – (B, N), value +1 (positive), -1 (negative), 0 (ignore).

  • matches – (B, N), value range [0, M).

  • refs – (B, M), value range [0, num_fg_class), excluding background.

Returns

targets – (B, N), value range [0, num_fg_class + 1), including background.
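The target assignment can be sketched in NumPy as follows. This is an illustration of the documented rule (label + 1 for positives, 0 for negatives, ignore_label otherwise), not the library code; `multiclass_encode` is a hypothetical name.

```python
import numpy as np

def multiclass_encode(samples, matches, refs, ignore_label=-1):
    """Hypothetical helper: samples, matches are (B, N); refs is (B, M)."""
    samples = np.asarray(samples)
    matches = np.asarray(matches).astype(np.int64)
    refs = np.asarray(refs, dtype=np.float64)
    # Look up the label of the matched ground truth, shifted by 1 so that
    # 0 stays free for background.
    target_ids = np.take_along_axis(refs, matches, axis=1) + 1
    targets = np.where(samples > 0, target_ids, ignore_label)
    return np.where(samples < 0, 0, targets)

samples = np.array([[1, -1, 0, 1]])  # positive, negative, ignore, positive
matches = np.array([[1, 0, 0, 0]])   # index of the matched ground truth
refs = np.array([[2, 5]])            # foreground class ids of M=2 ground truths
targets = multiclass_encode(samples, matches, refs)
```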

class gluoncv.nn.coder.MultiPerClassDecoder(num_class, axis=-1, thresh=0.01)[source]

Decode classification results.

This decoder must work with MultiClassEncoder to reconstruct valid labels. The decoder expects its input to be post-activation scores (e.g. the output of Softmax) rather than raw logits. This version differs from gluoncv.nn.coder.MultiClassDecoder as follows:

For each position (anchor box), each foreground class can have its own result, rather than being forced to be the best one. For example, for a 5-class prediction with background (6 classes in total), say (0.5, 0.1, 0.2, 0.1, 0.05, 0.05) as (bg, apple, orange, peach, grape, melon), MultiClassDecoder produces only one class id and score, that is (orange-0.2). MultiPerClassDecoder produces 5 results individually: (apple-0.1, orange-0.2, peach-0.1, grape-0.05, melon-0.05).

Parameters
  • num_class (int) – Number of classes including background.

  • axis (int) – Axis of class-wise results.

  • thresh (float) – Confidence threshold for the post-softmax scores. Scores less than thresh are marked with 0, corresponding cls_id is marked with invalid class id -1.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
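The per-class behavior can be sketched with the fruit example from the class description. This NumPy sketch is illustrative only: `per_class_decode` is a hypothetical name, background is assumed to be at index 0, and a score equal to thresh is assumed to be kept.

```python
import numpy as np

def per_class_decode(scores, thresh=0.01):
    """Hypothetical helper: one (id, score) result per foreground class."""
    fg = np.asarray(scores, dtype=np.float64)[..., 1:]  # drop background
    ids = np.broadcast_to(np.arange(fg.shape[-1]), fg.shape)
    keep = fg >= thresh
    # Classes below thresh get id -1 and score 0.
    return np.where(keep, ids, -1), np.where(keep, fg, 0.0)

# (bg, apple, orange, peach, grape, melon)
scores = np.array([0.5, 0.1, 0.2, 0.1, 0.05, 0.05])
ids, confs = per_class_decode(scores)                   # all five classes kept
ids_strict, confs_strict = per_class_decode(scores, thresh=0.08)
```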

class gluoncv.nn.coder.NormalizedBoxCenterDecoder(stds=(0.1, 0.1, 0.2, 0.2), convert_anchor=False, clip=None, minimal_opset=False)[source]

Decode bounding boxes training target with normalized center offsets. This decoder must cooperate with NormalizedBoxCenterEncoder of same stds in order to get properly reconstructed bounding boxes.

Returned bounding boxes are using corner type: x_{min}, y_{min}, x_{max}, y_{max}.

Parameters
  • stds (array-like of size 4) – Std values by which the encoder divided the encoded values; they must match the stds of the paired NormalizedBoxCenterEncoder. Default is (0.1, 0.1, 0.2, 0.2).

  • clip (float, default is None) – If given, bounding box target will be clipped to this value.

  • convert_anchor (boolean, default is False) – Whether to convert anchor from corner to center format.

  • minimal_opset (bool) – We sometimes add special operators to accelerate training/inference, however, for exporting to third party compilers we want to utilize most widely used operators. If minimal_opset is True, the network will use a minimal set of operators good for e.g., TVM.

hybrid_forward(F, x, anchors)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
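A rough NumPy sketch of center-offset decoding, written as the inverse of the usual normalized center-offset encoding. Several things here are assumptions rather than confirmed library behavior: the helper name `decode_boxes`, anchors already given in center format (as with convert_anchor=False), zero means, and clip being applied to the log-scale width/height terms.

```python
import numpy as np

def decode_boxes(pred, anchors, stds=(0.1, 0.1, 0.2, 0.2), clip=None):
    """Hypothetical helper: pred and anchors are (..., 4); anchors in center format."""
    pred = np.asarray(pred, dtype=np.float64) * np.asarray(stds)  # undo normalization
    ax, ay, aw, ah = np.split(np.asarray(anchors, dtype=np.float64), 4, axis=-1)
    tx, ty, tw, th = np.split(pred, 4, axis=-1)
    if clip is not None:
        # Assumption: clip bounds the log-scale size terms before exp().
        tw, th = np.minimum(tw, clip), np.minimum(th, clip)
    cx, cy = tx * aw + ax, ty * ah + ay
    w, h = np.exp(tw) * aw, np.exp(th) * ah
    # Return corner format: xmin, ymin, xmax, ymax.
    return np.concatenate([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1)

anchors = np.array([[10.0, 10.0, 4.0, 2.0]])  # center (10, 10), size 4 x 2
unchanged = decode_boxes(np.zeros((1, 4)), anchors)
shifted = decode_boxes(np.array([[1.0, 0.0, 0.0, 0.0]]), anchors)
```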

class gluoncv.nn.coder.NormalizedBoxCenterEncoder(stds=(0.1, 0.1, 0.2, 0.2), means=(0.0, 0.0, 0.0, 0.0), **kwargs)[source]

Encode bounding boxes training target with normalized center offsets.

Input bounding boxes are using corner type: x_{min}, y_{min}, x_{max}, y_{max}.

Parameters
  • stds (array-like of size 4) – Std value to be divided from encoded values, default is (0.1, 0.1, 0.2, 0.2).

  • means (array-like of size 4) – Mean value to be subtracted from encoded values, default is (0., 0., 0., 0.).

hybrid_forward(F, samples, matches, anchors, refs)[source]

Not HybridBlock due to use of matches.shape

Parameters
  • samples – (B, N), value +1 (positive), -1 (negative), 0 (ignore).

  • matches – (B, N), value range [0, M).

  • anchors – (B, N, 4), encoded in corner format.

  • refs – (B, M, 4), encoded in corner format.

Returns

  • targets – (B, N, 4), transforms from anchors to the refs picked according to matches.

  • masks – (B, N, 4), only positive anchors have targets.
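The normalized center-offset encoding itself can be sketched in NumPy as below. This sketch assumes anchors and ground truths are already paired one-to-one, so it omits the gathering by matches and the masks; `encode_boxes` is a hypothetical name and the formula is the commonly used one, not a copy of the library code.

```python
import numpy as np

def encode_boxes(anchors, gts, stds=(0.1, 0.1, 0.2, 0.2), means=(0.0, 0.0, 0.0, 0.0)):
    """Hypothetical helper: anchors and gts are paired (..., 4) corner-format boxes."""
    def to_center(boxes):
        xmin, ymin, xmax, ymax = np.split(np.asarray(boxes, dtype=np.float64), 4, axis=-1)
        w, h = xmax - xmin, ymax - ymin
        return xmin + w / 2, ymin + h / 2, w, h

    ax, ay, aw, ah = to_center(anchors)
    gx, gy, gw, gh = to_center(gts)
    # Center shift relative to anchor size, and log ratio of sizes.
    raw = np.concatenate([(gx - ax) / aw, (gy - ay) / ah,
                          np.log(gw / aw), np.log(gh / ah)], axis=-1)
    # Subtract means, then divide by stds.
    return (raw - np.asarray(means)) / np.asarray(stds)

anchors = np.array([[0.0, 0.0, 4.0, 2.0]])
gts = np.array([[1.0, 0.0, 5.0, 2.0]])  # same size, shifted right by 1
targets = encode_boxes(anchors, gts)
```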

class gluoncv.nn.coder.NormalizedPerClassBoxCenterEncoder(num_class, max_pos=128, per_device_batch_size=1, stds=(0.1, 0.1, 0.2, 0.2), means=(0.0, 0.0, 0.0, 0.0))[source]

Encode bounding boxes training target with normalized center offsets.

Input bounding boxes are using corner type: x_{min}, y_{min}, x_{max}, y_{max}.

Parameters
  • max_pos (int, default is 128) – Upper bound of the number of positive samples.

  • per_device_batch_size (int, default is 1) – Per device batch size

  • stds (array-like of size 4) – Std value to be divided from encoded values, default is (0.1, 0.1, 0.2, 0.2).

  • means (array-like of size 4) – Mean value to be subtracted from encoded values, default is (0., 0., 0., 0.).

hybrid_forward(F, samples, matches, anchors, labels, refs, means=None, stds=None)[source]

Encode bounding boxes, one entry per category.

Parameters
  • samples – (B, N), value +1 (positive), -1 (negative), 0 (ignore).

  • matches – (B, N), value range [0, M).

  • anchors – (B, N, 4), encoded in corner format.

  • labels – (B, N), value range [0, self._num_class), excluding background.

  • refs – (B, M, 4), encoded in corner format.

Returns

  • targets – (B, N_pos, C, 4), transforms from anchors to the refs picked according to matches.

  • masks – (B, N_pos, C, 4), only positive anchors of the correct class have targets.

  • indices – (B, N_pos), positive sample indices.

class gluoncv.nn.coder.NumPyNormalizedBoxCenterEncoder(stds=(0.1, 0.1, 0.2, 0.2), means=(0.0, 0.0, 0.0, 0.0))[source]

Encode bounding boxes training target with normalized center offsets using numpy.

Input bounding boxes are using corner type: x_{min}, y_{min}, x_{max}, y_{max}.

Parameters
  • stds (array-like of size 4) – Std value to be divided from encoded values, default is (0.1, 0.1, 0.2, 0.2).

  • means (array-like of size 4) – Mean value to be subtracted from encoded values, default is (0., 0., 0., 0.).

class gluoncv.nn.coder.SigmoidClassEncoder(**kwargs)[source]

Encode class prediction labels for SigmoidCrossEntropy Loss.

Feature extraction blocks. Feature or Multi-Feature extraction is a key component in object detection. Class predictor/Box predictor are usually applied on feature layer(s). A good feature extraction mechanism is critical to performance.

class gluoncv.nn.feature.FPNFeatureExpander(network, outputs, num_filters, use_1x1=True, use_upsample=True, use_elewadd=True, use_p6=False, p6_conv=True, no_bias=True, pretrained=False, norm_layer=None, norm_kwargs=None, ctx=cpu(0), inputs=('data'))[source]

Feature extractor with additional layers to append. This is specific to Feature Pyramid Network for Object Detection, which implements a top-down pathway and lateral connections.

Parameters
  • network (str or HybridBlock or Symbol) – Logic chain: load from gluon.model_zoo.vision if network is string. Convert to Symbol if network is HybridBlock.

  • outputs (str or list of str) – The name of layers to be extracted as features

  • num_filters (list of int e.g. [256, 256, 256, 256]) – Number of filters to be appended.

  • use_1x1 (bool) – Whether to use 1x1 convolution

  • use_upsample (bool) – Whether to use upsample

  • use_elewadd (bool) – Whether to use element-wise add operation

  • use_p6 (bool) – Whether to use the P6 stage, which is used for RPN experiments in the original paper

  • p6_conv (bool) – Whether to use convolution for the P6 stage (if it is enabled) or just max pooling.

  • no_bias (bool) – Whether to omit the bias term in convolution operations.

  • norm_layer (HybridBlock or SymbolBlock) – Type of normalization layer.

  • norm_kwargs (dict) – Arguments for normalization layer.

  • pretrained (bool) – Use pretrained parameters as in gluon.model_zoo if True.

  • ctx (Context) – The context, e.g. mxnet.cpu(), mxnet.gpu(0).

  • inputs (list of str) – Name of input variables to the network.

class gluoncv.nn.feature.FeatureExpander(network, outputs, num_filters, use_1x1_transition=True, use_bn=True, reduce_ratio=1.0, min_depth=128, global_pool=False, pretrained=False, ctx=cpu(0), inputs=('data'), **kwargs)[source]

Feature extractor with additional layers to append. This is very common in vision networks where extra branches are attached to backbone network.

Parameters
  • network (str or HybridBlock or Symbol) – Logic chain: load from gluoncv.model_zoo if network is string. Convert to Symbol if network is HybridBlock.

  • outputs (str or list of str) – The name of layers to be extracted as features

  • num_filters (list of int) – Number of filters to be appended.

  • use_1x1_transition (bool) – Whether to use 1x1 convolution between attached layers. It is effective at reducing network size.

  • use_bn (bool) – Whether to use BatchNorm between attached layers.

  • reduce_ratio (float) – Channel reduction ratio of the transition layers.

  • min_depth (int) – Minimum channel number of transition layers.

  • global_pool (bool) – Whether to use global pooling as the last layer.

  • pretrained (bool) – Use pretrained parameters as in gluon.model_zoo if True.

  • ctx (Context) – The context, e.g. mxnet.cpu(), mxnet.gpu(0).

  • inputs (list of str) – Name of input variables to the network.

class gluoncv.nn.feature.FeatureExtractor(network, outputs, inputs=('data'), pretrained=False, ctx=cpu(0), **kwargs)[source]

Feature extractor.

Parameters
  • network (str or HybridBlock or Symbol) – Logic chain: load from gluoncv.model_zoo if network is string. Convert to Symbol if network is HybridBlock

  • outputs (str or list of str) – The name of layers to be extracted as features

  • inputs (list of str or list of Symbol) – The inputs of network.

  • pretrained (bool) – Use pretrained parameters as in gluon.model_zoo

  • ctx (Context) – The context, e.g. mxnet.cpu(), mxnet.gpu(0).

Predictor for classification/box prediction.

class gluoncv.nn.predictor.ConvPredictor(num_channel, kernel=(3, 3), pad=(1, 1), stride=(1, 1), activation=None, use_bias=True, in_channels=0, **kwargs)[source]

Convolutional predictor. Convolutional predictors are widely used in object detection. They can be used to predict classification scores (1 channel per class) or box coordinates, which usually take 4 channels per box. The output is of shape (N, num_channel, H, W).

Parameters
  • num_channel (int) – Number of conv channels.

  • kernel (tuple of (int, int), default (3, 3)) – Conv kernel size as (H, W).

  • pad (tuple of (int, int), default (1, 1)) – Conv padding size as (H, W).

  • stride (tuple of (int, int), default (1, 1)) – Conv stride size as (H, W).

  • activation (str, optional) – Optional activation after conv, e.g. ‘relu’.

  • use_bias (bool) – Use bias in convolution. It is not necessary if BatchNorm is followed.

  • in_channels (int, default is 0) – The number of input channels to this layer. If not specified, initialization will be deferred to the first time forward is called and in_channels will be inferred from the shape of input data.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
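The default kernel, pad and stride keep the spatial size of the feature map unchanged, which follows from the standard convolution output-size arithmetic. The helper below is a hypothetical sketch of that arithmetic (assuming dilation 1); it does not call ConvPredictor.

```python
def conv_output_hw(in_hw, kernel=(3, 3), pad=(1, 1), stride=(1, 1)):
    """Hypothetical helper: output (H, W) of a convolution with dilation 1."""
    return tuple((size + 2 * p - k) // s + 1
                 for size, k, p, s in zip(in_hw, kernel, pad, stride))

same = conv_output_hw((38, 38))                    # defaults preserve H and W
halved = conv_output_hw((38, 38), stride=(2, 2))   # stride 2 roughly halves them
```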

class gluoncv.nn.predictor.FCPredictor(num_output, activation=None, use_bias=True, **kwargs)[source]

Fully connected predictor. Fully connected predictor is used to ignore spatial information and will output fixed-sized predictions.

Parameters
  • num_output (int) – Number of fully connected outputs.

  • activation (str, optional) – Optional activation after the fully connected layer, e.g. ‘relu’.

  • use_bias (bool) – Use bias in the fully connected layer. It is not necessary if BatchNorm is followed.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

Matchers for target assignment. Matchers are commonly used in object-detection for anchor-groundtruth matching. The matching process is a prerequisite to training target assignment. Matching is usually not required during testing.

class gluoncv.nn.matcher.BipartiteMatcher(threshold=1e-12, is_ascend=False, eps=1e-12, share_max=True)[source]

A Matcher implementing bipartite matching strategy.

Parameters
  • threshold (float) – Threshold used to ignore invalid paddings

  • is_ascend (bool) – Whether sort matching order in ascending order. Default is False.

  • eps (float) – Epsilon for floating number comparison

  • share_max (bool, default is True) – The maximum overlap between anchor/gt is shared by multiple ground truths. We recommend True for the Fast(er)-RCNN series, while for SSD it should be set to False for better results.

hybrid_forward(F, x)[source]

Bipartite matching.

Parameters

x (NDArray or Symbol) – IOU overlaps with shape (N, M), batching is supported.
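One common way to realize a bipartite (one-to-one) matching on an IOU matrix is a greedy pass over pairs sorted by overlap. The NumPy sketch below shows that idea for a single (N, M) matrix. Treating the strategy as greedy is an assumption, `greedy_bipartite_match` is a hypothetical name, the descending order corresponds to is_ascend=False, and share_max is not modeled.

```python
import numpy as np

def greedy_bipartite_match(ious, threshold=1e-12):
    """Hypothetical helper: ious is (N, M); returns (N,) matched index or -1."""
    ious = np.asarray(ious, dtype=np.float64)
    n, m = ious.shape
    matches = np.full(n, -1, dtype=np.int64)
    used_gt = np.zeros(m, dtype=bool)
    # Visit (anchor, ground truth) pairs from highest to lowest overlap.
    for flat in np.argsort(-ious, axis=None, kind="stable"):
        i, j = divmod(int(flat), m)
        if ious[i, j] <= threshold:
            break  # remaining pairs are padding or have no overlap
        if matches[i] == -1 and not used_gt[j]:
            matches[i] = j
            used_gt[j] = True
    return matches

ious = np.array([[0.9, 0.1],
                 [0.8, 0.7],
                 [0.2, 0.3]])
matches = greedy_bipartite_match(ious)  # each ground truth is used at most once
```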

class gluoncv.nn.matcher.CompositeMatcher(matchers)[source]

A Matcher that combines multiple strategies.

Parameters

matchers (list of Matcher) – Matcher is a Block/HybridBlock used to match two groups of boxes

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
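One plausible way to combine matcher outputs is to keep the first matcher's result wherever it found a match and fall back to later matchers elsewhere. The sketch below illustrates that rule in NumPy; the combination rule and the name `compose_matches` are assumptions, not confirmed by this page.

```python
import numpy as np

def compose_matches(results):
    """Hypothetical helper: results is a list of (N,) arrays, -1 means unmatched."""
    combined = np.asarray(results[0])
    for nxt in results[1:]:
        # Keep existing matches; fill unmatched entries from the next matcher.
        combined = np.where(combined >= 0, combined, np.asarray(nxt))
    return combined

bipartite = np.array([0, 1, -1, -1])  # e.g. output of a bipartite matcher
maximum = np.array([0, 0, 1, -1])     # e.g. output of a maximum matcher
combined = compose_matches([bipartite, maximum])
```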

class gluoncv.nn.matcher.MaximumMatcher(threshold)[source]

A Matcher implementing maximum matching strategy.

Parameters

threshold (float) – Matching threshold.

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.
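The maximum matching strategy can be sketched as "pick the ground truth with the highest overlap, if that overlap reaches the threshold". The NumPy version below is illustrative; `maximum_match` is a hypothetical name and the inclusive comparison is an assumption.

```python
import numpy as np

def maximum_match(ious, threshold):
    """Hypothetical helper: ious is (..., N, M); returns (..., N) index or -1."""
    ious = np.asarray(ious, dtype=np.float64)
    best = ious.argmax(axis=-1)   # ground truth with highest overlap per anchor
    best_iou = ious.max(axis=-1)
    return np.where(best_iou >= threshold, best, -1)

ious = np.array([[0.9, 0.1],
                 [0.4, 0.6],
                 [0.2, 0.3]])
matches = maximum_match(ious, threshold=0.5)
```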

Samplers for positive/negative/ignore sample selections. This module is used to select samples during training. Based on different strategies, we would like to choose different number of samples as positive, negative or ignore(don’t care). The purpose is to alleviate unbalanced training target in some circumstances. The output of sampler is an NDArray of the same shape as the matching results. Note: 1 for positive, -1 for negative, 0 for ignore.

class gluoncv.nn.sampler.NaiveSampler[source]

A naive sampler that takes all existing matching results. There is no ignored sample in this case.

hybrid_forward(F, x)[source]

Hybrid forward
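In NumPy terms, the naive strategy amounts to a single thresholding of the matching results. The sketch below assumes that unmatched anchors are marked with -1, and `naive_sample` is a hypothetical name.

```python
import numpy as np

def naive_sample(matches):
    """Hypothetical helper: matched entries become +1, unmatched (-1) become -1."""
    return np.where(np.asarray(matches) >= 0, 1, -1)

samples = naive_sample(np.array([0, 1, -1, 3]))
```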

class gluoncv.nn.sampler.OHEMSampler(ratio, min_samples=0, thresh=0.5)[source]

A sampler implementing Online Hard-negative mining. As described in paper https://arxiv.org/abs/1604.03540.

Parameters
  • ratio (float) – Ratio of negative vs. positive samples. Values >= 1.0 are recommended.

  • min_samples (int, default 0) – Minimum samples to be selected regardless of positive samples. For example, if positive samples is 0, we sometimes still want some num_negative samples to be selected.

  • thresh (float, default 0.5) – IOU overlap threshold of selected negative samples. IOU must not exceed this threshold such that good matching anchors won’t be selected as negative samples.

forward(x, logits, ious)[source]

Forward
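A simplified NumPy sketch of hard-negative selection for a single image. It departs from the real interface in ways worth naming: instead of logits it takes a precomputed per-anchor loss `neg_loss` (a hypothetical input), it takes the best IOU per anchor directly, and the way ratio and min_samples combine (taking the larger of the two) is an assumption.

```python
import numpy as np

def ohem_sample(matches, neg_loss, max_iou, ratio, min_samples=0, thresh=0.5):
    """Hypothetical helper; all inputs are (N,). Returns +1 / -1 / 0 per anchor."""
    matches = np.asarray(matches)
    samples = np.zeros(matches.shape, dtype=np.int64)  # 0 = ignore
    positive = matches >= 0
    samples[positive] = 1
    # Assumption: negative quota is ratio * num_positive, at least min_samples.
    num_neg = max(int(ratio * positive.sum()), min_samples)
    # Only unmatched anchors whose best IOU is below thresh may become negatives.
    candidate = (~positive) & (np.asarray(max_iou) < thresh)
    score = np.where(candidate, np.asarray(neg_loss, dtype=np.float64), -np.inf)
    hardest = np.argsort(-score, kind="stable")[:num_neg]  # highest loss first
    samples[hardest[np.isfinite(score[hardest])]] = -1
    return samples

matches = np.array([0, -1, -1, -1, -1])
neg_loss = np.array([0.0, 2.0, 0.5, 3.0, 1.0])
max_iou = np.array([0.8, 0.1, 0.2, 0.6, 0.3])  # anchor 3 overlaps too well
samples = ohem_sample(matches, neg_loss, max_iou, ratio=2)
```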

class gluoncv.nn.sampler.QuotaSampler(num_sample, pos_thresh, neg_thresh_high, neg_thresh_low=-inf, pos_ratio=0.5, neg_ratio=None, fill_negative=True)[source]

Sampler that handles limited quota for positive and negative samples.

Parameters
  • num_sample (int) – Number of samples for RCNN targets.

  • pos_thresh (float) – Proposals whose IOU is larger than pos_thresh are regarded as positive samples.

  • neg_thresh_high (float) – Proposals whose IOU is smaller than neg_thresh_high and larger than neg_thresh_low are regarded as negative samples. Proposals with IOU between pos_thresh and neg_thresh_high are ignored.

  • neg_thresh_low (float, default is -inf) – See neg_thresh_high.

  • pos_ratio (float, default is 0.5) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.

  • neg_ratio (float or None) – neg_ratio defines how many negative samples (neg_ratio * num_sample) are to be sampled. If None is provided, it equals 1 - pos_ratio.

  • fill_negative (bool) – If True, negative samples will fill the gap caused by insufficient positive samples. For example, suppose num_sample is 100, pos_ratio and neg_ratio are both 0.5, and 10 positive and 10000 negative samples are available, which are typical values. The output then contains all 10 positive samples; since that is fewer than 50 (100 * 0.5), negative samples fill the remaining 40 slots. If fill_negative == False, the 40 slots are filled with 0 (ignore).

forward(matches, ious)[source]

Quota Sampler

Parameters
  • matches (NDArray or Symbol) – Matching results, positive number for positive matching, -1 for not matched.

  • ious (NDArray or Symbol) – IOU overlaps with shape (N, M), batching is supported.

Returns

Sampling results with same shape as matches. 1 for positive, -1 for negative, 0 for ignore.

Return type

NDArray or Symbol
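The quota arithmetic from the fill_negative example can be sketched as a small pure-Python function. This only reproduces the counting described in the parameter documentation; `quota_counts` is a hypothetical name, and the rounding of the quotas is an assumption.

```python
def quota_counts(num_sample, avail_pos, avail_neg,
                 pos_ratio=0.5, neg_ratio=None, fill_negative=True):
    """Hypothetical helper: returns (num_pos, num_neg) actually selected."""
    if neg_ratio is None:
        neg_ratio = 1.0 - pos_ratio
    # Positives are limited by both the quota and what is available.
    num_pos = min(avail_pos, int(round(pos_ratio * num_sample)))
    max_neg = int(round(neg_ratio * num_sample))
    if fill_negative:
        # Unused positive slots are handed over to negatives.
        max_neg = max(max_neg, num_sample - num_pos)
    num_neg = min(avail_neg, max_neg)
    return num_pos, num_neg

filled = quota_counts(100, avail_pos=10, avail_neg=10000)
not_filled = quota_counts(100, avail_pos=10, avail_neg=10000, fill_negative=False)
```

With the values from the example, 10 positives are kept in both cases; negatives take 90 slots when filling is enabled and 50 otherwise, leaving 40 slots unused.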

class gluoncv.nn.sampler.QuotaSamplerOp(num_sample, pos_thresh, neg_thresh_high=0.5, neg_thresh_low=-inf, pos_ratio=0.5, neg_ratio=None, fill_negative=True)[source]

Sampler that handles limited quota for positive and negative samples.

This is a custom Operator used inside HybridBlock.

Parameters
  • num_sample (int) – Number of samples for RCNN targets.

  • pos_thresh (float) – Proposals whose IOU is larger than pos_thresh are regarded as positive samples.

  • neg_thresh_high (float, default is 0.5) – Proposals whose IOU is smaller than neg_thresh_high and larger than neg_thresh_low are regarded as negative samples. Proposals with IOU between pos_thresh and neg_thresh_high are ignored.

  • neg_thresh_low (float, default is -inf) – See neg_thresh_high.

  • pos_ratio (float, default is 0.5) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.

  • neg_ratio (float or None) – neg_ratio defines how many negative samples (neg_ratio * num_sample) are to be sampled. If None is provided, it equals 1 - pos_ratio.

  • fill_negative (bool) – If True, negative samples will fill the gap caused by insufficient positive samples. For example, suppose num_sample is 100, pos_ratio and neg_ratio are both 0.5, and 10 positive and 10000 negative samples are available, which are typical values. The output then contains all 10 positive samples; since that is fewer than 50 (100 * 0.5), negative samples fill the remaining 40 slots. If fill_negative == False, the 40 slots are filled with 0 (ignore).

backward(req, out_grad, in_data, out_data, in_grad, aux)[source]

Backward interface. Can override when creating new operators.

Parameters
  • req (list of str) – how to assign to in_grad. can be ‘null’, ‘write’, or ‘add’. You can optionally use self.assign(dst, req, src) to handle this.

  • out_grad (list of NDArrays) – input and output for backward. See document for corresponding arguments of Operator::Backward

  • in_data (list of NDArrays) – input and output for backward. See document for corresponding arguments of Operator::Backward

  • out_data (list of NDArrays) – input and output for backward. See document for corresponding arguments of Operator::Backward

  • in_grad (list of NDArrays) – input and output for backward. See document for corresponding arguments of Operator::Backward

  • aux (list of NDArrays) – input and output for backward. See document for corresponding arguments of Operator::Backward

forward(is_train, req, in_data, out_data, aux)[source]

Quota Sampler

Parameters
  • in_data (array-like of Symbol) – [matches, ious], see below.

  • matches (NDArray or Symbol) – Matching results, positive number for positive matching, -1 for not matched.

  • ious (NDArray or Symbol) – IOU overlaps with shape (N, M), batching is supported.

Returns

Sampling results with same shape as matches. 1 for positive, -1 for negative, 0 for ignore.

Return type

NDArray or Symbol

class gluoncv.nn.sampler.QuotaSamplerProp(num_sample, pos_thresh, neg_thresh_high=0.5, neg_thresh_low=0.0, pos_ratio=0.5, neg_ratio=None, fill_negative=True)[source]

Property for QuotaSamplerOp.

Parameters
  • num_sample (int) – Number of samples for RCNN targets.

  • pos_thresh (float) – Proposals whose IOU is larger than pos_thresh are regarded as positive samples.

  • neg_thresh_high (float, default is 0.5) – Proposals whose IOU is smaller than neg_thresh_high and larger than neg_thresh_low are regarded as negative samples. Proposals with IOU between pos_thresh and neg_thresh_high are ignored.

  • neg_thresh_low (float, default is 0.0) – See neg_thresh_high.

  • pos_ratio (float, default is 0.5) – pos_ratio defines how many positive samples (pos_ratio * num_sample) are to be sampled.

  • neg_ratio (float or None) – neg_ratio defines how many negative samples (neg_ratio * num_sample) are to be sampled. If None is provided, it equals 1 - pos_ratio.

  • fill_negative (bool) – If True, negative samples will fill the gap caused by insufficient positive samples. For example, suppose num_sample is 100, pos_ratio and neg_ratio are both 0.5, and 10 positive and 10000 negative samples are available, which are typical values. The output then contains all 10 positive samples; since that is fewer than 50 (100 * 0.5), negative samples fill the remaining 40 slots. If fill_negative == False, the 40 slots are filled with 0 (ignore).

create_operator(ctx, in_shapes, in_dtypes)[source]

Create an operator that carries out the real computation given the context, input shapes, and input data types.

infer_shape(in_shape)[source]

infer_shape interface. Can override when creating new operators.

Parameters

in_shape (list) – List of argument shapes in the same order as declared in list_arguments.

Returns

  • in_shape (list) – List of argument shapes. Can be modified from in_shape.

  • out_shape (list) – List of output shapes calculated from in_shape, in the same order as declared in list_outputs.

  • aux_shape (Optional, list) – List of aux shapes calculated from in_shape, in the same order as declared in list_auxiliary_states.

infer_type(in_type)[source]

infer_type interface. override to create new operators

Parameters

in_type (list of np.dtype) – list of argument types in the same order as declared in list_arguments.

Returns

  • in_type (list) – list of argument types. Can be modified from in_type.

  • out_type (list) – list of output types calculated from in_type, in the same order as declared in list_outputs.

  • aux_type (Optional, list) – list of aux types calculated from in_type, in the same order as declared in list_auxiliary_states.

list_arguments()[source]

list_arguments interface. Can override when creating new operators.

Returns

arguments – List of argument blob names.

Return type

list

list_outputs()[source]

list_outputs interface. Can override when creating new operators.

Returns

outputs – List of output blob names.

Return type

list