gluoncv.utils

gluoncv.utils provides a broad range of utility functions covering visualization, file handling, downloading, and training helpers.

Visualization

plot_image

Visualize image.

get_color_pallete

Map a single-channel label image to a color palette image for visualization.

plot_bbox

Visualize bounding boxes.

expand_mask

Expand instance segmentation mask to full image size.

plot_mask

Visualize segmentation mask.

plot_network

Plot network to visualize internal structures.

Miscellaneous

download

Download a file from a given URL.

makedirs

Create a directory recursively if it does not exist.

seed

Seed the generators of Python's built-in random, numpy.random, and mxnet.random.

Training Helpers

LRScheduler

Learning Rate Scheduler

set_lr_mult

Reset lr_mult to new value for all parameters that match pattern

Bounding Box Utils

bbox_iou

Calculate Intersection-Over-Union(IOU) of two bounding boxes.

API Reference

GluonCV Utility functions.

class gluoncv.utils.LRScheduler(mode, base_lr=0.1, target_lr=0, niters=0, nepochs=0, iters_per_epoch=0, offset=0, power=2, step_iter=None, step_epoch=None, step_factor=0.1, baselr=None, targetlr=None)[source]

Learning Rate Scheduler

Parameters
  • mode (str) – Modes for learning rate scheduler. Currently it supports ‘constant’, ‘step’, ‘linear’, ‘poly’ and ‘cosine’.

  • base_lr (float) – Base learning rate, i.e. the starting learning rate.

  • target_lr (float) – Target learning rate, i.e. the ending learning rate. With constant mode target_lr is ignored.

  • niters (int) – Number of iterations to be scheduled.

  • nepochs (int) – Number of epochs to be scheduled.

  • iters_per_epoch (int) – Number of iterations in each epoch.

  • offset (int) – Number of iterations before this scheduler.

  • power (float) – Power parameter of poly scheduler.

  • step_iter (list) – A list of iterations to decay the learning rate.

  • step_epoch (list) – A list of epochs to decay the learning rate.

  • step_factor (float) – Learning rate decay factor.
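
As a rough illustration of how the modes listed above differ, the following sketch computes a learning rate for a given iteration. It is not the library's code: the function name `sketch_lr`, the interpolation formulas, and the convention that progress runs from 0 at the first iteration to 1 at the last one are assumptions chosen to be consistent with the parameter descriptions.

```python
import math


def sketch_lr(mode, t, niters, base_lr=0.1, target_lr=0.0,
              power=2, step_iter=None, step_factor=0.1):
    """Hypothetical helper: learning rate at iteration t (0-based)."""
    # Clamp t into the scheduled range and express it as progress in [0, 1].
    last = max(niters - 1, 1)
    progress = min(max(t, 0), last) / last

    if mode == 'constant':
        return base_lr  # target_lr is ignored in constant mode
    if mode == 'step':
        # Multiply by step_factor once for every decay point already reached.
        passed = sum(1 for s in (step_iter or []) if s <= t)
        return base_lr * step_factor ** passed
    if mode == 'linear':
        factor = 1 - progress
    elif mode == 'poly':
        factor = (1 - progress) ** power
    elif mode == 'cosine':
        factor = (1 + math.cos(math.pi * progress)) / 2
    else:
        raise ValueError('unknown mode: %s' % mode)
    # Interpolate between the starting and ending learning rates.
    return target_lr + (base_lr - target_lr) * factor
```

Under these assumptions, 'linear', 'poly', and 'cosine' all start at base_lr and end at target_lr; they differ only in the shape of the curve in between.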

class gluoncv.utils.LRSequential(schedulers)[source]

Compose Learning Rate Schedulers

Parameters

schedulers (list) – list of LRScheduler objects

class gluoncv.utils.TrainingHistory(labels)[source]

Training History Record and Plot

Parameters

labels (list of str) – List of names of the labels in the history.

plot(labels=None, colors=None, y_lim=(0, 1), save_path=None, legend_loc='upper right')[source]

Plot the training history

Parameters
  • labels (list of str) – List of label names to plot.

  • colors (list of str) – List of line colors.

  • y_lim (tuple) – Limits of the y-axis, (0, 1) by default.

  • save_path (str) – Path to save the plot. If None, the plot is shown on screen.

  • legend_loc (str) – Location of the legend, 'upper right' by default.

update(values)[source]

Update the training history

Parameters

values (list of float) – List of metric scores for each label.

gluoncv.utils.bbox_iou(bbox_a, bbox_b, offset=0)[source]

Calculate Intersection-Over-Union(IOU) of two bounding boxes.

Parameters
  • bbox_a (numpy.ndarray) – An ndarray with shape \((N, 4)\).

  • bbox_b (numpy.ndarray) – An ndarray with shape \((M, 4)\).

  • offset (float or int, default is 0) – Controls whether the width (or height) is computed as (right - left + offset). Note that the offset must be 0 for normalized bboxes, whose ranges are in [0, 1].

Returns

An ndarray with shape \((N, M)\) containing the IOU between each pair of bounding boxes in bbox_a and bbox_b.

Return type

numpy.ndarray
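
The role of offset is easiest to see in a small pure-Python sketch of pairwise IOU. This is an illustration, not the library's implementation: the helper name `iou_matrix` is hypothetical, boxes are assumed to be in (xmin, ymin, xmax, ymax) order, and nested lists stand in for ndarrays.

```python
def iou_matrix(boxes_a, boxes_b, offset=0):
    """Hypothetical pure-Python helper: pairwise IOU of two box lists."""
    def area(box):
        return (box[2] - box[0] + offset) * (box[3] - box[1] + offset)

    result = []
    for a in boxes_a:
        row = []
        for b in boxes_b:
            # Intersection rectangle: innermost corners of the two boxes.
            left, top = max(a[0], b[0]), max(a[1], b[1])
            right, bottom = min(a[2], b[2]), min(a[3], b[3])
            width = max(right - left + offset, 0)
            height = max(bottom - top + offset, 0)
            inter = width * height
            union = area(a) + area(b) - inter
            row.append(inter / union if union > 0 else 0.0)
        result.append(row)
    return result
```

With offset=1, a box such as (0, 0, 1, 1) is treated as covering 2 x 2 pixels, which is a common convention for integer pixel coordinates; with offset=0 the same box has area 1.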

gluoncv.utils.check_sha1(filename, sha1_hash)[source]

Check whether the sha1 hash of the file content matches the expected hash.

Parameters
  • filename (str) – Path to the file.

  • sha1_hash (str) – Expected sha1 hash in hexadecimal digits.

Returns

Whether the file content matches the expected hash.

Return type

bool
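
A check of this kind can be sketched with the standard library alone. The helper name `sha1_matches` and the chunked reading are assumptions made for illustration; the library's own implementation may differ in detail.

```python
import hashlib


def sha1_matches(filename, sha1_hash, chunk_size=1 << 20):
    """Hypothetical helper: compare a file's sha1 digest with sha1_hash."""
    sha1 = hashlib.sha1()
    with open(filename, 'rb') as f:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            sha1.update(chunk)
    return sha1.hexdigest() == sha1_hash
```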

gluoncv.utils.download(url, path=None, overwrite=False, sha1_hash=None)[source]

Download a given URL.

Parameters
  • url (str) – URL to download.

  • path (str, optional) – Destination path to store the downloaded file. By default, the file is stored in the current directory with the same name as in url.

  • overwrite (bool, optional) – Whether to overwrite the destination file if it already exists.

  • sha1_hash (str, optional) – Expected sha1 hash in hexadecimal digits. An existing file is ignored when a hash is specified but does not match.

Returns

The file path of the downloaded file.

Return type

str

gluoncv.utils.export_block(path, block, data_shape=None, epoch=0, preprocess=True, layout='HWC', ctx=cpu(0))[source]

Helper function to export a HybridBlock to symbol JSON to be used by SymbolBlock.imports, mxnet.mod.Module or the C++ interface.

Parameters
  • path (str) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number.

  • block (mxnet.gluon.HybridBlock) – The hybridizable block. Note that normal gluon.Block is not supported.

  • data_shape (tuple of int, default is None) – Fake data shape used only for export, in format (H, W, C). If you don’t specify data_shape, export_block will try some common shapes, e.g., (224, 224, 3), (256, 256, 3), (299, 299, 3), (512, 512, 3)… If any of these shapes goes through, the export succeeds.

  • epoch (int) – Epoch number of saved model.

  • preprocess (mxnet.gluon.HybridBlock, default is True) – Preprocessing block applied before the network. By default (True), it subtracts the mean [123.675, 116.28, 103.53], divides by the std [58.395, 57.12, 57.375], and converts the original image (layout B, H, W, C with values in [0, 255]) to a tensor (B, C, H, W) used as network input. This is the default preprocessing behavior of all GluonCV pre-trained models. You can pass a custom preprocessing HybridBlock, or disable preprocessing by setting preprocess=None.

  • layout (str, default is 'HWC') – The layout for raw input data. By default is HWC. Supports ‘HWC’ and ‘CHW’. Note that image channel order is always RGB.

  • ctx (mx.Context, default mx.cpu()) – Network context.

Returns

Return type

None
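
To make the default preprocessing described for the preprocess parameter concrete, here is a minimal pure-Python sketch for a single image. The helper name `sketch_preprocess` and the use of nested lists instead of tensors are assumptions for illustration; only the mean and std values come from the description above.

```python
MEAN = (123.675, 116.28, 103.53)  # per-channel mean, RGB order
STD = (58.395, 57.12, 57.375)     # per-channel standard deviation


def sketch_preprocess(image_hwc):
    """Hypothetical helper: normalize an H x W x 3 nested list to C x H x W."""
    height, width = len(image_hwc), len(image_hwc[0])
    return [
        # One H x W plane per channel, each value shifted and scaled.
        [[(image_hwc[y][x][c] - MEAN[c]) / STD[c] for x in range(width)]
         for y in range(height)]
        for c in range(3)
    ]
```

A pixel equal to the mean maps to 0.0 in every channel, and the output is indexed as [channel][row][column] rather than [row][column][channel].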

gluoncv.utils.freeze_bn(net, use_global_stats=True)[source]

Freeze BatchNorm layers by setting use_global_stats to True

Parameters
  • net (mxnet.gluon.Block) – The network whose BatchNorm layers are going to be modified

  • use_global_stats (bool) – The value of use_global_stats to set for all BatchNorm layers

Returns

Original network with BatchNorm layers modified.

Return type

mxnet.gluon.Block

gluoncv.utils.makedirs(path)[source]

Create a directory recursively if it does not exist. Similar to mkdir -p; you do not need to check for existence before calling this function.

Parameters

path (str) – Path of the desired dir
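
The same behavior can be approximated with the Python 3 standard library, as in this sketch. The helper name `ensure_dir` is hypothetical and this is not necessarily how the library implements it.

```python
import os


def ensure_dir(path):
    """Hypothetical helper: create path and any missing parent directories."""
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(path, exist_ok=True)
```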

gluoncv.utils.recursive_visit(net, callback, **kwargs)[source]

Recursively visit and apply callback to a net and its sub-net

Parameters
  • net (mxnet.gluon.Block) – The network to recursively visit

  • callback (function) – The callback function to apply to each net block. Its first argument needs to be the block
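
The visiting pattern can be sketched on a toy block tree. Everything here is a stand-in for illustration: `ToyBlock`, `visit`, and `freeze_if_batchnorm` are hypothetical names, and a real mxnet.gluon.Block exposes its children differently.

```python
class ToyBlock:
    """Hypothetical stand-in for a network block with child blocks."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.use_global_stats = False


def visit(block, callback, **kwargs):
    # Apply the callback to this block, then descend into each child.
    callback(block, **kwargs)
    for child in block.children:
        visit(child, callback, **kwargs)


def freeze_if_batchnorm(block, use_global_stats=True):
    # Only blocks whose name marks them as BatchNorm are modified.
    if block.name.startswith('batchnorm'):
        block.use_global_stats = use_global_stats


net = ToyBlock('net', [
    ToyBlock('conv0'),
    ToyBlock('stage1', [ToyBlock('batchnorm0'), ToyBlock('conv1')]),
])
visit(net, freeze_if_batchnorm, use_global_stats=True)
```

The extra keyword arguments are forwarded unchanged to every callback invocation, which is how a helper in the style of freeze_bn could pass its use_global_stats value down the tree.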

gluoncv.utils.set_lr_mult(net, pattern, mult=1.0, verbose=False)[source]

Reset lr_mult to new value for all parameters that match pattern

Parameters
  • net (mxnet.gluon.Block) – The network whose parameters are going to be adjusted.

  • pattern (str) – Regex matching pattern for targeting parameters.

  • mult (float, default 1.0) – The new learning rate multiplier.

  • verbose (bool) – Print which parameters are being modified if set to True.

Returns

Original network with learning rate multipliers modified.

Return type

mxnet.gluon.Block
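
The pattern-based selection can be illustrated on a plain dictionary of parameter names. This is a sketch under assumptions: `sketch_set_lr_mult` is a hypothetical helper, the parameter names are made up, and matching from the start of the name (re.match) is an assumed convention rather than documented behavior.

```python
import re


def sketch_set_lr_mult(lr_mults, pattern, mult=1.0):
    """Hypothetical helper: update multipliers in a name -> lr_mult dict."""
    regex = re.compile(pattern)
    for name in lr_mults:
        # Assumption: the pattern must match from the start of the name.
        if regex.match(name):
            lr_mults[name] = mult
    return lr_mults


# Made-up parameter names, each starting with a multiplier of 1.0.
params = {'conv0_weight': 1.0, 'conv0_bias': 1.0, 'dense0_bias': 1.0}
sketch_set_lr_mult(params, r'.*_bias', mult=2.0)
```

After the call, only the two bias entries carry the new multiplier; the weight entry is untouched.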

gluoncv.utils.try_import_dali()[source]

Try to import NVIDIA DALI at runtime.

Visualization tools

class gluoncv.utils.viz.DeNormalize(mean, std)[source]

Denormalize the image

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

gluoncv.utils.viz.cv_merge_two_images(img1, img2, alpha=0.5, size=None)[source]

Merge two images with OpenCV.

Parameters
  • img1 (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • img2 (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • alpha (float, optional, default 0.5) – Transparency of img2

  • size (list, optional, default None) – The output size of the merged image

Returns

The merged image

Return type

numpy.ndarray

gluoncv.utils.viz.cv_plot_bbox(img, bboxes, scores=None, labels=None, thresh=0.5, class_names=None, colors=None, absolute_coordinates=True, scale=1.0)[source]

Visualize bounding boxes with OpenCV.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • labels (numpy.ndarray or mxnet.nd.NDArray, optional) – Class labels of the provided bboxes with shape N.

  • thresh (float, optional, default 0.5) – Display threshold, used if scores is provided. Boxes with scores below thresh are not displayed, which is visually cleaner when there are many bounding boxes with very small scores.

  • class_names (list of str, optional) – Class names used for display, indexed by class label.

  • colors (dict, optional) – You can provide desired colors as {0: (255, 0, 0), 1:(0, 255, 0), …}, otherwise random colors will be substituted.

  • absolute_coordinates (bool) – If True, coordinates are treated as absolute values; otherwise they are interpreted as normalized values in the range [0, 1].

  • scale (float) – The scale of output image, which may affect the positions of boxes

Returns

The image with detected results.

Return type

numpy.ndarray

gluoncv.utils.viz.cv_plot_image(img, scale=1, upperleft_txt=None, upperleft_txt_corner=(10, 100), left_txt_list=None, left_txt_corner=(10, 150), title_txt_list=None, title_txt_corner=(500, 50), canvas_name='demo')[source]

Visualize image with OpenCV.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • scale (float) – The scaling factor of the output image

  • upperleft_txt (str, optional, default is None) – If present, the string is printed at the upper-left corner.

  • upperleft_txt_corner (tuple, optional, default is (10, 100)) – The bottom-left corner of upperleft_txt.

  • left_txt_list (list of str, optional, default is None) – If present, each string in the list is printed close to the left edge.

  • left_txt_corner (tuple, optional, default is (10, 150)) – The bottom-left corner of left_txt_list.

  • title_txt_list (list of str, optional, default is None) – If present, each string in the list is printed close to the top edge.

  • title_txt_corner (tuple, optional, default is (500, 50)) – The bottom-left corner of title_txt_list.

  • canvas_name (str, optional, default is 'demo') – The name of the canvas to plot the image

Examples

from matplotlib import pyplot as plt
ax = plot_image(img)
plt.show()

gluoncv.utils.viz.cv_plot_keypoints(img, coords, confidence, class_ids, bboxes, scores, box_thresh=0.5, keypoint_thresh=0.2, scale=1.0, **kwargs)[source]

Visualize keypoints with OpenCV.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • coords (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 2.

  • confidence (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 1.

  • class_ids (numpy.ndarray or mxnet.nd.NDArray) – Class IDs.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • box_thresh (float, optional, default 0.5) – Display threshold, used if scores is provided. Boxes with scores below box_thresh are not displayed.

  • keypoint_thresh (float, optional, default 0.2) – Keypoints with confidence less than keypoint_thresh will be ignored in display.

  • scale (float) – The scale of output image, which may affect the positions of boxes

Returns

The image with estimated pose.

Return type

numpy.ndarray

gluoncv.utils.viz.expand_mask(masks, bboxes, im_shape, scores=None, thresh=0.5, scale=1.0, sortby=None)[source]

Expand instance segmentation mask to full image size.

Parameters
  • masks (numpy.ndarray or mxnet.nd.NDArray) – Binary images with shape N, M, M

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes

  • im_shape (tuple) – Tuple of length 2: (width, height)

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • thresh (float, optional, default 0.5) – Display threshold, used if scores is provided. Boxes with scores below thresh are not displayed, which is visually cleaner when there are many bounding boxes with very small scores.

  • sortby (str, optional, default None) – If not None, sort the color palette for masks by the given attributes of each bounding box. Valid inputs are ‘area’, ‘xmin’, ‘ymin’, ‘xmax’, ‘ymax’.

  • scale (float) – The scale of output image, which may affect the positions of boxes

Returns

  • numpy.ndarray – Binary images with shape N, height, width

  • numpy.ndarray – Index array of sorted masks

gluoncv.utils.viz.get_color_pallete(npimg, dataset='pascal_voc')[source]

Map a single-channel label image to a color palette image for visualization.

Parameters
  • npimg (numpy.ndarray) – Single channel image with shape H, W, 1.

  • dataset (str, default: 'pascal_voc') – The dataset that model pretrained on. (‘pascal_voc’, ‘ade20k’)

Returns

out_img – Image with color palette

Return type

PIL.Image

gluoncv.utils.viz.plot_bbox(img, bboxes, scores=None, labels=None, thresh=0.5, class_names=None, colors=None, ax=None, reverse_rgb=False, absolute_coordinates=True)[source]

Visualize bounding boxes.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • labels (numpy.ndarray or mxnet.nd.NDArray, optional) – Class labels of the provided bboxes with shape N.

  • thresh (float, optional, default 0.5) – Display threshold, used if scores is provided. Boxes with scores below thresh are not displayed, which is visually cleaner when there are many bounding boxes with very small scores.

  • class_names (list of str, optional) – Class names used for display, indexed by class label.

  • colors (dict, optional) – You can provide desired colors as {0: (255, 0, 0), 1:(0, 255, 0), …}, otherwise random colors will be substituted.

  • ax (matplotlib axes, optional) – You can reuse previous axes if provided.

  • reverse_rgb (bool, optional) – Reverse RGB<->BGR orders if True.

  • absolute_coordinates (bool) – If True, coordinates are treated as absolute values; otherwise they are interpreted as normalized values in the range [0, 1].

Returns

The plotted axes.

Return type

matplotlib axes
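
How thresh and absolute_coordinates interact can be sketched without any plotting. The helper `boxes_to_draw` and its width and height arguments are hypothetical; the sketch only shows which boxes would remain and how normalized coordinates could be scaled to the image size, and it assumes boxes in (xmin, ymin, xmax, ymax) order.

```python
def boxes_to_draw(bboxes, scores=None, thresh=0.5,
                  absolute_coordinates=True, width=1, height=1):
    """Hypothetical helper: pick and scale the boxes that would be shown."""
    selected = []
    for i, (xmin, ymin, xmax, ymax) in enumerate(bboxes):
        # Skip low-confidence boxes when scores are available.
        if scores is not None and scores[i] < thresh:
            continue
        if not absolute_coordinates:
            # Normalized coordinates are scaled by the image size.
            xmin, xmax = xmin * width, xmax * width
            ymin, ymax = ymin * height, ymax * height
        selected.append((xmin, ymin, xmax, ymax))
    return selected
```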

gluoncv.utils.viz.plot_image(img, ax=None, reverse_rgb=False)[source]

Visualize image.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • ax (matplotlib axes, optional) – You can reuse previous axes if provided.

  • reverse_rgb (bool, optional) – Reverse RGB<->BGR orders if True.

Returns

The plotted axes.

Return type

matplotlib axes

Examples

from matplotlib import pyplot as plt
ax = plot_image(img)
plt.show()

gluoncv.utils.viz.plot_keypoints(img, coords, confidence, class_ids, bboxes, scores, box_thresh=0.5, keypoint_thresh=0.2, **kwargs)[source]

Visualize keypoints.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • coords (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 2.

  • confidence (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 1.

  • class_ids (numpy.ndarray or mxnet.nd.NDArray) – Class IDs.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • box_thresh (float, optional, default 0.5) – Display threshold, used if scores is provided. Boxes with scores below box_thresh are not displayed.

  • keypoint_thresh (float, optional, default 0.2) – Keypoints with confidence less than keypoint_thresh will be ignored in display.

Returns

The plotted axes.

Return type

matplotlib axes

gluoncv.utils.viz.plot_mask(img, masks, alpha=0.5)[source]

Visualize segmentation mask.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • masks (numpy.ndarray or mxnet.nd.NDArray) – Binary images with shape N, H, W.

  • alpha (float, optional, default 0.5) – Transparency of plotted mask

Returns

The image plotted with segmentation masks

Return type

numpy.ndarray

gluoncv.utils.viz.plot_mxboard(block, logdir='./logs')[source]

Plot network to visualize internal structures.

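Parameters
  • block (mxnet.gluon.HybridBlock) – A hybridizable network to be visualized.

  • logdir (str) – Directory in which the logs are saved, './logs' by default.

gluoncv.utils.viz.plot_network(block, shape=(1, 3, 224, 224), save_prefix=None)[source]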

Plot network to visualize internal structures.

Parameters
  • block (mxnet.gluon.HybridBlock) – A hybridizable network to be visualized.

  • shape (tuple of int) – Desired input shape, default is (1, 3, 224, 224).

  • save_prefix (str or None) – If not None, will save rendered pdf to disk with prefix.

Custom evaluation metrics

class gluoncv.utils.metrics.COCODetectionMetric(dataset, save_prefix, use_time=True, cleanup=False, score_thresh=0.05, data_shape=None)[source]

Detection metric for COCO bbox task.

Parameters
  • dataset (instance of gluoncv.data.COCODetection) – The validation dataset.

  • save_prefix (str) – Prefix for the saved JSON results.

  • use_time (bool) – Append unique datetime string to created JSON file name if True.

  • cleanup (bool) – Remove created JSON file if True.

  • score_thresh (float) – Detection results with confidence scores smaller than score_thresh will be discarded before saving to results.

  • data_shape (tuple of int, default is None) – If data_shape is provided as (height, width), we will rescale bounding boxes when saving the predictions. This is helpful when SSD/YOLO box predictions cannot be rescaled conveniently. Note that the data_shape must be fixed for all validation images.

get()[source]

Get evaluation metrics.

reset()[source]

Resets the internal evaluation result to initial state.

update(pred_bboxes, pred_labels, pred_scores, *args, **kwargs)[source]

Update internal buffer with latest predictions. Note that the statistics are not available until you call self.get() to return the metrics.

Parameters
  • pred_bboxes (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes with shape B, N, 4. Where B is the size of mini-batch, N is the number of bboxes.

  • pred_labels (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes labels with shape B, N.

  • pred_scores (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes scores with shape B, N.

class gluoncv.utils.metrics.COCOKeyPointsMetric(dataset, save_prefix, use_time=True, cleanup=False, in_vis_thresh=0.2, data_shape=None)[source]

Keypoint detection metric for the COCO keypoints task.

Parameters
  • dataset (instance of gluoncv.data.COCODetection) – The validation dataset.

  • save_prefix (str) – Prefix for the saved JSON results.

  • use_time (bool) – Append unique datetime string to created JSON file name if True.

  • cleanup (bool) – Remove created JSON file if True.

  • in_vis_thresh (float) – Detection results with confidence scores smaller than in_vis_thresh will be discarded before saving to results.

  • data_shape (tuple of int, default is None) – If data_shape is provided as (height, width), we will rescale bounding boxes when saving the predictions. This is helpful when SSD/YOLO box predictions cannot be rescaled conveniently. Note that the data_shape must be fixed for all validation images.

get()[source]

Get evaluation metrics.

reset()[source]

Resets the internal evaluation result to initial state.

update(preds, maxvals, score, imgid, *args, **kwargs)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

class gluoncv.utils.metrics.HeatmapAccuracy(axis=1, name='heatmap_accuracy', hm_type='gaussian', threshold=0.5, output_names=None, label_names=None, ignore_labels=None)[source]

Computes accuracy classification score with optional ignored labels. The accuracy score is defined as

\[\text{accuracy}(y, \hat{y}) = \frac{1}{n} \sum_{i=0}^{n-1} \mathbf{1}(\hat{y}_i = y_i)\]

Parameters
  • axis (int, default=1) – The axis that represents classes

  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • ignore_labels (int or iterable of integers, optional) – If provided as not None, will ignore these labels during update.
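
The accuracy formula, together with ignored labels, can be sketched in plain Python. The helper `sketch_accuracy` is hypothetical, and two conventions are assumptions: predictions are taken as the argmax over per-class scores, and samples whose label is in ignore_labels are dropped from both the numerator and the denominator.

```python
def sketch_accuracy(labels, scores, ignore_labels=()):
    """Hypothetical helper: fraction of argmax predictions equal to labels."""
    # Reduce each row of per-class scores to the index of its largest entry.
    predicted = [max(range(len(row)), key=row.__getitem__) for row in scores]
    # Assumption: ignored labels are excluded from the count entirely.
    kept = [(y, y_hat) for y, y_hat in zip(labels, predicted)
            if y not in ignore_labels]
    if not kept:
        return float('nan')
    return sum(1 for y, y_hat in kept if y == y_hat) / len(kept)
```

For labels [0, 1, 1] and scores [[0.3, 0.7], [0, 1.0], [0.4, 0.6]], every row predicts class 1, so two of the three samples are correct.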

Examples

>>> import mxnet as mx
>>> predicts = [mx.nd.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.nd.array([0, 1, 1])]
>>> acc = mx.metric.Accuracy()
>>> acc.update(preds = predicts, labels = labels)
>>> print(acc.get())
('accuracy', 0.6666666666666666)

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data with class indices as values, one per sample.

  • preds (list of NDArray) – Prediction values for samples. Each prediction value can either be the class index, or a vector of likelihoods for all classes.

class gluoncv.utils.metrics.SegmentationMetric(nclass)[source]

Computes pixAcc and mIoU metric scores

get()[source]

Gets the current evaluation result.

Returns

metrics – pixAcc and mIoU

Return type

tuple of float
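
For readers unfamiliar with the two scores, here is a small pure-Python sketch of pixel accuracy (pixAcc) and mean intersection-over-union (mIoU). It is an illustration only: `pix_acc_and_miou` is a hypothetical helper working on flat, non-empty lists of class indices, and skipping classes that appear in neither labels nor predictions is an assumption that may differ from the library's averaging.

```python
def pix_acc_and_miou(labels, preds, nclass):
    """Hypothetical helper: labels and preds are flat lists of class indices."""
    correct = 0
    inter = [0] * nclass
    union = [0] * nclass
    for y, y_hat in zip(labels, preds):
        if y == y_hat:
            correct += 1
            inter[y] += 1
            union[y] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved.
            union[y] += 1
            union[y_hat] += 1
    pix_acc = correct / len(labels)
    # Assumption: classes absent from both labels and predictions are skipped.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return pix_acc, sum(ious) / len(ious)
```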

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (‘NDArray’ or list of NDArray) – The labels of the data.

  • preds (‘NDArray’ or list of NDArray) – Predicted values.

class gluoncv.utils.metrics.VOC07MApMetric(*args, **kwargs)[source]

Mean average precision metric for PASCAL VOC 07 dataset

Parameters
  • iou_thresh (float) – IOU overlap threshold for TP.

  • class_names (list of str, optional) – If provided, will print out AP for each class.

class gluoncv.utils.metrics.VOCMApMetric(iou_thresh=0.5, class_names=None)[source]

Calculate mean AP for object detection task

Parameters
  • iou_thresh (float) – IOU overlap threshold for TP.

  • class_names (list of str, optional) – If provided, will print out AP for each class.

get()[source]

Get the current evaluation result.

Returns

  • name (str) – Name of the metric.

  • value (float) – Value of the evaluation.

reset()[source]

Clear the internal statistics to initial state.

update(pred_bboxes, pred_labels, pred_scores, gt_bboxes, gt_labels, gt_difficults=None)[source]

Update internal buffer with latest prediction and gt pairs.

Parameters
  • pred_bboxes (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes with shape B, N, 4. Where B is the size of mini-batch, N is the number of bboxes.

  • pred_labels (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes labels with shape B, N.

  • pred_scores (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes scores with shape B, N.

  • gt_bboxes (mxnet.NDArray or numpy.ndarray) – Ground-truth bounding boxes with shape B, M, 4. Where B is the size of mini-batch, M is the number of ground-truths.

  • gt_labels (mxnet.NDArray or numpy.ndarray) – Ground-truth bounding boxes labels with shape B, M.

  • gt_difficults (mxnet.NDArray or numpy.ndarray, optional, default is None) – Ground-truth bounding boxes difficulty labels with shape B, M.
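
Once predictions have been matched to ground truths, average precision is computed from a precision-recall curve. The sketch below shows two common ways of doing so: the 11-point interpolation associated with the PASCAL VOC 2007 protocol, and the area under the interpolated curve. Both helper names are hypothetical, the inputs are assumed to be cumulative recall and precision values ordered by descending detection score, and the mapping of each variant to a particular metric class above is an assumption rather than something this page states.

```python
def ap_11_point(recall, precision):
    """Hypothetical helper: 11-point interpolated average precision."""
    total = 0.0
    for i in range(11):
        t = i / 10
        # Best precision among points whose recall reaches the threshold.
        candidates = [p for r, p in zip(recall, precision) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11


def ap_area(recall, precision):
    """Hypothetical helper: area under the interpolated P-R curve."""
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [0.0] + list(precision) + [0.0]
    # Make precision non-increasing when read from left to right.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas; segments where recall is unchanged add zero.
    return sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]
               for i in range(len(mrec) - 1))
```

The two variants generally give slightly different numbers for the same curve, so results computed with one should not be compared directly with results computed with the other.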