gluoncv.utils

We provide a broad range of utility functions covering visualization, file handling, downloading, and training helpers.

Visualization

plot_image

Visualize image.

get_color_pallete

Apply a dataset color palette to a single-channel image for visualization.

plot_bbox

Visualize bounding boxes.

expand_mask

Expand instance segmentation mask to full image size.

plot_mask

Visualize segmentation mask.

plot_network

Plot network to visualize internal structures.

Miscellaneous

download

Download a given URL.

Parameters
  • url (str) – URL to download.

  • path (str, optional) – Destination path to store the downloaded file. By default, stores to the current directory with the same name as in url.

  • overwrite (bool, optional) – Whether to overwrite the destination file if it already exists.

  • sha1_hash (str, optional) – Expected sha1 hash in hexadecimal digits. An existing file is ignored when a hash is specified but doesn’t match.

makedirs

Create a directory recursively if it does not exist.

seed

Seed the generators for Python's built-in random, numpy.random, and mxnet.random.

Training Helpers

LRScheduler

Learning Rate Scheduler

set_lr_mult

Reset lr_mult to new value for all parameters that match pattern

Bounding Box Utils

bbox_iou

Calculate Intersection-Over-Union (IOU) of two bounding boxes.

API Reference

GluonCV Utility functions.

class gluoncv.utils.LRScheduler(mode, base_lr=0.1, target_lr=0, niters=0, nepochs=0, iters_per_epoch=0, offset=0, power=2, step_iter=None, step_epoch=None, step_factor=0.1, baselr=None, targetlr=None)[source]

Learning Rate Scheduler

Parameters
  • mode (str) – Modes for learning rate scheduler. Currently it supports ‘constant’, ‘step’, ‘linear’, ‘poly’ and ‘cosine’.

  • base_lr (float) – Base learning rate, i.e. the starting learning rate.

  • target_lr (float) – Target learning rate, i.e. the ending learning rate. With constant mode target_lr is ignored.

  • niters (int) – Number of iterations to be scheduled.

  • nepochs (int) – Number of epochs to be scheduled.

  • iters_per_epoch (int) – Number of iterations in each epoch.

  • offset (int) – Number of iterations before this scheduler.

  • power (float) – Power parameter of poly scheduler.

  • step_iter (list) – A list of iterations to decay the learning rate.

  • step_epoch (list) – A list of epochs to decay the learning rate.

  • step_factor (float) – Learning rate decay factor.
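To make the modes concrete, the sketch below computes the learning rate at a given iteration for the ‘constant’, ‘linear’, ‘poly’ and ‘cosine’ modes. It is a standalone illustration, not the library's code: the helper name `scheduled_lr` is hypothetical, and the formulas (interpolating from base_lr to target_lr with a decay factor) are assumptions based on the common definitions of these schedules.

```python
import math


def scheduled_lr(mode, base_lr, target_lr, niters, t, power=2):
    """Hypothetical helper: learning rate at iteration t, with 0 <= t <= niters."""
    if mode == "constant":
        return base_lr  # target_lr is ignored in constant mode
    progress = t / niters
    if mode == "linear":
        factor = 1.0 - progress
    elif mode == "poly":
        factor = (1.0 - progress) ** power
    elif mode == "cosine":
        factor = (1.0 + math.cos(math.pi * progress)) / 2.0
    else:
        raise ValueError("unsupported mode: " + mode)
    # Assumed interpolation: start at base_lr, end at target_lr.
    return target_lr + (base_lr - target_lr) * factor


for mode in ("linear", "poly", "cosine"):
    values = [scheduled_lr(mode, 0.1, 0.0, 100, t) for t in (0, 50, 100)]
    print(mode, [round(v, 4) for v in values])
```

Under these assumed formulas, every non-constant mode starts at base_lr and reaches target_lr after niters iterations; the modes differ only in how quickly they decay in between.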

class gluoncv.utils.LRSequential(schedulers)[source]

Compose Learning Rate Schedulers

Parameters

schedulers (list) – list of LRScheduler objects

class gluoncv.utils.TrainingHistory(labels)[source]

Training History Record and Plot

Parameters

labels (list of str) – List of names of the labels in the history.

plot(labels=None, colors=None, y_lim=(0, 1), save_path=None, legend_loc='upper right')[source]

Plot the training history

Parameters
  • labels (list of str) – List of label names to plot.

  • colors (list of str) – List of line colors.

  • save_path (str) – Path to save the plot. If None, the plot is shown on screen.

  • legend_loc (str) – Location of the legend; 'upper right' by default.

update(values)[source]

Update the training history

Parameters

values (list of float) – List of metric scores for each label.

gluoncv.utils.bbox_iou(bbox_a, bbox_b, offset=0)[source]

Calculate Intersection-Over-Union (IOU) of two bounding boxes.

Parameters
  • bbox_a (numpy.ndarray) – An ndarray with shape \((N, 4)\).

  • bbox_b (numpy.ndarray) – An ndarray with shape \((M, 4)\).

  • offset (float or int, default is 0) – The offset controls whether the width (or height) is computed as (right - left + offset). Note that the offset must be 0 for normalized bboxes, whose ranges are in [0, 1].

Returns

An ndarray with shape \((N, M)\) giving the IOU between each pair of bounding boxes in bbox_a and bbox_b.

Return type

numpy.ndarray
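The pairwise computation can be sketched in plain Python as follows. This is an illustration only, not the library's implementation: it uses nested lists instead of numpy arrays, the helper name `pairwise_iou` is hypothetical, and the box layout (xmin, ymin, xmax, ymax) is an assumption.

```python
def pairwise_iou(boxes_a, boxes_b, offset=0):
    """IOU between every box in boxes_a (N boxes) and boxes_b (M boxes).

    Each box is assumed to be (xmin, ymin, xmax, ymax).
    Returns an N x M nested list.
    """
    result = []
    for ax1, ay1, ax2, ay2 in boxes_a:
        area_a = (ax2 - ax1 + offset) * (ay2 - ay1 + offset)
        row = []
        for bx1, by1, bx2, by2 in boxes_b:
            area_b = (bx2 - bx1 + offset) * (by2 - by1 + offset)
            # Width and height of the overlap, clamped at zero when disjoint.
            inter_w = max(0, min(ax2, bx2) - max(ax1, bx1) + offset)
            inter_h = max(0, min(ay2, by2) - max(ay1, by1) + offset)
            inter = inter_w * inter_h
            row.append(inter / (area_a + area_b - inter))
        result.append(row)
    return result


iou = pairwise_iou([(0, 0, 2, 2)], [(1, 1, 3, 3), (5, 5, 6, 6)])
print(iou)
```

Here the first pair overlaps in a 1 x 1 square, so its IOU is 1 / (4 + 4 - 1), while the second pair is disjoint and scores 0.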

gluoncv.utils.check_sha1(filename, sha1_hash)[source]

Check whether the sha1 hash of the file content matches the expected hash.

Parameters
  • filename (str) – Path to the file.

  • sha1_hash (str) – Expected sha1 hash in hexadecimal digits.

Returns

Whether the file content matches the expected hash.

Return type

bool
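A check of this kind can be written with the standard library alone. The sketch below is an assumption about how such a check is typically done, not the library's code; the helper name `sha1_matches` and the chunk size are hypothetical.

```python
import hashlib
import os
import tempfile


def sha1_matches(filename, sha1_hash, chunk_size=1 << 20):
    """Return True if the file's sha1 digest equals the expected hex string."""
    sha1 = hashlib.sha1()
    with open(filename, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sha1.update(chunk)
    return sha1.hexdigest() == sha1_hash.lower()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"hello")
    expected = hashlib.sha1(b"hello").hexdigest()
    ok = sha1_matches(path, expected)
    bad = sha1_matches(path, "0" * 40)

print(ok, bad)
```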

gluoncv.utils.check_version(min_version, warning_only=False)[source]

Check that the installed version of gluoncv satisfies the provided minimum version. An exception is raised if the check does not pass.

Parameters
  • min_version (str) – Minimum version

  • warning_only (bool) – Print a warning instead of raising an exception.

gluoncv.utils.download(url, path=None, overwrite=False, sha1_hash=None)[source]

Download a given URL.

Parameters
  • url (str) – URL to download.

  • path (str, optional) – Destination path to store the downloaded file. By default, stores to the current directory with the same name as in url.

  • overwrite (bool, optional) – Whether to overwrite the destination file if it already exists.

  • sha1_hash (str, optional) – Expected sha1 hash in hexadecimal digits. An existing file is ignored when a hash is specified but doesn’t match.

Returns

The file path of the downloaded file.

Return type

str

gluoncv.utils.export_block(path, block, data_shape=None, epoch=0, preprocess=True, layout='HWC', ctx=cpu(0))[source]

Helper function to export a HybridBlock to symbol JSON to be used by SymbolBlock.imports, mxnet.mod.Module or the C++ interface.

Parameters
  • path (str) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4 digits epoch number.

  • block (mxnet.gluon.HybridBlock) – The hybridizable block. Note that normal gluon.Block is not supported.

  • data_shape (tuple of int, default is None) – Fake data shape used only for export, in format (H, W, C) for 2D data or (T, H, W, C) for 3D data. If you don’t specify data_shape, export_block will try some common data shapes, e.g., (224, 224, 3), (256, 256, 3), (299, 299, 3), (512, 512, 3)… If any of these shapes goes through, the export will succeed.

  • epoch (int) – Epoch number of saved model.

  • preprocess (mxnet.gluon.HybridBlock, default is True) – Preprocess block prior to the network. By default (True), it will subtract mean [123.675, 116.28, 103.53], divide by std [58.395, 57.12, 57.375], and convert the original image (B, H, W, C and range [0, 255]) to a tensor (B, C, H, W) as network input. This is the default preprocess behavior of all GluonCV pre-trained models. You can use a custom preprocess hybrid block, or disable preprocessing by setting preprocess=None.

  • layout (str, default is 'HWC') – The layout for raw input data. By default is HWC. Supports ‘HWC’, ‘CHW’, ‘THWC’ and ‘CTHW’. Note that image channel order is always RGB.

  • ctx (mx.Context, default mx.cpu()) – Network context.

Returns

Return type

None
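The default preprocessing described for the preprocess parameter amounts to a per-channel normalization followed by a layout change. The sketch below illustrates that arithmetic on nested lists for a single image; it is not the exported block itself, and the helper names `normalize_pixel` and `hwc_to_chw` are hypothetical.

```python
MEAN = (123.675, 116.28, 103.53)  # per-channel mean, RGB order
STD = (58.395, 57.12, 57.375)     # per-channel std, RGB order


def normalize_pixel(rgb):
    """Subtract the mean and divide by the std for one RGB pixel in [0, 255]."""
    return tuple((value - m) / s for value, m, s in zip(rgb, MEAN, STD))


def hwc_to_chw(image):
    """Reorder a nested list indexed [h][w][c] into one indexed [c][h][w]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# A 1 x 2 image (H=1, W=2) with three channels per pixel.
image = [[[1, 2, 3], [4, 5, 6]]]
print(normalize_pixel(MEAN))
print(hwc_to_chw(image))
```

A pixel equal to the mean maps to zero in every channel, and after the layout change each channel becomes its own H x W plane.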

gluoncv.utils.export_tvm(path, block, data_shape, epoch=0, preprocess=True, layout='HWC', ctx=cpu(0), target='llvm', opt_level=3, use_autotvm=False)[source]

Helper function to export a HybridBlock to a TVM executable. Note that the tvm package needs to be installed (https://tvm.ai/).

Parameters
  • path (str) – Path to save model. Three files path_deploy_lib.tar, path_deploy_graph.json and path_deploy_xxxx.params will be created, where xxxx is the 4 digits epoch number.

  • block (mxnet.gluon.HybridBlock) – The hybridizable block. Note that normal gluon.Block is not supported.

  • data_shape (tuple of int, required) – Unlike export_block, data_shape is required here for the purpose of optimization. If dynamic shape is required, you can use the shape that most fits the inference tasks, but the optimization won’t accommodate all situations.

  • epoch (int) – Epoch number of saved model.

  • preprocess (mxnet.gluon.HybridBlock, default is True) – Preprocess block prior to the network. By default (True), it will subtract mean [123.675, 116.28, 103.53], divide by std [58.395, 57.12, 57.375], and convert the original image (B, H, W, C and range [0, 255]) to a tensor (B, C, H, W) as network input. This is the default preprocess behavior of all GluonCV pre-trained models. You can use a custom preprocess hybrid block, or disable preprocessing by setting preprocess=None.

  • layout (str, default is 'HWC') – The layout for raw input data. By default is HWC. Supports ‘HWC’ and ‘CHW’. Note that image channel order is always RGB.

  • ctx (mx.Context, default mx.cpu()) – Network context.

  • target (str, default is 'llvm') – Runtime type for code generation, can be (‘llvm’, ‘cuda’, ‘opencl’, ‘metal’…)

  • opt_level (int, default is 3) – TVM optimization level. If supported, a higher opt_level may generate a more efficient runtime library; however, some operators may not support high-level optimization and will fall back to a lower opt_level.

  • use_autotvm (bool, default is False) – Use autotvm for performance tuning. Note that this can take a very long time, since it’s a search and model based tuning process.

Returns

Return type

None

gluoncv.utils.freeze_bn(net, use_global_stats=True)[source]

Freeze BatchNorm layers by setting use_global_stats to True

Parameters
  • net (mxnet.gluon.Block) – The network whose BatchNorm layers are going to be modified

  • use_global_stats (bool) – The value of use_global_stats to set for all BatchNorm layers

Returns

Original network with BatchNorm layers modified.

Return type

mxnet.gluon.Block

gluoncv.utils.makedirs(path)[source]

Create a directory recursively if it does not exist. Similar to mkdir -p; you can skip checking for existence before calling this function.

Parameters

path (str) – Path of the desired dir
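The same mkdir -p behaviour is available from the Python standard library. The sketch below shows the idea with `os.makedirs`; the wrapper name `make_dirs` is hypothetical and this is not necessarily how the library implements it.

```python
import os
import tempfile


def make_dirs(path):
    """Create path and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b", "c")
    make_dirs(target)
    make_dirs(target)  # calling again on an existing directory is harmless
    created = os.path.isdir(target)

print(created)
```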

gluoncv.utils.recursive_visit(net, callback, **kwargs)[source]

Recursively visit and apply callback to a net and its sub-net

Parameters
  • net (mxnet.gluon.Block) – The network to recursively visit

  • callback (function) – The callback function to apply to each net block. Its first argument needs to be the block
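The traversal pattern can be illustrated without MXNet. In the sketch below, `ToyBlock` is a hypothetical stand-in for a network block with child blocks, and `visit` applies a callback to the block and then to each descendant; the pre-order visiting sequence is an assumption, not a documented guarantee of recursive_visit.

```python
class ToyBlock:
    """Hypothetical stand-in for a network block that holds child blocks."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def visit(block, callback, **kwargs):
    """Apply callback to block, then recurse into its children."""
    callback(block, **kwargs)
    for child in block.children:
        visit(child, callback, **kwargs)


net = ToyBlock("net", [
    ToyBlock("features", [ToyBlock("conv"), ToyBlock("bn")]),
    ToyBlock("output"),
])

names = []
# The callback's first argument is the block; extra kwargs are forwarded.
visit(net, lambda block, prefix: names.append(prefix + block.name), prefix="seen:")
print(names)
```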

gluoncv.utils.set_lr_mult(net, pattern, mult=1.0, verbose=False)[source]

Reset lr_mult to new value for all parameters that match pattern

Parameters
  • net (mxnet.gluon.Block) – The network whose parameters are going to be adjusted.

  • pattern (str) – Regex matching pattern for targeting parameters.

  • mult (float, default 1.0) – The new learning rate multiplier.

  • verbose (bool) – Print which parameters are being modified if True.

Returns

Original network with learning rate multipliers modified.

Return type

mxnet.gluon.Block
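The pattern-based selection can be sketched with the standard `re` module. The parameter names below are made up for illustration, a plain dict stands in for the network's parameters, and the use of `re.search` is an assumption; how the library anchors the match is not stated here.

```python
import re


def matching_names(param_names, pattern):
    """Return the parameter names that match the regular expression."""
    regex = re.compile(pattern)
    return [name for name in param_names if regex.search(name)]


# Hypothetical parameter names, each starting with multiplier 1.0.
params = ["conv0_weight", "conv0_bias", "dense0_weight", "dense0_bias"]
lr_mult = {name: 1.0 for name in params}

# Double the learning rate multiplier of every bias parameter.
for name in matching_names(params, r".*_bias$"):
    lr_mult[name] = 2.0

print(lr_mult)
```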

gluoncv.utils.split_and_load(data, ctx_list, batch_axis=0, even_split=True, multiplier=1)[source]

Splits an NDArray into len(ctx_list) slices along batch_axis and loads each slice to one context in ctx_list.

Parameters
  • data (NDArray) – A batch of data.

  • ctx_list (list of Context) – A list of Contexts.

  • batch_axis (int, default 0) – The axis along which to slice.

  • even_split (bool, default True) – Whether to force all slices to have the same number of elements.

  • multiplier (int, default 1) – The batch size has to be a multiple of the channel multiplier. Needs further investigation.

Returns

Each corresponds to a context in ctx_list.

Return type

list of NDArray

gluoncv.utils.split_data(data, num_slice, batch_axis=0, even_split=True, multiplier=1)[source]

Splits an NDArray into num_slice slices along batch_axis. Usually used for data parallelism, where each slice is sent to one device (e.g. a GPU).

Parameters
  • data (NDArray) – A batch of data.

  • num_slice (int) – Number of desired slices.

  • batch_axis (int, default 0) – The axis along which to slice.

  • even_split (bool, default True) – Whether to force all slices to have the same number of elements. If True, an error will be raised when num_slice does not evenly divide data.shape[batch_axis].

  • multiplier (int, default 1) – The batch size has to be a multiple of multiplier.

Returns

Return value is a list even if num_slice is 1.

Return type

list of NDArray
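The slicing logic can be illustrated on a plain Python list. This sketch is not the library's implementation: the helper name `split_batch` is hypothetical, the multiplier parameter is left out, and giving the remainder to the last slice when even_split is False is an assumption.

```python
def split_batch(batch, num_slice, even_split=True):
    """Split a list into num_slice consecutive slices along its first axis."""
    size = len(batch)
    if size < num_slice:
        raise ValueError("batch is too small for %d slices" % num_slice)
    if even_split and size % num_slice != 0:
        raise ValueError(
            "batch of size %d cannot be evenly split into %d slices"
            % (size, num_slice)
        )
    step = size // num_slice
    slices = []
    for i in range(num_slice):
        start = i * step
        # The last slice absorbs any remainder when the split is uneven.
        end = (i + 1) * step if i < num_slice - 1 else size
        slices.append(batch[start:end])
    return slices


print(split_batch(list(range(8)), 4))
print(split_batch(list(range(10)), 4, even_split=False))
```

As in the documented behaviour, the result is a list even when num_slice is 1.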

gluoncv.utils.try_import_cv2()[source]

Try import cv2 at runtime.

Returns

The cv2 module if found. Raises ImportError otherwise.

gluoncv.utils.try_import_dali()[source]

Try import NVIDIA DALI at runtime.

Visualization tools

class gluoncv.utils.viz.DeNormalize(mean, std)[source]

Denormalize the image

hybrid_forward(F, x)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

gluoncv.utils.viz.cv_merge_two_images(img1, img2, alpha=0.5, size=None)[source]

Merge two images with OpenCV.

Parameters
  • img1 (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • img2 (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • alpha (float, optional, default 0.5) – Transparency of img2

  • size (list, optional, default None) – The output size of the merged image

Returns

The merged image

Return type

numpy.ndarray

gluoncv.utils.viz.cv_plot_bbox(img, bboxes, scores=None, labels=None, thresh=0.5, class_names=None, colors=None, absolute_coordinates=True, scale=1.0, linewidth=2)[source]

Visualize bounding boxes with OpenCV.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • labels (numpy.ndarray or mxnet.nd.NDArray, optional) – Class labels of the provided bboxes with shape N.

  • thresh (float, optional, default 0.5) – Display threshold if scores is provided. Boxes with scores below thresh are not displayed; this is visually cleaner when there are many bounding boxes with very small scores.

  • class_names (list of str, optional) – Names of the classes, used to label each box in the display.

  • colors (dict, optional) – You can provide desired colors as {0: (255, 0, 0), 1:(0, 255, 0), …}, otherwise random colors will be substituted.

  • absolute_coordinates (bool) – If True, coordinates are treated as absolute; otherwise they are interpreted as normalized to the range [0, 1].

  • scale (float) – The scale of output image, which may affect the positions of boxes

  • linewidth (int, optional, default 2) – Line thickness for bounding boxes. Use negative values to fill the bounding boxes.

Returns

The image with detected results.

Return type

numpy.ndarray

gluoncv.utils.viz.cv_plot_image(img, scale=1, upperleft_txt=None, upperleft_txt_corner=(10, 100), left_txt_list=None, left_txt_corner=(10, 150), title_txt_list=None, title_txt_corner=(500, 50), canvas_name='demo')[source]

Visualize image with OpenCV.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • scale (float) – The scaling factor of the output image

  • upperleft_txt (str, optional, default is None) – If present, the string is printed at the upper-left corner.

  • upperleft_txt_corner (tuple, optional, default is (10, 100)) – The bottom-left corner of upperleft_txt.

  • left_txt_list (list of str, optional, default is None) – If present, each string in the list is printed close to the left.

  • left_txt_corner (tuple, optional, default is (10, 150)) – The bottom-left corner of left_txt_list.

  • title_txt_list (list of str, optional, default is None) – If present, each string in the list is printed close to the top.

  • title_txt_corner (tuple, optional, default is (500, 50)) – The bottom-left corner of title_txt_list.

  • canvas_name (str, optional, default is 'demo') – The name of the canvas to plot the image

Examples

from matplotlib import pyplot as plt
ax = plot_image(img)
plt.show()

gluoncv.utils.viz.cv_plot_keypoints(img, coords, confidence, class_ids, bboxes, scores, box_thresh=0.5, keypoint_thresh=0.2, scale=1.0, **kwargs)[source]

Visualize keypoints with OpenCV.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • coords (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 2.

  • confidence (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 1.

  • class_ids (numpy.ndarray or mxnet.nd.NDArray) – Class IDs.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • box_thresh (float, optional, default 0.5) – Display threshold if scores is provided. Boxes with scores below box_thresh are not displayed.

  • keypoint_thresh (float, optional, default 0.2) – Keypoints with confidence less than keypoint_thresh will be ignored in display.

  • scale (float) – The scale of output image, which may affect the positions of boxes

Returns

The image with estimated pose.

Return type

numpy.ndarray

gluoncv.utils.viz.expand_mask(masks, bboxes, im_shape, scores=None, thresh=0.5, scale=1.0, sortby=None)[source]

Expand instance segmentation mask to full image size.

Parameters
  • masks (numpy.ndarray or mxnet.nd.NDArray) – Binary images with shape N, M, M

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes

  • im_shape (tuple) – Tuple of length 2: (width, height)

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • thresh (float, optional, default 0.5) – Display threshold if scores is provided. Boxes with scores below thresh are not displayed; this is visually cleaner when there are many bounding boxes with very small scores.

  • sortby (str, optional, default None) – If not None, sort the color palette for masks by the given attributes of each bounding box. Valid inputs are ‘area’, ‘xmin’, ‘ymin’, ‘xmax’, ‘ymax’.

  • scale (float) – The scale of output image, which may affect the positions of boxes

Returns

  • numpy.ndarray – Binary images with shape N, height, width

  • numpy.ndarray – Index array of sorted masks

gluoncv.utils.viz.get_color_pallete(npimg, dataset='pascal_voc')[source]

Apply a dataset color palette to a single-channel image for visualization.

Parameters
  • npimg (numpy.ndarray) – Single channel image with shape H, W, 1.

  • dataset (str, default: 'pascal_voc') – The dataset the model was pretrained on (‘pascal_voc’ or ‘ade20k’).

Returns

out_img – Image with the color palette applied

Return type

PIL.Image

gluoncv.utils.viz.plot_bbox(img, bboxes, scores=None, labels=None, thresh=0.5, class_names=None, colors=None, ax=None, reverse_rgb=False, absolute_coordinates=True, linewidth=3.5, fontsize=12)[source]

Visualize bounding boxes.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • labels (numpy.ndarray or mxnet.nd.NDArray, optional) – Class labels of the provided bboxes with shape N.

  • thresh (float, optional, default 0.5) – Display threshold if scores is provided. Boxes with scores below thresh are not displayed; this is visually cleaner when there are many bounding boxes with very small scores.

  • class_names (list of str, optional) – Names of the classes, used to label each box in the display.

  • colors (dict, optional) – You can provide desired colors as {0: (255, 0, 0), 1:(0, 255, 0), …}, otherwise random colors will be substituted.

  • ax (matplotlib axes, optional) – You can reuse previous axes if provided.

  • reverse_rgb (bool, optional) – Reverse RGB<->BGR orders if True.

  • absolute_coordinates (bool) – If True, coordinates are treated as absolute; otherwise they are interpreted as normalized to the range [0, 1].

  • linewidth (float, optional, default 3.5) – Line thickness for bounding boxes.

  • fontsize (int, optional, default 12) – Font size for display of class labels and threshold.

Returns

The plotted axes.

Return type

matplotlib axes

gluoncv.utils.viz.plot_image(img, ax=None, reverse_rgb=False)[source]

Visualize image.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • ax (matplotlib axes, optional) – You can reuse previous axes if provided.

  • reverse_rgb (bool, optional) – Reverse RGB<->BGR orders if True.

Returns

The plotted axes.

Return type

matplotlib axes

Examples

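from matplotlib import pyplot as plt
from gluoncv.utils.viz import plot_image

# img: an image with shape H, W, 3 (numpy.ndarray or mxnet.nd.NDArray)
ax = plot_image(img)
plt.show()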

gluoncv.utils.viz.plot_keypoints(img, coords, confidence, class_ids, bboxes, scores, box_thresh=0.5, keypoint_thresh=0.2, **kwargs)[source]

Visualize keypoints.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • coords (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 2.

  • confidence (numpy.ndarray or mxnet.nd.NDArray) – Array with shape Batch, N_Joints, 1.

  • class_ids (numpy.ndarray or mxnet.nd.NDArray) – Class IDs.

  • bboxes (numpy.ndarray or mxnet.nd.NDArray) – Bounding boxes with shape N, 4. Where N is the number of boxes.

  • scores (numpy.ndarray or mxnet.nd.NDArray, optional) – Confidence scores of the provided bboxes with shape N.

  • box_thresh (float, optional, default 0.5) – Display threshold if scores is provided. Boxes with scores below box_thresh are not displayed.

  • keypoint_thresh (float, optional, default 0.2) – Keypoints with confidence less than keypoint_thresh will be ignored in display.

Returns

The plotted axes.

Return type

matplotlib axes

gluoncv.utils.viz.plot_mask(img, masks, alpha=0.5)[source]

Visualize segmentation mask.

Parameters
  • img (numpy.ndarray or mxnet.nd.NDArray) – Image with shape H, W, 3.

  • masks (numpy.ndarray or mxnet.nd.NDArray) – Binary images with shape N, H, W.

  • alpha (float, optional, default 0.5) – Transparency of plotted mask

Returns

The image plotted with segmentation masks

Return type

numpy.ndarray

gluoncv.utils.viz.plot_mxboard(block, logdir='./logs')[source]

Plot network to visualize internal structures.

Parameters
  • block (mxnet.gluon.HybridBlock) – A hybridizable network to be visualized.

  • logdir (str) – The directory to save.

gluoncv.utils.viz.plot_network(block, shape=(1, 3, 224, 224), save_prefix=None)[source]

Plot network to visualize internal structures.

Parameters
  • block (mxnet.gluon.HybridBlock) – A hybridizable network to be visualized.

  • shape (tuple of int) – Desired input shape, default is (1, 3, 224, 224).

  • save_prefix (str or None) – If not None, will save rendered pdf to disk with prefix.

Custom evaluation metrics

class gluoncv.utils.metrics.COCODetectionMetric(dataset, save_prefix, use_time=True, cleanup=False, score_thresh=0.05, data_shape=None, post_affine=None)[source]

Detection metric for COCO bbox task.

Parameters
  • dataset (instance of gluoncv.data.COCODetection) – The validation dataset.

  • save_prefix (str) – Prefix for the saved JSON results.

  • use_time (bool) – Append unique datetime string to created JSON file name if True.

  • cleanup (bool) – Remove created JSON file if True.

  • score_thresh (float) – Detection results with confidence scores smaller than score_thresh will be discarded before saving to results.

  • data_shape (tuple of int, default is None) – If data_shape is provided as (height, width), we will rescale bounding boxes when saving the predictions. This is helpful when SSD/YOLO box predictions cannot be rescaled conveniently. Note that the data_shape must be fixed for all validation images.

  • post_affine (a callable function with input signature (orig_w, orig_h, out_w, out_h)) – If not None, the bounding boxes will be affine transformed rather than simply scaled.

get()[source]

Get evaluation metrics.

reset()[source]

Resets the internal evaluation result to initial state.

update(pred_bboxes, pred_labels, pred_scores, *args, **kwargs)[source]

Update internal buffer with latest predictions. Note that the statistics are not available until you call self.get() to return the metrics.

Parameters
  • pred_bboxes (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes with shape B, N, 4. Where B is the size of mini-batch, N is the number of bboxes.

  • pred_labels (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes labels with shape B, N.

  • pred_scores (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes scores with shape B, N.

class gluoncv.utils.metrics.COCOKeyPointsMetric(dataset, save_prefix, use_time=True, cleanup=False, in_vis_thresh=0.2, data_shape=None)[source]

Evaluation metric for the COCO keypoint task.

Parameters
  • dataset (instance of gluoncv.data.COCODetection) – The validation dataset.

  • save_prefix (str) – Prefix for the saved JSON results.

  • use_time (bool) – Append unique datetime string to created JSON file name if True.

  • cleanup (bool) – Remove created JSON file if True.

  • in_vis_thresh (float) – Detection results with confidence scores smaller than in_vis_thresh will be discarded before saving to results.

  • data_shape (tuple of int, default is None) – If data_shape is provided as (height, width), we will rescale bounding boxes when saving the predictions. This is helpful when SSD/YOLO box predictions cannot be rescaled conveniently. Note that the data_shape must be fixed for all validation images.

get()[source]

Get evaluation metrics.

reset()[source]

Resets the internal evaluation result to initial state.

update(preds, maxvals, score, imgid, *args, **kwargs)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

class gluoncv.utils.metrics.HeatmapAccuracy(axis=1, name='heatmap_accuracy', hm_type='gaussian', threshold=0.5, output_names=None, label_names=None, ignore_labels=None)[source]

Computes heatmap accuracy for keypoints.

Parameters
  • axis (int, default=1) – The axis that represents classes.

  • name (str) – Name of this metric instance for display.

  • output_names – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • ignore_labels (int or iterable of integers, optional) – If provided as not None, will ignore these labels during update.

Examples

>>> import mxnet as mx
>>> predicts = [mx.nd.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.nd.array([0, 1, 1])]
>>> acc = mx.metric.Accuracy()
>>> acc.update(preds = predicts, labels = labels)
>>> print(acc.get())
('accuracy', 0.6666666666666666)

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data with class indices as values, one per sample.

  • preds – Prediction values for samples. Each prediction value can either be the class index, or a vector of likelihoods for all classes.

class gluoncv.utils.metrics.SegmentationMetric(nclass)[source]

Computes pixAcc and mIoU metric scores

get()[source]

Gets the current evaluation result.

Returns

metrics – pixAcc and mIoU

Return type

tuple of float

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (‘NDArray’ or list of NDArray) – The labels of the data.

  • preds (‘NDArray’ or list of NDArray) – Predicted values.
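For reference, pixel accuracy and mean IoU can be computed from flattened label and prediction sequences as sketched below. This follows the standard definitions and is not the library's implementation: the helper name `pixacc_miou` is hypothetical, there is no handling of ignored labels, and classes absent from both inputs are skipped when averaging, which is an assumption.

```python
def pixacc_miou(labels, preds, nclass):
    """Return (pixAcc, mIoU) for two equal-length sequences of class indices."""
    correct = 0
    inter = [0] * nclass  # per-class intersection counts
    union = [0] * nclass  # per-class union counts
    for y, p in zip(labels, preds):
        if y == p:
            correct += 1
            inter[y] += 1
            union[y] += 1
        else:
            # A mismatch adds to the union of both classes involved.
            union[y] += 1
            union[p] += 1
    pix_acc = correct / len(labels)
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return pix_acc, sum(ious) / len(ious)


pix_acc, miou = pixacc_miou([0, 0, 1, 1], [0, 1, 1, 1], nclass=2)
print(pix_acc, miou)
```

In this example three of four pixels are correct, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean IoU is 7/12.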

class gluoncv.utils.metrics.VOC07MApMetric(*args, **kwargs)[source]

Mean average precision metric for PASCAL VOC 07 dataset

Parameters
  • iou_thresh (float) – IOU overlap threshold for TP.

  • class_names (list of str) – Optional; if provided, will print out AP for each class.
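The PASCAL VOC 2007 challenge scored detections with an 11-point interpolated average precision. The sketch below shows that computation from a recall/precision curve, assuming this metric follows the same convention; the helper name `voc07_ap` is hypothetical and this is not the library's code.

```python
def voc07_ap(recall, precision):
    """11-point interpolated AP from matching recall and precision lists."""
    ap = 0.0
    for i in range(11):
        threshold = i / 10.0  # recall levels 0.0, 0.1, ..., 1.0
        # Best precision among points whose recall reaches the threshold.
        candidates = [p for r, p in zip(recall, precision) if r >= threshold]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


ap = voc07_ap(recall=[0.5, 1.0], precision=[1.0, 0.5])
print(ap)
```

For this two-point curve, the six recall levels up to 0.5 see precision 1.0 and the remaining five see 0.5, so the AP is (6 × 1.0 + 5 × 0.5) / 11.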

class gluoncv.utils.metrics.VOCMApMetric(iou_thresh=0.5, class_names=None)[source]

Calculate mean AP for object detection task

Parameters
  • iou_thresh (float) – IOU overlap threshold for TP.

  • class_names (list of str) – Optional; if provided, will print out AP for each class.

get()[source]

Get the current evaluation result.

Returns

  • name (str) – Name of the metric.

  • value (float) – Value of the evaluation.

reset()[source]

Clear the internal statistics to initial state.

update(pred_bboxes, pred_labels, pred_scores, gt_bboxes, gt_labels, gt_difficults=None)[source]

Update internal buffer with latest prediction and gt pairs.

Parameters
  • pred_bboxes (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes with shape B, N, 4. Where B is the size of mini-batch, N is the number of bboxes.

  • pred_labels (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes labels with shape B, N.

  • pred_scores (mxnet.NDArray or numpy.ndarray) – Prediction bounding boxes scores with shape B, N.

  • gt_bboxes (mxnet.NDArray or numpy.ndarray) – Ground-truth bounding boxes with shape B, M, 4. Where B is the size of mini-batch, M is the number of ground-truths.

  • gt_labels (mxnet.NDArray or numpy.ndarray) – Ground-truth bounding boxes labels with shape B, M.

  • gt_difficults (mxnet.NDArray or numpy.ndarray, optional, default is None) – Ground-truth bounding boxes difficulty labels with shape B, M.