gluoncv.data

This module provides data loaders and transformers for popular vision datasets.

Hint

Please refer to Prepare Datasets for the description of the datasets listed in this page, and how to download and extract them.

Hint

For small datasets such as MNIST and CIFAR10, please refer to Gluon Vision Datasets, which can be used directly without any downloading step.

ImageNet

gluoncv.data.ImageNet Load the ImageNet classification dataset.

Pascal VOC

gluoncv.data.VOCDetection Pascal VOC detection Dataset.
gluoncv.data.VOCSegmentation Pascal VOC Semantic Segmentation Dataset.
gluoncv.data.VOCAugSegmentation Pascal VOC Augmented Semantic Segmentation Dataset.

COCO

gluoncv.data.COCODetection MS COCO detection dataset.

ADE20K

gluoncv.data.ADE20KSegmentation ADE20K Semantic Segmentation Dataset.

API Reference

class gluoncv.data.ImageNet(root='~/.mxnet/datasets/imagenet', train=True, transform=None)[source]

Load the ImageNet classification dataset.

Refer to Prepare the ImageNet dataset for the description of this dataset and how to prepare it.

Parameters:
  • root (str, default '~/.mxnet/datasets/imagenet') – Path to the folder storing the dataset.
  • train (bool, default True) – Whether to load the training or validation set.
  • transform (function, default None) – A function that takes data and label and transforms them. Refer to ./transforms for examples.
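
Examples

The snippet below is a minimal usage sketch; it assumes the ImageNet dataset has already been prepared under the default root as described above, and the batch size and normalization values are illustrative only.

>>> import gluoncv
>>> from mxnet import gluon
>>> from mxnet.gluon.data.vision import transforms
>>> # Transforms for Normalization (applied to the image only)
>>> input_transform = transforms.Compose([
>>>     transforms.ToTensor(),
>>>     transforms.Normalize([.485, .456, .406], [.229, .224, .225]),
>>> ])
>>> # The constructor transform receives (data, label); apply the image transform to data
>>> def train_transform(data, label):
>>>     return input_transform(data), label
>>> # Create Dataset
>>> trainset = gluoncv.data.ImageNet(train=True, transform=train_transform)
>>> # Create Training Loader
>>> train_data = gluon.data.DataLoader(
>>>     trainset, 32, shuffle=True, last_batch='rollover',
>>>     num_workers=4)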
class gluoncv.data.VOCDetection(root='~/.mxnet/datasets/voc', splits=((2007, 'trainval'), (2012, 'trainval')), transform=None, index_map=None, preload_label=True)[source]

Pascal VOC detection Dataset.

Parameters:
  • root (str, default '~/.mxnet/datasets/voc') – Path to folder storing the dataset.
  • splits (list of tuples, default ((2007, 'trainval'), (2012, 'trainval'))) – List of combinations of (year, name). For years, candidates can be: 2007, 2012. For names, candidates can be: ‘train’, ‘val’, ‘trainval’, ‘test’.
  • transform (callable, default None) –

    A function that takes data and label and transforms them. Refer to ./transforms for examples.

    A transform function for object detection should take label into consideration, because any geometric modification will require label to be modified.

  • index_map (dict, default None) – By default, the 20 classes are mapped to indices from 0 to 19. You can customize this by providing a str-to-int dict specifying how to map class names to indices. For advanced users only, when you want to swap the order of class labels.
  • preload_label (bool, default True) – If True, parse and load all labels into memory during initialization. This often accelerates loading but requires more memory; typical preloaded labels take tens of MB. You only need to disable it when your dataset is extremely large.
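
Examples

The snippet below is a minimal usage sketch; it assumes the Pascal VOC data has already been extracted under the default root, and the comments describe the usual per-object bounding-box layout of the labels.

>>> import gluoncv
>>> # Create Dataset from the VOC2007 and VOC2012 trainval splits (the default)
>>> trainset = gluoncv.data.VOCDetection(
>>>     splits=((2007, 'trainval'), (2012, 'trainval')))
>>> print('Num of training images:', len(trainset))
>>> # Each sample is a pair (image, label); the label contains one row per object
>>> # (bounding box coordinates followed by the class index)
>>> image, label = trainset[0]
>>> print(image.shape, label.shape)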
class gluoncv.data.VOCSegmentation(root='~/.mxnet/datasets/voc', split='train', mode=None, transform=None)[source]

Pascal VOC Semantic Segmentation Dataset.

Parameters:
  • root (string) – Path to VOCdevkit folder. Default is ‘$(HOME)/.mxnet/datasets/voc’
  • split (string) – ‘train’, ‘val’ or ‘test’
  • transform (callable, optional) – A function that transforms the image

Examples

>>> import gluoncv
>>> from mxnet import gluon
>>> from mxnet.gluon.data.vision import transforms
>>> # Transforms for Normalization
>>> input_transform = transforms.Compose([
>>>     transforms.ToTensor(),
>>>     transforms.Normalize([.485, .456, .406], [.229, .224, .225]),
>>> ])
>>> # Create Dataset
>>> trainset = gluoncv.data.VOCSegmentation(split='train', transform=input_transform)
>>> # Create Training Loader
>>> train_data = gluon.data.DataLoader(
>>>     trainset, 4, shuffle=True, last_batch='rollover',
>>>     num_workers=4)
class gluoncv.data.VOCAugSegmentation(root='~/.mxnet/datasets/voc', split='train', mode=None, transform=None)[source]

Pascal VOC Augmented Semantic Segmentation Dataset.

Parameters:
  • root (string) – Path to VOCdevkit folder. Default is ‘$(HOME)/.mxnet/datasets/voc’
  • split (string) – ‘train’ or ‘val’
  • transform (callable, optional) – A function that transforms the image

Examples

>>> import gluoncv
>>> from mxnet import gluon
>>> from mxnet.gluon.data.vision import transforms
>>> # Transforms for Normalization
>>> input_transform = transforms.Compose([
>>>     transforms.ToTensor(),
>>>     transforms.Normalize([.485, .456, .406], [.229, .224, .225]),
>>> ])
>>> # Create Dataset
>>> trainset = gluoncv.data.VOCAugSegmentation(split='train', transform=input_transform)
>>> # Create Training Loader
>>> train_data = gluon.data.DataLoader(
>>>     trainset, 4, shuffle=True, last_batch='rollover',
>>>     num_workers=4)
class gluoncv.data.COCODetection(root='~/.mxnet/datasets/coco', splits=('instances_val2017', ), transform=None, min_object_area=0, skip_empty=True)[source]

MS COCO detection dataset.

Parameters:
  • root (str, default '~/.mxnet/datasets/coco') – Path to folder storing the dataset.
  • splits (list of str, default ['instances_val2017']) – JSON annotation names. Candidates can be: instances_val2017, instances_train2017.
  • transform (callable, default None) –

    A function that takes data and label and transforms them. Refer to ./transforms for examples.

    A transform function for object detection should take label into consideration, because any geometric modification will require label to be modified.

  • min_object_area (float) – Minimum accepted ground-truth area; if an object’s area is smaller than this value, it will be ignored.
  • skip_empty (bool, default is True) – Whether to skip images with no valid objects. This should be True in training, otherwise it will cause undefined behavior.
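
Examples

The snippet below is a minimal usage sketch; it assumes the COCO images and annotations have already been extracted under the default root (and that pycocotools is installed), and the chosen split is illustrative.

>>> import gluoncv
>>> # Create Dataset from the train2017 annotations
>>> trainset = gluoncv.data.COCODetection(splits=('instances_train2017',))
>>> print('Num of training images:', len(trainset))
>>> # Each sample is a pair (image, label); the label contains one row per object
>>> # (bounding box coordinates followed by the class index)
>>> image, label = trainset[0]
>>> print(image.shape, label.shape)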
class gluoncv.data.ADE20KSegmentation(root='~/.mxnet/datasets/ade', split='train', mode=None, transform=None)[source]

ADE20K Semantic Segmentation Dataset.

Parameters:
  • root (string) – Path to ADE20K folder. Default is ‘$(HOME)/.mxnet/datasets/ade’
  • split (string) – ‘train’, ‘val’ or ‘test’
  • transform (callable, optional) – A function that transforms the image

Examples

>>> import gluoncv
>>> from mxnet import gluon
>>> from mxnet.gluon.data.vision import transforms
>>> # Transforms for Normalization
>>> input_transform = transforms.Compose([
>>>     transforms.ToTensor(),
>>>     transforms.Normalize([.485, .456, .406], [.229, .224, .225]),
>>> ])
>>> # Create Dataset
>>> trainset = gluoncv.data.ADE20KSegmentation(split='train', transform=input_transform)
>>> # Create Training Loader
>>> train_data = gluon.data.DataLoader(
>>>     trainset, 4, shuffle=True, last_batch='rollover',
>>>     num_workers=4)
class gluoncv.data.DetectionDataLoader(dataset, batch_size=None, shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0)[source]

Data loader for detection dataset.

Deprecated since version 0.2.0: DetectionDataLoader is deprecated; please use mxnet.gluon.data.DataLoader directly with the batchify functions listed in gluoncv.data.batchify (see the example below).

It loads data batches from a dataset and then applies data transformations. It is a subclass of mxnet.gluon.data.DataLoader and therefore has a very similar API.

The main purpose of this DataLoader is to pad the variable-length labels from each image, because different images contain different numbers of objects.

Parameters:
  • dataset (mxnet.gluon.data.Dataset or numpy.ndarray or mxnet.ndarray.NDArray) – The source dataset.
  • batch_size (int) – The size of mini-batch.
  • shuffle (bool, default False) – Whether to randomly shuffle the samples. Typically use True for training datasets and False for validation/test datasets.
  • sampler (mxnet.gluon.data.Sampler, default None) – The sampler to use. We should either specify a sampler or enable shuffle, not both, because random shuffling is a sampling method.
  • last_batch ({'keep', 'discard', 'rollover'}, default is keep) –

    How to handle the last batch when the number of examples in the dataset is not evenly divisible by the batch size. There are three options to deal with the last batch if its size is smaller than the specified batch size.

    • keep: keep it
    • discard: throw it away
    • rollover: insert the examples to the beginning of the next batch
  • batch_sampler (mxnet.gluon.data.BatchSampler) – A sampler that returns mini-batches. Do not specify batch_size, shuffle, sampler, and last_batch if batch_sampler is specified.
  • batchify_fn (callable) –

    Callback function to allow users to specify how to merge samples into a batch. Defaults to gluoncv.data.dataloader.default_pad_batchify_fn():

    def default_pad_batchify_fn(data):
        if isinstance(data[0], nd.NDArray):
            # Elements share the same shape: simply stack them into a batch.
            return nd.stack(*data)
        elif isinstance(data[0], tuple):
            # Each sample is a tuple (e.g. (image, label)): batchify each field recursively.
            data = zip(*data)
            return [default_pad_batchify_fn(i) for i in data]
        else:
            # Variable-length labels: pad with -1 up to the longest label in the batch.
            data = np.asarray(data)
            pad = max([l.shape[0] for l in data])
            buf = np.full((len(data), pad, data[0].shape[-1]),
                          -1, dtype=data[0].dtype)
            for i, l in enumerate(data):
                buf[i][:l.shape[0], :] = l
            return nd.array(buf, dtype=data[0].dtype)
    
  • num_workers (int, default 0) – The number of multiprocessing workers to use for data preprocessing. If num_workers = 0, multiprocessing is disabled. Otherwise, num_workers multiprocessing workers are used to process data.
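
Examples

Since this class is deprecated, the snippet below is a minimal sketch of the recommended replacement, combining mxnet.gluon.data.DataLoader with the Tuple, Stack and Pad helpers from gluoncv.data.batchify; the dataset and batch size are illustrative.

>>> import gluoncv
>>> from mxnet import gluon
>>> from gluoncv.data.batchify import Tuple, Stack, Pad
>>> # Stack images and pad variable-length labels with -1
>>> # (in practice, apply a detection transform first so images in a batch share a shape)
>>> batchify_fn = Tuple(Stack(), Pad(pad_val=-1))
>>> trainset = gluoncv.data.VOCDetection(splits=((2007, 'trainval'),))
>>> train_data = gluon.data.DataLoader(
>>>     trainset, 4, shuffle=True, last_batch='rollover',
>>>     batchify_fn=batchify_fn, num_workers=4)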