Apache MXNet Tutorials¶
Interested in a new computer vision (CV) area? The tutorials below will help you get started.
Note: For image classification or object detection tasks, also consider the tutorials for AutoGluon MultiModalPredictor, which offers better support through PyTorch.
Image Classification¶
Basics on how to use pre-trained models on CIFAR10 and apply them to real images
Hands on classification model training on CIFAR10
Basics on how to use pre-trained models on ImageNet and apply them to real images
Train on your own dataset with ImageNet pre-trained models.
Hands on classification model training on ImageNet
Object Detection¶
Detect objects in real-world images with pre-trained SSD models
Hands on SSD model training on Pascal VOC Dataset
Training tips to boost your SSD Model performance.
Detect objects in real-world images with pre-trained Faster R-CNN models
End-to-end Faster R-CNN Training on Pascal VOC
Detect objects in real-world images with pre-trained YOLO models
Hands on YOLOv3 model training on Pascal VOC Dataset
Finetune a pre-trained model on your own dataset.
Run an object detection model from your webcam.
Instance Segmentation¶
Perform instance segmentation on real-world images with pre-trained Mask R-CNN models
Hands on Mask R-CNN model training on MS COCO dataset
Semantic Segmentation¶
Perform semantic segmentation on real-world images with pre-trained FCN models
Hands on FCN model training on Pascal VOC dataset
Perform semantic segmentation on real-world images with pre-trained PSPNet models
Hands on PSPNet model training on ADE20K dataset
Perform semantic segmentation on real-world images with pre-trained DeepLabV3 models
Hands on DeepLabV3 model training on Pascal VOC dataset, achieving state-of-the-art accuracy.
Perform semantic segmentation on real-world images with pre-trained ICNet models
Pose Estimation¶
Estimate human pose in real-world images with pre-trained Simple Pose models
Estimate human pose in real-world images with pre-trained AlphaPose models
Estimate human pose from your webcam video stream
Train a pose estimation model on the COCO dataset
Action Recognition¶
Recognize human actions in real-world videos with pre-trained TSN models
Hands on TSN action recognition model training on UCF101 dataset
Recognize human actions in real-world videos with pre-trained I3D models
Hands on I3D action recognition model training on Kinetics400 dataset
Recognize human actions in real-world videos with pre-trained SlowFast models
Hands on SlowFast action recognition model training on Kinetics400 dataset
Hands on SOTA video models fine-tuning on your own dataset
Extract video features from pre-trained models on your own videos
Run inference on your own videos using pre-trained models and save the predictions.
An efficient and flexible video reader for training deep video neural networks.
Object Tracking¶
Perform single-object tracking in real-world video with pre-trained object tracking models.
SiamRPN training on VID, DET, COCO, and Youtube_bb, with testing on OTB2015
Perform multi-object tracking in real-world video with pre-trained SMOT models.
Depth Prediction¶
Predict depth from a single image using Monodepth2.
Predict depth from an image sequence or a video using Monodepth2.
Monodepth2 training on KITTI dataset.
Monodepth2 PoseNet testing on KITTI dataset.
Dataset Preparation¶
Auto Module¶
Distributed Training¶
Hands on distributed training of SlowFast models on Kinetics400 dataset.