
Action Recognition

Table of pre-trained models for video action recognition and their performance.

Hint

The training commands work with this script: train_recognizer.py

A model can have several sets of trained parameters, each identified by a hashtag. Parameters with a grey name can be downloaded by passing the corresponding hashtag.

  • Download default pretrained weights: net = get_model('inceptionv3_ucf101', pretrained=True)

  • Download weights given a hashtag: net = get_model('inceptionv3_ucf101', pretrained='0c453da8')
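The two calls above differ only in the value passed as `pretrained`. A minimal sketch of that choice follows. It assumes `get_model` is imported from `gluoncv.model_zoo`, since the import is not shown on this page. `pretrained_argument` and `load_recognizer` are hypothetical helpers, not part of the library. The hashtags are copied from the tables on this page so that a mistyped value is caught before any download is attempted.

```python
# Hashtags copied from the tables on this page.
KNOWN_HASHTAGS = {
    "vgg16_ucf101": {"d6dc1bba", "05e319d4"},
    "inceptionv3_ucf101": {"13ef5c3b", "0c453da8"},
    "inceptionv3_kinetics400": {"8a4a6946"},
}


def pretrained_argument(name, hashtag=None):
    """Return the value to pass as `pretrained` (hypothetical helper)."""
    if name not in KNOWN_HASHTAGS:
        raise ValueError(f"unknown model name: {name!r}")
    if hashtag is None:
        return True  # request the default pretrained weights
    if hashtag not in KNOWN_HASHTAGS[name]:
        raise ValueError(f"hashtag {hashtag!r} is not listed for {name!r}")
    return hashtag


def load_recognizer(name, hashtag=None):
    """Load a pre-trained model; needs gluoncv and its backend installed."""
    # Assumption: get_model is exposed by gluoncv.model_zoo.
    from gluoncv.model_zoo import get_model

    return get_model(name, pretrained=pretrained_argument(name, hashtag))
```

With the library installed, `load_recognizer('inceptionv3_ucf101', '0c453da8')` would make the same call as the second bullet above.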

The test script test_recognizer.py can be used to evaluate the models.

UCF101 Dataset

The following table lists pre-trained models trained on UCF101.

Note

Our pre-trained models reproduce results from “Temporal Segment Networks” [2]. Please check the reference paper for further information.

The top-1 accuracy numbers shown below are for official split 1 of the UCF101 dataset, not the average over the 3 splits.

InceptionV3 is trained and evaluated with an input size of 299x299.

| Name | Top-1 (%) | Hashtag | Train Command | Train Log |
|---|---|---|---|---|
| vgg16_ucf101 [2] | 83.4 | d6dc1bba | shell script | log |
| vgg16_ucf101 [1] | 81.5 | 05e319d4 | shell script | log |
| inceptionv3_ucf101 [2] | 88.1 | 13ef5c3b | shell script | log |
| inceptionv3_ucf101 [1] | 85.6 | 0c453da8 | shell script | log |
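For reference, top-1 accuracy counts a clip as correct when the class with the highest score equals the ground-truth label. The sketch below shows that computation in plain Python. It is not the code used by test_recognizer.py. The function name and the toy scores are illustrative assumptions, not real results.

```python
def top1_accuracy(scores, labels):
    """Percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy data: 4 clips, 3 classes (illustrative values only).
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
labels = [1, 0, 1, 0]
print(top1_accuracy(scores, labels))  # 75.0
```

Three of the four toy clips are predicted correctly, which gives 75.0 on the same percentage scale as the Top-1 column.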

Kinetics400 Dataset

The following table lists pre-trained models trained on Kinetics400.

Note

Our pre-trained models reproduce results from “Temporal Segment Networks” [2]. Please check the reference paper for further information.

InceptionV3 is trained and evaluated with an input size of 299x299.

| Name | Top-1 (%) | Hashtag | Train Command | Train Log |
|---|---|---|---|---|
| inceptionv3_kinetics400 [2] | 72.5 | 8a4a6946 | shell script | log |

[1] Limin Wang, Yuanjun Xiong, Zhe Wang, and Yu Qiao. “Towards Good Practices for Very Deep Two-Stream ConvNets.” arXiv preprint arXiv:1507.02159 (2015).

[2] Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, and Luc Van Gool. “Temporal Segment Networks: Towards Good Practices for Deep Action Recognition.” In European Conference on Computer Vision (ECCV), 2016.