SOTAVerified

Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
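As a rough illustration of what "bounding boxes linked across time" can mean, the sketch below links per-frame boxes into a single tubelet by greedy IoU matching and then reads off the tubelet's start and end frames. Greedy linking is a simplification assumed here for illustration, not the method of any paper listed on this page; the names `box_iou`, `link_tubelet`, `temporal_extent` and the `min_iou` threshold are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubelet(frames: List[List[Box]], min_iou: float = 0.3) -> List[Tuple[int, Box]]:
    """Greedily link one box per frame into a single tubelet.

    Starts from the first box of the first non-empty frame. In each later
    frame it keeps the box with the highest IoU to the previously linked box,
    and stops at the first frame where no box reaches min_iou (assumed value).
    """
    tube: List[Tuple[int, Box]] = []
    for t, boxes in enumerate(frames):
        if not boxes:
            if tube:
                break  # the action has ended
            continue  # the action has not started yet
        if not tube:
            tube.append((t, boxes[0]))
            continue
        prev = tube[-1][1]
        best = max(boxes, key=lambda b: box_iou(prev, b))
        if box_iou(prev, best) < min_iou:
            break
        tube.append((t, best))
    return tube


def temporal_extent(tube: List[Tuple[int, Box]]) -> Tuple[int, int]:
    """Start and end frame indices of a non-empty tubelet."""
    return tube[0][0], tube[-1][0]


# Toy detections: one list of boxes per frame.
frames = [
    [],
    [(10, 10, 50, 50)],
    [(12, 11, 52, 51), (200, 200, 240, 240)],
    [(14, 12, 54, 52)],
    [(300, 300, 340, 340)],
]
tube = link_tubelet(frames)
start, end = temporal_extent(tube)
```

In this toy input the tubelet covers frames 1 to 3: the "where" is the linked boxes, the "when" is the frame range, and a classifier applied to the tubelet would supply the "what". The overlap thresholds in leaderboard metric names such as Frame-mAP 0.5 are IoU thresholds of the kind `box_iou` computes.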

Papers

Showing 701-750 of 817 papers

| Title | Status | Hype |
| --- | --- | --- |
| SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos | Code | 1 |
| Fine-grained Activity Recognition in Baseball Videos | Code | 0 |
| Jointly Detecting and Separating Singing Voice: A Multi-Task Approach | - | 0 |
| Learning to Anonymize Faces for Privacy Preserving Action Detection | Code | 0 |
| C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing | - | 0 |
| Temporal Gaussian Mixture Layer for Videos | Code | 0 |
| Frequency domain TRINICON-based blind source separation method with multi-source activity detection for sparsely mixed signals | - | 0 |
| Real-Time End-to-End Action Detection with Two-Stream Networks | - | 0 |
| Spatial Morphing Kernel Regression For Feature Interpolation | - | 0 |
| Online Detection of Action Start in Untrimmed, Streaming Videos | - | 0 |
| Structured Label Inference for Visual Understanding | Code | 0 |
| A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection | Code | 0 |
| Contextual Multi-Scale Region Convolutional 3D Network for Activity Detection | - | 0 |
| Recursive Binary Neural Network Learning Model with 2-bit/weight Storage Requirement | - | 0 |
| Overcomplete Frame Thresholding for Acoustic Scene Analysis | - | 0 |
| Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification | Code | 0 |
| Learning Latent Super-Events to Detect Multiple Activities in Videos | Code | 0 |
| Graph Distillation for Action Detection with Privileged Modalities | Code | 0 |
| An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos | - | 0 |
| Budget-Aware Activity Detection with A Recurrent Policy Network | - | 0 |
| Single Shot Temporal Action Detection | Code | 0 |
| Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN | Code | 0 |
| Joint Learning of Object and Action Detectors | - | 0 |
| TORNADO: A Spatio-Temporal Convolutional Regression Network for Video Action Proposal | - | 0 |
| Protest Activity Detection and Perceived Violence Estimation from Social Media Images | Code | 0 |
| A Nonparametric Model for Multimodal Collaborative Activities Summarization | - | 0 |
| A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions | - | 0 |
| Extensible Hierarchical Method of Detecting Interactive Actions for Video Understanding | - | 0 |
| EUDAMU at SemEval-2017 Task 11: Action Ranking and Type Matching for End-User Development | - | 0 |
| Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation | - | 0 |
| SST: Single-Stream Temporal Action Proposals | Code | 0 |
| SCC: Semantic Context Cascade for Efficient Action Detection | - | 0 |
| Budget-Aware Deep Semantic Video Segmentation | - | 0 |
| A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning | Code | 0 |
| Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection | - | 0 |
| Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints | Code | 0 |
| Polish Read Speech Corpus for Speech Tools and Services | - | 0 |
| AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions | Code | 1 |
| Am I Done? Predicting Action Progress in Videos | Code | 0 |
| Cascaded Boundary Regression for Temporal Action Detection | - | 0 |
| Skeleton-based Action Recognition with Convolutional Neural Networks | Code | 1 |
| Temporal Action Detection with Structured Segment Networks | Code | 2 |
| Skeleton Boxes: Solving skeleton based action detection with a single deep convolutional neural network | - | 0 |
| AMTnet: Action-Micro-Tube Regression by End-to-end Trainable Deep Architecture | - | 0 |
| Temporal Action Localization by Structured Maximal Sums | - | 0 |
| Predictive-Corrective Networks for Action Detection | - | 0 |
| Incremental Tube Construction for Human Action Detection | Code | 0 |
| Unsupervised Action Proposal Ranking through Proposal Recombination | - | 0 |
| Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos | Code | 0 |
| PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding | - | 0 |

Benchmark Results

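| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STAR/L | Frame-mAP 0.5 | 90.3 | - | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 88.5 | - | Unverified |
| 3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | - | Unverified |
| 4 | HIT | Frame-mAP 0.5 | 84.8 | - | Unverified |
| 5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | - | Unverified |
| 6 | YOWO | Frame-mAP 0.5 | 80.4 | - | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | - | Unverified |
| 8 | MOC | Frame-mAP 0.5 | 77.8 | - | Unverified |
| 9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | - | Unverified |
| 10 | Two-in-one | Video-mAP 0.2 | 75.48 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SiA | Frame-mAP 0.5 | 88.5 | - | Unverified |
| 2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | - | Unverified |
| 3 | HIT | Frame-mAP 0.5 | 83.8 | - | Unverified |
| 4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | - | Unverified |
| 5 | DTS | Video-mAP 0.2 | 76.1 | - | Unverified |
| 6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | - | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | - | Unverified |
| 8 | YOWO | Frame-mAP 0.5 | 74.4 | - | Unverified |
| 9 | MOC | Frame-mAP 0.5 | 74 | - | Unverified |
| 10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TTM | mAP | 28.79 | - | Unverified |
| 2 | CTRN | mAP | 27.8 | - | Unverified |
| 3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | - | Unverified |
| 4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | - | Unverified |
| 5 | PDAN (RGB+Flow) | mAP | 26.5 | - | Unverified |
| 6 | PAT | mAP | 26.5 | - | Unverified |
| 7 | MS-TCT (RGB only) | mAP | 25.4 | - | Unverified |
| 8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | - | Unverified |
| 9 | Coarse-Fine Networks | mAP | 25.1 | - | Unverified |
| 10 | MLAD (RGB + Flow) | mAP | 23.7 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MLAD | mAP | 51.5 | - | Unverified |
| 2 | CTRN | mAP | 51.2 | - | Unverified |
| 3 | PDAN | mAP | 47.6 | - | Unverified |
| 4 | TGM | mAP | 46.4 | - | Unverified |
| 5 | MS-TCT (RGB only) | mAP | 43.1 | - | Unverified |
| 6 | I3D + our super-event | mAP | 36.4 | - | Unverified |
| 7 | Two-stream + LSTM | mAP | 28.1 | - | Unverified |
| 8 | Two-stream | mAP | 27.6 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | - | Unverified |
| 2 | DTS | Video-mAP 0.2 | 94.3 | - | Unverified |
| 3 | Two-in-one | Video-mAP 0.5 | 92.74 | - | Unverified |
| 4 | T-CNN | Frame-mAP 0.5 | 86.7 | - | Unverified |
| 5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | - | Unverified |
| 6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | - | Unverified |
| 7 | Action Tubes | Frame-mAP 0.5 | 68.1 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MAT (Ours) Trans | mAP | 71.6 | - | Unverified |
| 2 | TadML-two stream | mAP | 59.7 | - | Unverified |
| 3 | MAT (ours) | mAP | 58.2 | - | Unverified |
| 4 | TadML-rgb | mAP | 53.46 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | HIT | Frame-mAP 0.5 | 33.3 | - | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 28.8 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MS-TCT | Frame-mAP | 33.7 | - | Unverified |
| 2 | PDAN | Frame-mAP | 32.7 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STCNN | IoU | 0.14 | - | Unverified |
| 2 | Two Stream Network | IoU | 0.07 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STCNN-V2 (Vote decision) | IoU | 0.52 | - | Unverified |
| 2 | RGB and PRGB | IoU | 0.35 | - | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PAT | mAP | 44.6 | - | Unverified |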