SOTAVerified

Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
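To make the idea of an action tubelet concrete, here is a minimal sketch that links per-frame bounding boxes into tubelets by greedy IoU matching. It is an illustration only, not the method of any paper listed on this page: the names `Tubelet`, `box_iou`, and `link_boxes`, the `(x1, y1, x2, y2)` box layout, and the `min_iou=0.5` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass

# Assumed box layout for this sketch: (x1, y1, x2, y2) in pixels.
Box = tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    """Hypothetical container: one action label plus one box per frame."""
    label: str
    start_frame: int
    boxes: list[Box]

    @property
    def end_frame(self) -> int:
        return self.start_frame + len(self.boxes) - 1


def link_boxes(frames: list[list[Box]], label: str, min_iou: float = 0.5) -> list[Tubelet]:
    """Greedily extend each open tubelet with the best-overlapping box in the next frame."""
    finished: list[Tubelet] = []
    open_tubes: list[Tubelet] = []
    for t, boxes in enumerate(frames):
        unused = list(boxes)
        still_open: list[Tubelet] = []
        for tube in open_tubes:
            last = tube.boxes[-1]
            best = max(unused, key=lambda b: box_iou(last, b), default=None)
            if best is not None and box_iou(last, best) >= min_iou:
                tube.boxes.append(best)
                unused.remove(best)
                still_open.append(tube)
            else:
                # No sufficiently overlapping box: the tubelet ends here.
                finished.append(tube)
        # Every unmatched box starts a new tubelet at frame t.
        still_open.extend(Tubelet(label, t, [b]) for b in unused)
        open_tubes = still_open
    return finished + open_tubes


# Usage: two overlapping boxes, an empty frame, then a new box elsewhere.
frames = [[(0, 0, 10, 10)], [(1, 1, 11, 11)], [], [(50, 50, 60, 60)]]
tubes = link_boxes(frames, label="jump")
```

Each resulting tubelet answers the "where" (its boxes), the "when" (`start_frame` to `end_frame`), and the "what" (`label`). Temporal localization would keep only the frame span, and action recognition only the label.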

Papers

Showing 651-700 of 817 papers

| Title | Status | Hype |
| --- | --- | --- |
| Identifying Visible Actions in Lifestyle Vlogs | Code | 0 |
| rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method | Code | 0 |
| Two-Stream Region Convolutional 3D Network for Temporal Activity Detection | | 0 |
| Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model | | 0 |
| TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection | | 0 |
| Improving Action Localization by Progressive Cross-stream Cooperation | | 0 |
| Representation Learning on Visual-Symbolic Graphs for Video Understanding | | 0 |
| Follow the Attention: Combining Partial Pose and Object Motion for Fine-Grained Action Detection | | 0 |
| Spatio-Temporal Action Localization in a Weakly Supervised Setting | | 0 |
| A Study on Action Detection in the Wild | | 0 |
| Simple yet efficient real-time pose-based action recognition | Code | 0 |
| STEP: Spatio-Temporal Progressive Learning for Video Action Detection | Code | 0 |
| Weakly Supervised Gaussian Networks for Action Detection | | 0 |
| Decoupling Localization and Classification in Single Shot Temporal Action Detection | Code | 0 |
| Dance with Flow: Two-in-One Stream Action Detection | Code | 0 |
| Emotion Action Detection and Emotion Inference: the Task and Dataset | Code | 0 |
| COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis | | 0 |
| Towards Segmenting Anything That Moves | Code | 0 |
| An Ensemble SVM-based Approach for Voice Activity Detection | | 0 |
| Spatio-temporal Action Recognition: A Survey | | 0 |
| Actor Conditioned Attention Maps for Video Action Detection | Code | 0 |
| Similarity R-C3D for Few-shot Temporal Activity Detection | | 0 |
| A Structured Model For Action Detection | | 0 |
| Tri-axial Self-Attention for Concurrent Activity Recognition | | 0 |
| Computational Graph Approach for Detection of Composite Human Activities | | 0 |
| Structure-Aware Convolutional Neural Networks | Code | 0 |
| Discovering Spatio-Temporal Action Tubes | | 0 |
| Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection | Code | 0 |
| A Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos | | 0 |
| Segregated Temporal Assembly Recurrent Networks for Weakly Supervised Multiple Action Detection | | 0 |
| Temporal Recurrent Networks for Online Action Detection | Code | 0 |
| Recurrent Convolutions for Causal 3D CNNs | | 0 |
| BLP -- Boundary Likelihood Pinpointing Networks for Accurate Temporal Action Localization | | 0 |
| Temporal Action Detection by Joint Identification-Verification | | 0 |
| Sequence Block based Compressed Sensing Multiuser Detection for 5G | | 0 |
| End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models | | 0 |
| Recurrent Tubelet Proposal and Recognition Networks for Action Detection | | 0 |
| AAD: Adaptive Anomaly Detection through traffic surveillance videos | | 0 |
| Predicting Action Tubes | | 0 |
| Dynamic Temporal Pyramid Network: A Closer Look at Multi-Scale Modeling for Activity Detection | | 0 |
| DFTerNet: Towards 2-bit Dynamic Fusion Networks for Accurate Human Activity Recognition | | 0 |
| Action Detection from a Robot-Car Perspective | | 0 |
| Actor-Centric Relation Network | | 0 |
| S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks | Code | 0 |
| Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection | | 0 |
| Step-by-step Erasion, One-by-one Collection: A Weakly Supervised Temporal Action Detector | | 0 |
| A deep learning approach for understanding natural language commands for mobile service robots | | 0 |
| Neural Dialogue Context Online End-of-Turn Detection | | 0 |
| A flexible model for training action localization with varying levels of supervision | Code | 0 |
| Modality Distillation with Multiple Stream Networks for Action Recognition | Code | 0 |

Benchmark Results

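| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STAR/L | Frame-mAP 0.5 | 90.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | | Unverified |
| 4 | HIT | Frame-mAP 0.5 | 84.8 | | Unverified |
| 5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | | Unverified |
| 6 | YOWO | Frame-mAP 0.5 | 80.4 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | | Unverified |
| 8 | MOC | Frame-mAP 0.5 | 77.8 | | Unverified |
| 9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | | Unverified |
| 10 | Two-in-one | Video-mAP 0.2 | 75.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | | Unverified |
| 3 | HIT | Frame-mAP 0.5 | 83.8 | | Unverified |
| 4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | | Unverified |
| 5 | DTS | Video-mAP 0.2 | 76.1 | | Unverified |
| 6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | | Unverified |
| 8 | YOWO | Frame-mAP 0.5 | 74.4 | | Unverified |
| 9 | MOC | Frame-mAP 0.5 | 74 | | Unverified |
| 10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TTM | mAP | 28.79 | | Unverified |
| 2 | CTRN | mAP | 27.8 | | Unverified |
| 3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | | Unverified |
| 4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | | Unverified |
| 5 | PDAN (RGB+Flow) | mAP | 26.5 | | Unverified |
| 6 | PAT | mAP | 26.5 | | Unverified |
| 7 | MS-TCT (RGB only) | mAP | 25.4 | | Unverified |
| 8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | | Unverified |
| 9 | Coarse-Fine Networks | mAP | 25.1 | | Unverified |
| 10 | MLAD (RGB + Flow) | mAP | 23.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MLAD | mAP | 51.5 | | Unverified |
| 2 | CTRN | mAP | 51.2 | | Unverified |
| 3 | PDAN | mAP | 47.6 | | Unverified |
| 4 | TGM | mAP | 46.4 | | Unverified |
| 5 | MS-TCT (RGB only) | mAP | 43.1 | | Unverified |
| 6 | I3D + our super-event | mAP | 36.4 | | Unverified |
| 7 | Two-stream + LSTM | mAP | 28.1 | | Unverified |
| 8 | Two-stream | mAP | 27.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | | Unverified |
| 2 | DTS | Video-mAP 0.2 | 94.3 | | Unverified |
| 3 | Two-in-one | Video-mAP 0.5 | 92.74 | | Unverified |
| 4 | T-CNN | Frame-mAP 0.5 | 86.7 | | Unverified |
| 5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | | Unverified |
| 6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | | Unverified |
| 7 | Action Tubes | Frame-mAP 0.5 | 68.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MAT (Ours) Trans | mAP | 71.6 | | Unverified |
| 2 | TadML-two stream | mAP | 59.7 | | Unverified |
| 3 | MAT (ours) | mAP | 58.2 | | Unverified |
| 4 | TadML-rgb | mAP | 53.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | HIT | Frame-mAP 0.5 | 33.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 28.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MS-TCT | Frame-mAP | 33.7 | | Unverified |
| 2 | PDAN | Frame-mAP | 32.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STCNN | IoU | 0.14 | | Unverified |
| 2 | Two Stream Network | IoU | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STCNN-V2 (Vote decision) | IoU | 0.52 | | Unverified |
| 2 | RGB and PRGB | IoU | 0.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PAT | mAP | 44.6 | | Unverified |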