SOTAVerified

Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify the action and typically assumes a trimmed video.
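To make the idea of a tubelet concrete, here is a minimal sketch of how per-frame boxes might be linked across time. It assumes a simple greedy rule (extend a tubelet when the best-overlapping box in the next frame reaches an IoU threshold); the names `Tubelet`, `box_iou`, and `link_detections`, and the 0.5 default threshold, are illustrative choices and not taken from any listed paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical convention


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    """One action instance: a class label plus one box per frame."""
    label: str
    boxes: Dict[int, Box] = field(default_factory=dict)  # frame index -> box

    @property
    def start_frame(self) -> int:
        return min(self.boxes)

    @property
    def end_frame(self) -> int:
        return max(self.boxes)


def link_detections(frames: List[List[Box]], label: str,
                    iou_threshold: float = 0.5) -> List[Tubelet]:
    """Greedily link per-frame boxes into tubelets (illustrative rule only)."""
    active: List[Tubelet] = []
    finished: List[Tubelet] = []
    for t, boxes in enumerate(frames):
        unused = list(boxes)
        still_active: List[Tubelet] = []
        for tube in active:
            last = tube.boxes[tube.end_frame]
            best = max(unused, key=lambda b: box_iou(last, b), default=None)
            if best is not None and box_iou(last, best) >= iou_threshold:
                tube.boxes[t] = best          # extend the tubelet in time
                unused.remove(best)
                still_active.append(tube)
            else:
                finished.append(tube)         # no match: the tubelet ends here
        # Every unmatched box starts a new tubelet.
        still_active.extend(Tubelet(label, {t: b}) for b in unused)
        active = still_active
    return finished + active


# Toy input: two overlapping boxes, an empty frame, then an unrelated box.
frames = [[(0, 0, 10, 10)], [(1, 1, 11, 11)], [], [(50, 50, 60, 60)]]
tubes = link_detections(frames, label="wave")
```

In this sketch the spatial answer ("where") is the set of boxes in a tubelet, the temporal answer ("when") is its `start_frame` and `end_frame`, and the classification answer ("what") is its `label`.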

Papers

Showing 151–200 of 817 papers

| Title | Status | Hype |
|---|---|---|
| ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos | | 0 |
| Discovering Spatio-Temporal Action Tubes | | 0 |
| A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments | | 0 |
| Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation | | 0 |
| A Grammatical Compositional Model for Video Action Detection | | 0 |
| Attention Is Not Always the Answer: Optimizing Voice Activity Detection with Simple Feature Fusion | | 0 |
| Aggressive actions and anger detection from multiple modalities using Kinect | | 0 |
| A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities | | 0 |
| Distributed Activity Detection for Cell-Free Hybrid Near-Far Field Communications | | 0 |
| Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures | | 0 |
| ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding | | 0 |
| AFO-TAD: Anchor-free One-Stage Detector for Temporal Action Detection | | 0 |
| A Time-Frequency based Suspicious Activity Detection for Anti-Money Laundering | | 0 |
| A Temporal Simulator for Developing Turn-Taking Methods for Spoken Dialogue Systems | | 0 |
| Asynchronous Random Access in Massive MIMO Systems Facilitated by the Delay-Angle Domain | | 0 |
| A Flexible Framework for Grant-Free Random Access in Cell-Free Massive MIMO Systems | | 0 |
| A Survey on Recent Advances of Computer Vision Algorithms for Egocentric Video | | 0 |
| A Survey on Deep Learning-based Spatio-temporal Action Detection | | 0 |
| A Circular Window-based Cascade Transformer for Online Action Detection | | 0 |
| Distributed Optimization for Massive Connectivity | | 0 |
| A Study on Action Detection in the Wild | | 0 |
| A Structured Model For Action Detection | | 0 |
| Adversarial Seeded Sequence Growing for Weakly-Supervised Temporal Action Localization | | 0 |
| whu-nercms at trecvid2021:instance search task | | 0 |
| Actionness Estimation Using Hybrid Fully Convolutional Networks | | 0 |
| Device Activity Detection and Channel Estimation for Millimeter-Wave Massive MIMO | | 0 |
| A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions | | 0 |
| Actionness-Assisted Recognition of Actions | | 0 |
| A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection | | 0 |
| Advanced Image Segmentation Techniques for Neural Activity Detection via C-fos Immediate Early Gene Expression | | 0 |
| AAD: Adaptive Anomaly Detection through traffic surveillance videos | | 0 |
| A Proposed Artificial intelligence Model for Real-Time Human Action Localization and Tracking | | 0 |
| Device Detection and Channel Estimation in MTC with Correlated Activity Pattern | | 0 |
| Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence | | 0 |
| Argus++: Robust Real-time Activity Detection for Unconstrained Video Streams with Overlapping Cube Proposals | | 0 |
| ADM-Loc: Actionness Distribution Modeling for Point-supervised Temporal Action Localization | | 0 |
| A deep learning approach for understanding natural language commands for mobile service robots | | 0 |
| A Real-Time Voice Activity Detection Based On Lightweight Neural | | 0 |
| Action Detection via an Image Diffusion Process | | 0 |
| Deformable Tube Network for Action Detection in Videos | | 0 |
| A Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos | | 0 |
| ADA-VAD: Unpaired Adversarial Domain Adaptation for Noise-Robust Voice Activity Detection | | 0 |
| Continual Low-Rank Scaled Dot-product Attention | | 0 |
| Context Understanding in Computer Vision: A Survey | | 0 |
| A processing framework to access large quantities of whispered speech found in ASMR | | 0 |
| Contextual Multi-Scale Region Convolutional 3D Network for Activity Detection | | 0 |
| Continuous Human Action Detection Based on Wearable Inertial Data | | 0 |
| Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection | | 0 |
| Cooperative Multi-Cell Massive Access with Temporally Correlated Activity | | 0 |
| Application of Machine Learning Techniques in Human Activity Recognition | | 0 |
Page 4 of 17

Benchmark Results

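| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STAR/L | Frame-mAP 0.5 | 90.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | | Unverified |
| 4 | HIT | Frame-mAP 0.5 | 84.8 | | Unverified |
| 5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | | Unverified |
| 6 | YOWO | Frame-mAP 0.5 | 80.4 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | | Unverified |
| 8 | MOC | Frame-mAP 0.5 | 77.8 | | Unverified |
| 9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | | Unverified |
| 10 | Two-in-one | Video-mAP 0.2 | 75.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | | Unverified |
| 3 | HIT | Frame-mAP 0.5 | 83.8 | | Unverified |
| 4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | | Unverified |
| 5 | DTS | Video-mAP 0.2 | 76.1 | | Unverified |
| 6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | | Unverified |
| 8 | YOWO | Frame-mAP 0.5 | 74.4 | | Unverified |
| 9 | MOC | Frame-mAP 0.5 | 74 | | Unverified |
| 10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TTM | mAP | 28.79 | | Unverified |
| 2 | CTRN | mAP | 27.8 | | Unverified |
| 3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | | Unverified |
| 4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | | Unverified |
| 5 | PDAN (RGB+Flow) | mAP | 26.5 | | Unverified |
| 6 | PAT | mAP | 26.5 | | Unverified |
| 7 | MS-TCT (RGB only) | mAP | 25.4 | | Unverified |
| 8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | | Unverified |
| 9 | Coarse-Fine Networks | mAP | 25.1 | | Unverified |
| 10 | I3D + biGRU + VS-ST-MPNN | mAP | 23.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MLAD | mAP | 51.5 | | Unverified |
| 2 | CTRN | mAP | 51.2 | | Unverified |
| 3 | PDAN | mAP | 47.6 | | Unverified |
| 4 | TGM | mAP | 46.4 | | Unverified |
| 5 | MS-TCT (RGB only) | mAP | 43.1 | | Unverified |
| 6 | I3D + our super-event | mAP | 36.4 | | Unverified |
| 7 | Two-stream + LSTM | mAP | 28.1 | | Unverified |
| 8 | Two-stream | mAP | 27.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | | Unverified |
| 2 | DTS | Video-mAP 0.2 | 94.3 | | Unverified |
| 3 | Two-in-one | Video-mAP 0.5 | 92.74 | | Unverified |
| 4 | T-CNN | Frame-mAP 0.5 | 86.7 | | Unverified |
| 5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | | Unverified |
| 6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | | Unverified |
| 7 | Action Tubes | Frame-mAP 0.5 | 68.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MAT (Ours) Trans | mAP | 71.6 | | Unverified |
| 2 | TadML-two stream | mAP | 59.7 | | Unverified |
| 3 | MAT (ours) | mAP | 58.2 | | Unverified |
| 4 | TadML-rgb | mAP | 53.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HIT | Frame-mAP 0.5 | 33.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 28.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MS-TCT | Frame-mAP | 33.7 | | Unverified |
| 2 | PDAN | Frame-mAP | 32.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STCNN | IoU | 0.14 | | Unverified |
| 2 | Two Stream Network | IoU | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STCNN-V2 (Vote decision) | IoU | 0.52 | | Unverified |
| 2 | RGB and PRGB | IoU | 0.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PAT | mAP | 44.6 | | Unverified |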