SOTAVerified

Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
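
As a rough illustration of "bounding boxes linked across time", the sketch below greedily links per-frame detections into tubelets by box overlap. The names `Tubelet`, `box_iou`, and `link_detections`, the `(x1, y1, x2, y2)` box layout, and the 0.5 overlap threshold are all assumptions made for this example; they are not taken from any method listed on this page, and published detectors typically use more elaborate linking.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    """One action instance: a class label plus one box per frame index."""
    label: str
    boxes: Dict[int, Box] = field(default_factory=dict)

    @property
    def start(self) -> int:
        return min(self.boxes)

    @property
    def end(self) -> int:
        return max(self.boxes)


def link_detections(
    frames: List[List[Tuple[str, Box]]], iou_threshold: float = 0.5
) -> List[Tubelet]:
    """Greedy linking: attach each detection to the same-label tubelet that
    ended on the previous frame and overlaps it most; otherwise start a new one."""
    tubelets: List[Tubelet] = []
    for t, detections in enumerate(frames):
        for label, box in detections:
            best, best_iou = None, iou_threshold
            for tube in tubelets:
                if tube.label == label and tube.end == t - 1:
                    iou = box_iou(tube.boxes[t - 1], box)
                    if iou >= best_iou:
                        best, best_iou = tube, iou
            if best is None:
                best = Tubelet(label)
                tubelets.append(best)
            best.boxes[t] = box
    return tubelets
```

Under these assumptions, each tubelet's `start` and `end` give the temporal extent (the "when"), its boxes give the spatial extent (the "where"), and its label gives the class (the "what").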

Papers

Showing 601–650 of 817 papers

Title | Status | Hype
Object Aware Egocentric Online Action Detection | - | 0
OFDM-Based Massive Connectivity for LEO Satellite Internet of Things | - | 0
Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection | - | 0
One-Shot Action Localization by Learning Sequence Matching Network | - | 0
One-stage Action Detection Transformer | - | 0
Online Action Detection | - | 0
Online Action Detection in Streaming Videos with Time Buffers | - | 0
Online Anomaly Detection via Class-Imbalance Learning | - | 0
Online Detection of Action Start in Untrimmed, Streaming Videos | - | 0
Online Target Speaker Voice Activity Detection for Speaker Diarization | - | 0
On Multitask Loss Function for Audio Event Detection and Localization | - | 0
On the Detection of Non-Cooperative RISs: Scan B-Testing via Deep Support Vector Data Description | - | 0
On training targets for noise-robust voice activity detection | - | 0
On using the UA-Speech and TORGO databases to validate automatic dysarthric speech classification approaches | - | 0
Open Set Action Recognition via Multi-Label Evidential Learning | - | 0
Open-Vocabulary Spatio-Temporal Action Detection | - | 0
Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features | - | 0
Overcomplete Frame Thresholding for Acoustic Scene Analysis | - | 0
PAMI-AD: An Activity Detector Exploiting Part-attention and Motion Information in Surveillance Videos | - | 0
Parallel Neurosymbolic Integration with Concordia | - | 0
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection | - | 0
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition | - | 0
PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding | - | 0
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities | - | 0
Polish Read Speech Corpus for Speech Tools and Services | - | 0
POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization | - | 0
PEAF: Learnable Power Efficient Analog Acoustic Features for Audio Recognition | - | 0
PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System | - | 0
Precise Analysis of Covariance Identifiability for Activity Detection in Grant-Free Random Access | - | 0
Predicting Action Tubes | - | 0
Prediction-Feedback DETR for Temporal Action Detection | - | 0
Predictive-Corrective Networks for Action Detection | - | 0
Privileged Knowledge Distillation for Online Action Detection | - | 0
Progressive Boundary Refinement Network for Temporal Action Detection | - | 0
Progressively Parsing Interactional Objects for Fine Grained Action Detection | - | 0
Prompt-driven Target Speech Diarization | - | 0
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation | - | 0
Proximal Gradient-Based Unfolding for Massive Random Access in IoT Networks | - | 0
Quantum Learning Based Nonrandom Superimposed Coding for Secure Wireless Access in 5G URLLC | - | 0
Query matching for spatio-temporal action detection with query-based object detector | - | 0
RADNet: A Deep Neural Network Model for Robust Perception in Moving Autonomous Systems | - | 0
Raising the Bar(ometer): Identifying a User's Stair and Lift Usage Through Wearable Sensor Data Analysis | - | 0
Random Access with Massive MIMO-OTFS in LEO Satellite Communications | - | 0
RCL: Recurrent Continuous Localization for Temporal Action Detection | - | 0
R-CNNs for Pose Estimation and Action Detection | - | 0
Real-Time End-to-End Action Detection with Two-Stream Networks | - | 0
Real-time Online Action Detection Forests using Spatio-temporal Contexts | - | 0
Real-Time Radar-Based Gesture Detection and Recognition Built in an Edge-Computing Platform | - | 0
Recurrent Convolutions for Causal 3D CNNs | - | 0
Recurrent Tubelet Proposal and Recognition Networks for Action Detection | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | STAR/L | Frame-mAP 0.5 | 90.3 | - | Unverified
2 | SiA | Frame-mAP 0.5 | 88.5 | - | Unverified
3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | - | Unverified
4 | HIT | Frame-mAP 0.5 | 84.8 | - | Unverified
5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | - | Unverified
6 | YOWO | Frame-mAP 0.5 | 80.4 | - | Unverified
7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | - | Unverified
8 | MOC | Frame-mAP 0.5 | 77.8 | - | Unverified
9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | - | Unverified
10 | Two-in-one | Video-mAP 0.2 | 75.48 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SiA | Frame-mAP 0.5 | 88.5 | - | Unverified
2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | - | Unverified
3 | HIT | Frame-mAP 0.5 | 83.8 | - | Unverified
4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | - | Unverified
5 | DTS | Video-mAP 0.2 | 76.1 | - | Unverified
6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | - | Unverified
7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | - | Unverified
8 | YOWO | Frame-mAP 0.5 | 74.4 | - | Unverified
9 | MOC | Frame-mAP 0.5 | 74 | - | Unverified
10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TTM | mAP | 28.79 | - | Unverified
2 | CTRN | mAP | 27.8 | - | Unverified
3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | - | Unverified
4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | - | Unverified
5 | PDAN (RGB+Flow) | mAP | 26.5 | - | Unverified
6 | PAT | mAP | 26.5 | - | Unverified
7 | MS-TCT (RGB only) | mAP | 25.4 | - | Unverified
8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | - | Unverified
9 | Coarse-Fine Networks | mAP | 25.1 | - | Unverified
10 | MLAD (RGB + Flow) | mAP | 23.7 | - | Unverified
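
# | Model | Metric | Claimed | Verified | Status
1 | MLAD | mAP | 51.5 | - | Unverified
2 | CTRN | mAP | 51.2 | - | Unverified
3 | PDAN | mAP | 47.6 | - | Unverified
4 | TGM | mAP | 46.4 | - | Unverified
5 | MS-TCT (RGB only) | mAP | 43.1 | - | Unverified
6 | I3D + our super-event | mAP | 36.4 | - | Unverified
7 | Two-stream + LSTM | mAP | 28.1 | - | Unverified
8 | Two-stream | mAP | 27.6 | - | Unverified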

# | Model | Metric | Claimed | Verified | Status
1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | - | Unverified
2 | DTS | Video-mAP 0.2 | 94.3 | - | Unverified
3 | Two-in-one | Video-mAP 0.5 | 92.74 | - | Unverified
4 | T-CNN | Frame-mAP 0.5 | 86.7 | - | Unverified
5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | - | Unverified
6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | - | Unverified
7 | Action Tubes | Frame-mAP 0.5 | 68.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MAT (Ours) Trans | mAP | 71.6 | - | Unverified
2 | TadML-two stream | mAP | 59.7 | - | Unverified
3 | MAT (ours) | mAP | 58.2 | - | Unverified
4 | TadML-rgb | mAP | 53.46 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | HIT | Frame-mAP 0.5 | 33.3 | - | Unverified
2 | SiA | Frame-mAP 0.5 | 28.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MS-TCT | Frame-mAP | 33.7 | - | Unverified
2 | PDAN | Frame-mAP | 32.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STCNN | IoU | 0.14 | - | Unverified
2 | Two Stream Network | IoU | 0.07 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STCNN-V2 (Vote decision) | IoU | 0.52 | - | Unverified
2 | RGB and PRGB | IoU | 0.35 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PAT | mAP | 44.6 | - | Unverified