SOTAVerified

Action Detection

Action Detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
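To make the idea of an action tubelet concrete, here is a minimal sketch that links per-frame bounding boxes across time by greedy overlap. It is an illustration only, not the method of any paper listed on this page: the helper names `box_iou` and `link_tubelet`, the `(x1, y1, x2, y2)` box format, and the 0.5 overlap threshold are all assumptions chosen for the example.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubelet(
    per_frame_boxes: List[List[Box]],
    start_box: Box,
    min_iou: float = 0.5,  # hypothetical threshold, not taken from the source
) -> List[Tuple[int, Box]]:
    """Greedily extend a tubelet from `start_box` in frame 0.

    In each later frame, pick the detection that overlaps most with the
    previous box; stop when no detection overlaps by at least `min_iou`.
    Returns a list of (frame_index, box) pairs.
    """
    tubelet = [(0, start_box)]
    current = start_box
    for t in range(1, len(per_frame_boxes)):
        candidates = per_frame_boxes[t]
        if not candidates:
            break
        best = max(candidates, key=lambda b: box_iou(current, b))
        if box_iou(current, best) < min_iou:
            break
        tubelet.append((t, best))
        current = best
    return tubelet


# Toy detections: the box drifts right for three frames, then jumps away.
frames = [
    [(0, 0, 10, 10)],
    [(1, 0, 11, 10)],
    [(2, 0, 12, 10)],
    [(50, 50, 60, 60)],
]
tube = link_tubelet(frames, frames[0][0])
start_frame, end_frame = tube[0][0], tube[-1][0]
```

In this toy case the tubelet answers "where" through its boxes and "when" through its first and last frame indices; a classifier applied to the tubelet would supply the "what". Real systems typically also use detection scores and class labels when linking.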

Papers

Showing 501-550 of 817 papers

| Title | Status | Hype |
|---|---|---|
| Temporally smooth online action detection using cycle-consistent future anticipation | Code | 0 |
| Improvement of Noise-Robust Single-Channel Voice Activity Detection with Spatial Pre-processing | | 0 |
| The Use of Video Captioning for Fostering Physical Activity | | 0 |
| The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges and methods | | 0 |
| Zeus: Efficiently Localizing Actions in Videos using Reinforcement Learning | | 0 |
| Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories | | 0 |
| Sparse Activity Discovery in Energy Constrained Multi-Cluster IoT Networks Using Group Testing | | 0 |
| Early Detection of In-Memory Malicious Activity based on Run-time Environmental Features | | 0 |
| Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation | | 0 |
| Unified Graph Structured Models for Video Understanding | | 0 |
| USTC-NELSLIP System Description for DIHARD-III Challenge | | 0 |
| Iterative Reweighted Algorithms for Joint User Identification and Channel Estimation in Spatially Correlated Massive MTC | | 0 |
| Time and Frequency Network for Human Action Detection in Videos | | 0 |
| An Ultra-low Power RNN Classifier for Always-On Voice Wake-Up Detection Robust to Real-World Scenarios | | 0 |
| Incorporating VAD into ASR System by Multi-task Learning | | 0 |
| Coarse-Fine Networks for Temporal Activity Detection in Videos | Code | 0 |
| ACDnet: An action detection network for real-time edge computing based on flow-guided feature approximation and memory aggregation | Code | 0 |
| End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend | | 0 |
| Supporting More Active Users for Massive Access via Data-assisted Activity Detection | | 0 |
| On training targets for noise-robust voice activity detection | | 0 |
| An Efficient Active Set Algorithm for Covariance Based Joint Data and Activity Detection for Massive Random Access with Massive MIMO | | 0 |
| Anomalous Event Recognition in Videos Based on Joint Learning of Motion and Appearance with Multiple Ranking Measures | | 0 |
| Quantum Learning Based Nonrandom Superimposed Coding for Secure Wireless Access in 5G URLLC | | 0 |
| Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting | Code | 0 |
| Bridging the gap between Human Action Recognition and Online Action Detection | | 0 |
| Hierarchical Graph-RNNs for Action Detection of Multiple Activities | | 0 |
| Activity Recognition with Moving Cameras and Few Training Examples: Applications for Detection of Autism-Related Headbanging | | 0 |
| Joint User Activity and Data Detection in Grant-Free NOMA using Generative Neural Networks | | 0 |
| Smart Black Box 2.0: Efficient High-bandwidth Driving Data Collection based on Video Anomalies | | 0 |
| Watch Only Once: An End-to-End Video Action Detection Framework | | 0 |
| Towards Improving Spatiotemporal Action Recognition in Videos | Code | 0 |
| Spatial-Temporal Alignment Network for Action Recognition and Detection | | 0 |
| MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection | | 0 |
| VOXLINGUA107: A DATASET FOR SPOKEN LANGUAGE RECOGNITION | | 0 |
| Nudge: Accelerating Overdue Pull Requests Towards Completion | | 0 |
| Temporal Action Detection with Multi-level Supervision | | 0 |
| We don't Need Thousand Proposals Single Shot Actor-Action Detection in Videos | Code | 0 |
| Privileged Knowledge Distillation for Online Action Detection | | 0 |
| A Time-Frequency based Suspicious Activity Detection for Anti-Money Laundering | | 0 |
| LAP-Net: Adaptive Features Sampling via Learning Action Progression for Online Action Detection | | 0 |
| SALAD: Self-Assessment Learning for Action Detection | | 0 |
| Toyota Smarthome Untrimmed: Real-World Untrimmed Videos for Activity Detection | Code | 0 |
| Activity Detection And Modeling Using Smart Meter Data: Concept And Case Studies | | 0 |
| MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection | | 0 |
| Multi-Channel Speaker Verification for Single and Multi-talker Speech | | 0 |
| Speech enhancement aided end-to-end multi-task learning for voice activity detection | | 0 |
| Combination of Deep Speaker Embeddings for Diarisation | | 0 |
| The HUAWEI Speaker Diarisation System for the VoxCeleb Speaker Diarisation Challenge | | 0 |
| An Efficient Algorithm for Device Detection and Channel Estimation in Asynchronous IoT Systems | | 0 |
| Robust Two-Stream Multi-Feature Network for Driver Drowsiness Detection | | 0 |

Benchmark Results

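| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STAR/L | Frame-mAP 0.5 | 90.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | | Unverified |
| 4 | HIT | Frame-mAP 0.5 | 84.8 | | Unverified |
| 5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | | Unverified |
| 6 | YOWO | Frame-mAP 0.5 | 80.4 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | | Unverified |
| 8 | MOC | Frame-mAP 0.5 | 77.8 | | Unverified |
| 9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | | Unverified |
| 10 | Two-in-one | Video-mAP 0.2 | 75.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | | Unverified |
| 3 | HIT | Frame-mAP 0.5 | 83.8 | | Unverified |
| 4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | | Unverified |
| 5 | DTS | Video-mAP 0.2 | 76.1 | | Unverified |
| 6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | | Unverified |
| 8 | YOWO | Frame-mAP 0.5 | 74.4 | | Unverified |
| 9 | MOC | Frame-mAP 0.5 | 74 | | Unverified |
| 10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TTM | mAP | 28.79 | | Unverified |
| 2 | CTRN | mAP | 27.8 | | Unverified |
| 3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | | Unverified |
| 4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | | Unverified |
| 5 | PDAN (RGB+Flow) | mAP | 26.5 | | Unverified |
| 6 | PAT | mAP | 26.5 | | Unverified |
| 7 | MS-TCT (RGB only) | mAP | 25.4 | | Unverified |
| 8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | | Unverified |
| 9 | Coarse-Fine Networks | mAP | 25.1 | | Unverified |
| 10 | MLAD (RGB + Flow) | mAP | 23.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MLAD | mAP | 51.5 | | Unverified |
| 2 | CTRN | mAP | 51.2 | | Unverified |
| 3 | PDAN | mAP | 47.6 | | Unverified |
| 4 | TGM | mAP | 46.4 | | Unverified |
| 5 | MS-TCT (RGB only) | mAP | 43.1 | | Unverified |
| 6 | I3D + our super-event | mAP | 36.4 | | Unverified |
| 7 | Two-stream + LSTM | mAP | 28.1 | | Unverified |
| 8 | Two-stream | mAP | 27.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | | Unverified |
| 2 | DTS | Video-mAP 0.2 | 94.3 | | Unverified |
| 3 | Two-in-one | Video-mAP 0.5 | 92.74 | | Unverified |
| 4 | T-CNN | Frame-mAP 0.5 | 86.7 | | Unverified |
| 5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | | Unverified |
| 6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | | Unverified |
| 7 | Action Tubes | Frame-mAP 0.5 | 68.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MAT (Ours) Trans | mAP | 71.6 | | Unverified |
| 2 | TadML-two stream | mAP | 59.7 | | Unverified |
| 3 | MAT (ours) | mAP | 58.2 | | Unverified |
| 4 | TadML-rgb | mAP | 53.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HIT | Frame-mAP 0.5 | 33.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 28.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MS-TCT | Frame-mAP | 33.7 | | Unverified |
| 2 | PDAN | Frame-mAP | 32.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STCNN | IoU | 0.14 | | Unverified |
| 2 | Two Stream Network | IoU | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STCNN-V2 (Vote decision) | IoU | 0.52 | | Unverified |
| 2 | RGB and PRGB | IoU | 0.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PAT | mAP | 44.6 | | Unverified |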