SOTAVerified

Action Detection

Action Detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
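As a rough illustration of what "bounding boxes linked across time" can mean, the sketch below greedily chains per-frame boxes of one action class into tubelets using box overlap (IoU). It is a minimal sketch under stated assumptions, not the method of any paper listed here: the names `Tubelet`, `box_iou`, `link_detections`, and the `min_iou` threshold are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field

# A box is (x1, y1, x2, y2) in pixel coordinates (assumed convention).
Box = tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    """Hypothetical container: one action label plus one box per frame."""
    label: str
    start_frame: int
    boxes: list[Box] = field(default_factory=list)

    @property
    def end_frame(self) -> int:
        return self.start_frame + len(self.boxes) - 1


def link_detections(frames: list[list[Box]], label: str,
                    min_iou: float = 0.5) -> list[Tubelet]:
    """Greedily extend each open tubelet with its best-overlapping box."""
    finished: list[Tubelet] = []
    active: list[Tubelet] = []
    for t, boxes in enumerate(frames):
        unused = list(boxes)
        still_active: list[Tubelet] = []
        for tube in active:
            last = tube.boxes[-1]
            best = max(unused, key=lambda b: box_iou(last, b), default=None)
            if best is not None and box_iou(last, best) >= min_iou:
                tube.boxes.append(best)
                unused.remove(best)
                still_active.append(tube)
            else:
                finished.append(tube)  # no match: the tubelet ends here
        # Any box not claimed by an open tubelet starts a new one.
        still_active.extend(Tubelet(label, t, [b]) for b in unused)
        active = still_active
    return finished + active


if __name__ == "__main__":
    detections = [[(0, 0, 10, 10)], [(1, 1, 11, 11)], [], [(50, 50, 60, 60)]]
    for tube in link_detections(detections, "wave"):
        print(tube.label, tube.start_frame, tube.end_frame)
```

The start and end frames of each tubelet give the "when", the boxes give the "where", and the label gives the "what". A temporal localization output would keep only the frame range, and an action recognition output only the label.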

Papers

Showing 301–350 of 817 papers

| Title | Status | Hype |
|---|---|---|
| An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos | | 0 |
| Cascaded Boundary Regression for Temporal Action Detection | | 0 |
| Group Event Detection with a Varying Number of Group Members for Video Surveillance | | 0 |
| Group Activity Detection from Trajectory and Video Data in Soccer | | 0 |
| CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors | | 0 |
| Joint Estimation of Clustered User Activity and Correlated Channels with Unknown Covariance in mMTC | | 0 |
| Joint Speech Activity and Overlap Detection with Multi-Exit Architecture | | 0 |
| JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection | | 0 |
| Building Accurate Low Latency ASR for Streaming Voice Search | | 0 |
| Grant-free Massive Random Access with Retransmission: Receiver Optimization and Performance Analysis | | 0 |
| Grant-Free Access via Bilinear Inference for Cell-Free MIMO with Low-Coherent Pilots | | 0 |
| ACT360: An Efficient 360-Degree Action Detection and Summarization Framework for Mission-Critical Training and Debriefing | | 0 |
| GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge | | 0 |
| Budget-Aware Deep Semantic Video Segmentation | | 0 |
| Budget-Aware Activity Detection with A Recurrent Policy Network | | 0 |
| An Efficient Spatio-Temporal Pyramid Transformer for Action Detection | | 0 |
| C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing | | 0 |
| Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection | | 0 |
| Joint Activity Detection and Channel Estimation for Clustered Massive Machine Type Communications | | 0 |
| GateHUB: Gated History Unit with Background Suppression for Online Action Detection | | 0 |
| GTTS-EHU Systems for QUESST at MediaEval 2014 | | 0 |
| An Efficient Algorithm for Device Detection and Channel Estimation in Asynchronous IoT Systems | | 0 |
| Game State and Spatio-temporal Action Detection in Soccer using Graph Neural Networks and 3D Convolutional Networks | | 0 |
| Hand Action Detection from Ego-centric Depth Sequences with Error-correcting Hough Transform | | 0 |
| Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos | | 0 |
| Joint Activity-Delay Detection and Channel Estimation for Asynchronous Massive Random Access | | 0 |
| Bridging the gap between Human Action Recognition and Online Action Detection | | 0 |
| Frequency domain TRINICON-based blind source separation method with multi-source activity detection for sparsely mixed signals | | 0 |
| Hierarchical Graph-RNNs for Action Detection of Multiple Activities | | 0 |
| An Efficient Active Set Algorithm for Covariance Based Joint Data and Activity Detection for Massive Random Access with Massive MIMO | | 0 |
| Hierarchical MTC User Activity Detection and Channel Estimation with Unknown Spatial Covariance | | 0 |
| Hierarchical Self-Attention Network for Action Localization in Videos | | 0 |
| High-speed Low-consumption sEMG-based Transient-state micro-Gesture Recognition | | 0 |
| Fotheidil: an Automatic Transcription System for the Irish Language | | 0 |
| Follow the Attention: Combining Partial Pose and Object Motion for Fine-Grained Action Detection | | 0 |
| Human Attention Detection Using AM-FM Representations | | 0 |
| Boundary-Recovering Network for Temporal Action Detection | | 0 |
| Hybrid Active Learning via Deep Clustering for Video Action Detection | | 0 |
| Self-supervised New Activity Detection in Sensor-based Smart Environments | | 0 |
| Identity-aware Graph Memory Network for Action Detection | | 0 |
| Improvement of Noise-Robust Single-Channel Voice Activity Detection with Spatial Pre-processing | | 0 |
| Improve Temporal Action Proposals using Hierarchical Context | | 0 |
| Improving Action Localization by Progressive Cross-stream Cooperation | | 0 |
| Improving endpoint detection in end-to-end streaming ASR for conversational speech | | 0 |
| Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition | | 0 |
| Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications | | 0 |
| Accelerating temporal action proposal generation via high performance computing | | 0 |
| Improving Transformer-based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads | | 0 |
| An Empirical Study on Activity Recognition in Long Surgical Videos | | 0 |
| Joint Activity-Delay Detection and Channel Estimation for Asynchronous Massive Random Access: A Free Probability Theory Approach | | 0 |

Benchmark Results

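| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STAR/L | Frame-mAP 0.5 | 90.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | | Unverified |
| 4 | HIT | Frame-mAP 0.5 | 84.8 | | Unverified |
| 5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | | Unverified |
| 6 | YOWO | Frame-mAP 0.5 | 80.4 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | | Unverified |
| 8 | MOC | Frame-mAP 0.5 | 77.8 | | Unverified |
| 9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | | Unverified |
| 10 | Two-in-one | Video-mAP 0.2 | 75.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | | Unverified |
| 3 | HIT | Frame-mAP 0.5 | 83.8 | | Unverified |
| 4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | | Unverified |
| 5 | DTS | Video-mAP 0.2 | 76.1 | | Unverified |
| 6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | | Unverified |
| 8 | YOWO | Frame-mAP 0.5 | 74.4 | | Unverified |
| 9 | MOC | Frame-mAP 0.5 | 74 | | Unverified |
| 10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TTM | mAP | 28.79 | | Unverified |
| 2 | CTRN | mAP | 27.8 | | Unverified |
| 3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | | Unverified |
| 4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | | Unverified |
| 5 | PDAN (RGB+Flow) | mAP | 26.5 | | Unverified |
| 6 | PAT | mAP | 26.5 | | Unverified |
| 7 | MS-TCT (RGB only) | mAP | 25.4 | | Unverified |
| 8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | | Unverified |
| 9 | Coarse-Fine Networks | mAP | 25.1 | | Unverified |
| 10 | I3D + biGRU + VS-ST-MPNN | mAP | 23.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MLAD | mAP | 51.5 | | Unverified |
| 2 | CTRN | mAP | 51.2 | | Unverified |
| 3 | PDAN | mAP | 47.6 | | Unverified |
| 4 | TGM | mAP | 46.4 | | Unverified |
| 5 | MS-TCT (RGB only) | mAP | 43.1 | | Unverified |
| 6 | I3D + our super-event | mAP | 36.4 | | Unverified |
| 7 | Two-stream + LSTM | mAP | 28.1 | | Unverified |
| 8 | Two-stream | mAP | 27.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | | Unverified |
| 2 | DTS | Video-mAP 0.2 | 94.3 | | Unverified |
| 3 | Two-in-one | Video-mAP 0.5 | 92.74 | | Unverified |
| 4 | T-CNN | Frame-mAP 0.5 | 86.7 | | Unverified |
| 5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | | Unverified |
| 6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | | Unverified |
| 7 | Action Tubes | Frame-mAP 0.5 | 68.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MAT (Ours) Trans | mAP | 71.6 | | Unverified |
| 2 | TadML-two stream | mAP | 59.7 | | Unverified |
| 3 | MAT (ours) | mAP | 58.2 | | Unverified |
| 4 | TadML-rgb | mAP | 53.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HIT | Frame-mAP 0.5 | 33.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 28.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MS-TCT | Frame-mAP | 33.7 | | Unverified |
| 2 | PDAN | Frame-mAP | 32.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STCNN | IoU | 0.14 | | Unverified |
| 2 | Two Stream Network | IoU | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STCNN-V2 (Vote decision) | IoU | 0.52 | | Unverified |
| 2 | RGB and PRGB | IoU | 0.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PAT | mAP | 44.6 | | Unverified |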