SOTAVerified

Action Detection

Action Detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. This is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
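To make the "where and when" framing concrete, the sketch below represents a tubelet as a mapping from frame index to bounding box and scores the overlap between a predicted and a ground-truth tubelet. The overlap definition used here (temporal IoU of the frame sets multiplied by the mean box IoU over shared frames) is one common convention and is an assumption, not something this page specifies; the names `Box`, `Tubelet`, `box_iou`, and `tubelet_iou` are hypothetical.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates
Tubelet = Dict[int, Box]                 # frame index -> box in that frame


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tubelet_iou(pred: Tubelet, truth: Tubelet) -> float:
    """Temporal IoU of the two frame sets times mean box IoU on shared frames.

    This is an assumed definition chosen for illustration.
    """
    shared = pred.keys() & truth.keys()
    if not shared:
        return 0.0
    temporal = len(shared) / len(pred.keys() | truth.keys())
    spatial = sum(box_iou(pred[f], truth[f]) for f in shared) / len(shared)
    return temporal * spatial


# Same box in every frame, but the prediction starts and ends 5 frames late.
truth = {f: (10.0, 10.0, 50.0, 50.0) for f in range(0, 10)}
pred = {f: (10.0, 10.0, 50.0, 50.0) for f in range(5, 15)}
print(tubelet_iou(pred, truth))
```

In this example the boxes match perfectly on the shared frames, so the score is penalized only for the temporal misalignment. Temporal localization would look only at the frame ranges, and action recognition would ignore both the boxes and the ranges.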

Papers

Showing 551-600 of 817 papers

| Title | Status | Hype |
| --- | --- | --- |
| The "Sound of Silence" in EEG -- Cognitive voice activity detection | | 0 |
| Online Action Detection in Streaming Videos with Time Buffers | | 0 |
| A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments | | 0 |
| Grant-Free Access via Bilinear Inference for Cell-Free MIMO with Low-Coherent Pilots | | 0 |
| Learning Visual Voice Activity Detection with an Automatically Annotated Dataset | | 0 |
| TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval | | 0 |
| On Multitask Loss Function for Audio Event Detection and Localization | | 0 |
| Massive Machine Type Communication Pilot-Hopping Sequence Detection Architectures Based on Non-Negative Least Squares for Grant-Free Random Access | | 0 |
| Online Spatiotemporal Action Detection and Prediction via Causal Representations | Code | 0 |
| Finding Action Tubes with a Sparse-to-Dense Framework | | 0 |
| RespVAD: Voice Activity Detection via Video-Extracted Respiration Patterns | Code | 0 |
| SegCodeNet: Color-Coded Segmentation Masks for Activity Detection from Wearable Cameras | | 0 |
| CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization | | 0 |
| MLNET: An Adaptive Multiple Receptive-field Attention Neural Network for Voice Activity Detection | | 0 |
| A Multi-Task Learning Approach for Human Activity Segmentation and Ergonomics Risk Assessment | Code | 0 |
| Multi-Level Temporal Pyramid Network for Action Detection | | 0 |
| Jointly Sparse Signal Recovery and Support Recovery via Deep Learning with Applications in MIMO-based Grant-Free Random Access | | 0 |
| "This is Houston. Say again, please". The Behavox system for the Apollo-11 Fearless Steps Challenge (phase II) | | 0 |
| Boundary Content Graph Neural Network for Temporal Action Proposal Generation | | 0 |
| Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition | | 0 |
| Statistical and Neural Network Based Speech Activity Detection in Non-Stationary Acoustic Environments | | 0 |
| Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos | | 0 |
| The AFRL IWSLT 2020 Systems: Work-From-Home Edition | | 0 |
| Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations | | 0 |
| The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge | | 0 |
| ESAD: Endoscopic Surgeon Action Detection Dataset | | 0 |
| Distributed Optimization for Massive Connectivity | | 0 |
| WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos | | 0 |
| Speaker and Posture Classification using Instantaneous Intraspeech Breathing Features | | 0 |
| Real-Time Radar-Based Gesture Detection and Recognition Built in an Edge-Computing Platform | | 0 |
| Siamese Neural Networks for Class Activity Detection | | 0 |
| Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario | | 0 |
| Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention | | 0 |
| Spatio-Temporal Event Segmentation and Localization for Wildlife Extended Videos | | 0 |
| The SAFE-T Corpus: A New Resource for Simulated Public Safety Communications | | 0 |
| Semi-supervised Acoustic Modelling for Five-lingual Code-switched ASR using Automatically-segmented Soap Opera Speech | | 0 |
| Activity Detection from Wearable Electromyogram Sensors using Hidden Markov Model | | 0 |
| Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos | | 0 |
| TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition | | 0 |
| Group Activity Detection from Trajectory and Video Data in Soccer | | 0 |
| ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos | | 0 |
| Semi-supervised acoustic modelling for five-lingual code-switched ASR using automatically-segmented soap opera speech | | 0 |
| Progressive Boundary Refinement Network for Temporal Action Detection | | 0 |
| Two-Stream AMTnet for Action Detection | Code | 0 |
| Temporarily-Aware Context Modelling using Generative Adversarial Networks for Speech Activity Detection | | 0 |
| Spatio-Temporal Action Detection with Multi-Object Interaction | | 0 |
| Revisiting Few-shot Activity Detection with Class Similarity Control | | 0 |
| Long Short-Term Relation Networks for Video Action Detection | | 0 |
| Dual Attention in Time and Frequency Domain for Voice Activity Detection | Code | 0 |
| Rethinking Online Action Detection in Untrimmed Videos: A Novel Online Evaluation Protocol | Code | 0 |

Benchmark Results

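| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STAR/L | Frame-mAP 0.5 | 90.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | | Unverified |
| 4 | HIT | Frame-mAP 0.5 | 84.8 | | Unverified |
| 5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | | Unverified |
| 6 | YOWO | Frame-mAP 0.5 | 80.4 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | | Unverified |
| 8 | MOC | Frame-mAP 0.5 | 77.8 | | Unverified |
| 9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | | Unverified |
| 10 | Two-in-one | Video-mAP 0.2 | 75.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified |
| 2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | | Unverified |
| 3 | HIT | Frame-mAP 0.5 | 83.8 | | Unverified |
| 4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | | Unverified |
| 5 | DTS | Video-mAP 0.2 | 76.1 | | Unverified |
| 6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | | Unverified |
| 7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | | Unverified |
| 8 | YOWO | Frame-mAP 0.5 | 74.4 | | Unverified |
| 9 | MOC | Frame-mAP 0.5 | 74 | | Unverified |
| 10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TTM | mAP | 28.79 | | Unverified |
| 2 | CTRN | mAP | 27.8 | | Unverified |
| 3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | | Unverified |
| 4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | | Unverified |
| 5 | PDAN (RGB+Flow) | mAP | 26.5 | | Unverified |
| 6 | PAT | mAP | 26.5 | | Unverified |
| 7 | MS-TCT (RGB only) | mAP | 25.4 | | Unverified |
| 8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | | Unverified |
| 9 | Coarse-Fine Networks | mAP | 25.1 | | Unverified |
| 10 | MLAD (RGB + Flow) | mAP | 23.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MLAD | mAP | 51.5 | | Unverified |
| 2 | CTRN | mAP | 51.2 | | Unverified |
| 3 | PDAN | mAP | 47.6 | | Unverified |
| 4 | TGM | mAP | 46.4 | | Unverified |
| 5 | MS-TCT (RGB only) | mAP | 43.1 | | Unverified |
| 6 | I3D + our super-event | mAP | 36.4 | | Unverified |
| 7 | Two-stream + LSTM | mAP | 28.1 | | Unverified |
| 8 | Two-stream | mAP | 27.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | | Unverified |
| 2 | DTS | Video-mAP 0.2 | 94.3 | | Unverified |
| 3 | Two-in-one | Video-mAP 0.5 | 92.74 | | Unverified |
| 4 | T-CNN | Frame-mAP 0.5 | 86.7 | | Unverified |
| 5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | | Unverified |
| 6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | | Unverified |
| 7 | Action Tubes | Frame-mAP 0.5 | 68.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MAT (Ours) Trans | mAP | 71.6 | | Unverified |
| 2 | TadML-two stream | mAP | 59.7 | | Unverified |
| 3 | MAT (ours) | mAP | 58.2 | | Unverified |
| 4 | TadML-rgb | mAP | 53.46 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | HIT | Frame-mAP 0.5 | 33.3 | | Unverified |
| 2 | SiA | Frame-mAP 0.5 | 28.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MS-TCT | Frame-mAP | 33.7 | | Unverified |
| 2 | PDAN | Frame-mAP | 32.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STCNN | IoU | 0.14 | | Unverified |
| 2 | Two Stream Network | IoU | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | STCNN-V2 (Vote decision) | IoU | 0.52 | | Unverified |
| 2 | RGB and PRGB | IoU | 0.35 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PAT | mAP | 44.6 | | Unverified |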