SOTAVerified

Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
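As a rough illustration of the definition above, the sketch below models an action tubelet as a label plus one bounding box per frame, and scores the overlap between two tubelets. The names `Tubelet`, `box_iou`, and `tube_iou` are hypothetical, and the overlap measure (temporal IoU multiplied by the mean per-frame box IoU over shared frames) is only one common convention, assumed here rather than taken from this page.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates


@dataclass
class Tubelet:
    """Hypothetical container: one action label and one box per frame index."""
    label: str
    boxes: Dict[int, Box]

    @property
    def start(self) -> int:
        return min(self.boxes)

    @property
    def end(self) -> int:
        return max(self.boxes)


def box_iou(a: Box, b: Box) -> float:
    """Spatial intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tube_iou(p: Tubelet, q: Tubelet) -> float:
    """Temporal IoU times mean spatial IoU over shared frames (assumed convention)."""
    shared = sorted(set(p.boxes) & set(q.boxes))
    if not shared:
        return 0.0
    all_frames = set(p.boxes) | set(q.boxes)
    temporal = len(shared) / len(all_frames)
    spatial = sum(box_iou(p.boxes[f], q.boxes[f]) for f in shared) / len(shared)
    return temporal * spatial


# Toy data: same box in every frame, but the prediction starts 5 frames late.
gt = Tubelet("run", {f: (0.0, 0.0, 10.0, 10.0) for f in range(0, 10)})
pred = Tubelet("run", {f: (0.0, 0.0, 10.0, 10.0) for f in range(5, 15)})

detection = (pred.label, pred.boxes)      # action detection: what, where, and when
localization = (pred.start, pred.end)     # temporal localization: start and end frame only
recognition = pred.label                  # action recognition: label only
overlap = tube_iou(gt, pred)
```

In this toy case the two tubelets share 5 of 15 frames and agree perfectly on those frames, so the overlap is one third. The three variables at the end show how the related tasks keep progressively less of the same prediction.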

Papers

Showing 251-300 of 817 papers

Title | Status | Hype
EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos with Procedural Texts | | 0
EML Online Speech Activity Detection for the Fearless Steps Challenge Phase-III | | 0
EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context. | | 0
A Hybrid Graph Network for Complex Activity Detection in Video | | 0
ContextDet: Temporal Action Detection with Adaptive Context Aggregation | | 0
A Customer Level Fraudulent Activity Detection Benchmark for Enhancing Machine Learning Model Research and Evaluation | | 0
Context-aware Proposal Network for Temporal Action Detection | | 0
Computer-Aided Automated Detection of Gene-Controlled Social Actions of Drosophila | | 0
An Ultra-low Power RNN Classifier for Always-On Voice Wake-Up Detection Robust to Real-World Scenarios | | 0
Computational Graph Approach for Detection of Composite Human Activities | | 0
A Novel Online Action Detection Framework from Untrimmed Video Streams | | 0
Action Detection from a Robot-Car Perspective | | 0
Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation | | 0
Compositional Structure Learning for Action Understanding | | 0
A Novel Approach for Robust Multi Human Action Recognition and Summarization based on 3D Convolutional Neural Networks | | 0
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness | | 0
Comparative Analysis of Deep Learning Approaches for Harmful Brain Activity Detection Using EEG | | 0
A Novel Approach for Human Action Recognition from Silhouette Images | | 0
Combination of Deep Speaker Embeddings for Diarisation | | 0
A Nonparametric Model for Multimodal Collaborative Activities Summarization | | 0
Combatting Human Trafficking in the Cyberspace: A Natural Language Processing-Based Methodology to Analyze the Language in Online Advertisements | | 0
Anomalous Sound Detection Based on Machine Activity Detection | | 0
Action Detection by Implicit Intentional Motion Clustering | | 0
Accelerating Coordinate Descent via Active Set Selection for Device Activity Detection for Multi-Cell Massive Random Access | | 0
Hierarchical Graph-RNNs for Action Detection of Multiple Activities | | 0
COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis | | 0
Anomalous Event Recognition in Videos Based on Joint Learning of Motion and Appearance with Multiple Ranking Measures | | 0
A Boosting Algorithm for Positive-Unlabeled Learning | | 0
CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection | | 0
Class Semantics-based Attention for Action Detection | | 0
AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming | | 0
3rd party observer gaze as a continuous measure of dialogue flow | | 0
Classification Matters: Improving Video Action Detection with Class-Specific Attention | | 0
Self-supervised New Activity Detection in Sensor-based Smart Environments | | 0
A new network-based algorithm for human activity recognition in video | | 0
An Ensemble SVM-based Approach for Voice Activity Detection | | 0
ACT-Net: Anchor-context Action Detection in Surgery Videos | | 0
Hardware Accelerator and Neural Network Co-Optimization for Ultra-Low-Power Audio Processing Devices | | 0
Channel-Combination Algorithms for Robust Distant Voice Activity and Overlapped Speech Detection | | 0
CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization | | 0
An enhanced system for the detection and active cancellation of snoring signals | | 0
An end-to-end (deep) neural network applied to raw EEG, fNIRs and body motion data for data fusion and BCI classification task without any pre-/post-processing | | 0
Activity Recognition with Moving Cameras and Few Training Examples: Applications for Detection of Autism-Related Headbanging | | 0
CBF-AFA: Chunk-Based Multi-SSL Fusion for Automatic Fluency Assessment | | 0
Cascaded Boundary Regression for Temporal Action Detection | | 0
CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors | | 0
An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos | | 0
Hand Action Detection from Ego-centric Depth Sequences with Error-correcting Hough Transform | | 0
C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing | | 0
Building Accurate Low Latency ASR for Streaming Voice Search | | 0
Page 6 of 17

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | STAR/L | Frame-mAP 0.5 | 90.3 | | Unverified
2 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified
3 | YOWO + LFB | Frame-mAP 0.5 | 87.3 | | Unverified
4 | HIT | Frame-mAP 0.5 | 84.8 | | Unverified
5 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 82.3 | | Unverified
6 | YOWO | Frame-mAP 0.5 | 80.4 | | Unverified
7 | Two-in-one Two Stream | Video-mAP 0.2 | 78.48 | | Unverified
8 | MOC | Frame-mAP 0.5 | 77.8 | | Unverified
9 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 76.3 | | Unverified
10 | Two-in-one | Video-mAP 0.2 | 75.48 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SiA | Frame-mAP 0.5 | 88.5 | | Unverified
2 | HISAN (ResNet-101 + FPN) | Video-mAP 0.2 | 87.59 | | Unverified
3 | HIT | Frame-mAP 0.5 | 83.8 | | Unverified
4 | HISAN (VGG-16) | Frame-mAP 0.5 | 76.72 | | Unverified
5 | DTS | Video-mAP 0.2 | 76.1 | | Unverified
6 | YOWO + LFB | Frame-mAP 0.5 | 75.7 | | Unverified
7 | Two-in-one Two Stream | Video-mAP 0.5 | 74.74 | | Unverified
8 | YOWO | Frame-mAP 0.5 | 74.4 | | Unverified
9 | MOC | Frame-mAP 0.5 | 74 | | Unverified
10 | Faster-RCNN + two-stream I3D conv | Frame-mAP 0.5 | 73.3 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TTM | mAP | 28.79 | | Unverified
2 | CTRN | mAP | 27.8 | | Unverified
3 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | mAP | 26.95 | | Unverified
4 | UniMD+Sync. (RGB+Flow) | mAP | 26.53 | | Unverified
5 | PDAN (RGB+Flow) | mAP | 26.5 | | Unverified
6 | PAT | mAP | 26.5 | | Unverified
7 | MS-TCT (RGB only) | mAP | 25.4 | | Unverified
8 | 3D ResNet-50 + super-events pretrained on AViD | mAP | 25.2 | | Unverified
9 | Coarse-Fine Networks | mAP | 25.1 | | Unverified
10 | MLAD (RGB + Flow) | mAP | 23.7 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | MLAD | mAP | 51.5 | | Unverified
2 | CTRN | mAP | 51.2 | | Unverified
3 | PDAN | mAP | 47.6 | | Unverified
4 | TGM | mAP | 46.4 | | Unverified
5 | MS-TCT (RGB only) | mAP | 43.1 | | Unverified
6 | I3D + our super-event | mAP | 36.4 | | Unverified
7 | Two-stream + LSTM | mAP | 28.1 | | Unverified
8 | Two-stream | mAP | 27.6 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Two-in-one Two Stream | Video-mAP 0.5 | 96.52 | | Unverified
2 | DTS | Video-mAP 0.2 | 94.3 | | Unverified
3 | Two-in-one | Video-mAP 0.5 | 92.74 | | Unverified
4 | T-CNN | Frame-mAP 0.5 | 86.7 | | Unverified
5 | MR-TS R-CNN | Frame-mAP 0.5 | 84.52 | | Unverified
6 | TS R-CNN | Frame-mAP 0.5 | 82.3 | | Unverified
7 | Action Tubes | Frame-mAP 0.5 | 68.1 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | MAT (Ours) Trans | mAP | 71.6 | | Unverified
2 | TadML-two stream | mAP | 59.7 | | Unverified
3 | MAT (ours) | mAP | 58.2 | | Unverified
4 | TadML-rgb | mAP | 53.46 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | HIT | Frame-mAP 0.5 | 33.3 | | Unverified
2 | SiA | Frame-mAP 0.5 | 28.8 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | MS-TCT | Frame-mAP | 33.7 | | Unverified
2 | PDAN | Frame-mAP | 32.7 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | STCNN | IoU | 0.14 | | Unverified
2 | Two Stream Network | IoU | 0.07 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | STCNN-V2 (Vote decision) | IoU | 0.52 | | Unverified
2 | RGB and PRGB | IoU | 0.35 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | PAT | mAP | 44.6 | | Unverified