Action Detection

Action detection aims to find both where and when an action occurs in a video clip and to classify which action is taking place. Results are typically given as action tubelets: bounding boxes of the action linked across consecutive frames. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify the action and typically assumes a trimmed video.
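To make the idea of a tubelet concrete, here is a minimal sketch of linking per-frame boxes across time by spatial overlap. It is an illustration only, not the method of any paper listed on this page: the names `box_iou` and `link_tubelet`, the greedy strategy, and the 0.3 overlap threshold are all assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubelet(frames: List[List[Box]], start: Box, min_iou: float = 0.3) -> List[Box]:
    """Greedily extend a tubelet from `start` (a box in frames[0]) through later frames.

    At each frame, take the box that overlaps most with the tubelet's last box;
    stop at the first frame where no box reaches `min_iou` (an assumed threshold).
    """
    tube = [start]
    for boxes in frames[1:]:
        if not boxes:
            break
        best = max(boxes, key=lambda b: box_iou(tube[-1], b))
        if box_iou(tube[-1], best) < min_iou:
            break
        tube.append(best)
    return tube


# Toy per-frame detections: one actor drifting right, plus unrelated boxes.
frames = [
    [(10, 10, 50, 50)],
    [(12, 11, 52, 51), (200, 200, 240, 240)],
    [(15, 12, 55, 52)],
    [(300, 300, 340, 340)],
]
tube = link_tubelet(frames, frames[0][0])
print(len(tube), tube[-1])
```

In this toy input the tubelet covers the first three frames (the "where" is the box in each frame, the "when" is the frame range) and ends once the only remaining box no longer overlaps the track. A full detector would also attach a class label and a score to each tubelet.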

Papers


Showing 401–450 of 817 papers

Title	Date	Tasks	Status
A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments	Oct 6, 2020	Action Detection, Activity Detection	Unverified
Automated speech tools for helping communities process restricted-access corpora for language revival efforts	Apr 15, 2022	Action Detection, Activity Detection	Unverified
Automatic Speech Recognition for Hindi	Jun 26, 2024	Action Detection, Activity Detection	Unverified
BC-VAD: A Robust Bone Conduction Voice Activity Detection	Dec 6, 2022	Action Detection, Activity Detection	Unverified
Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV	Mar 23, 2023	Action Detection, Activity Detection	Unverified
Beyond Pixels: Leveraging the Language of Soccer to Improve Spatio-Temporal Action Detection in Broadcast Videos	May 14, 2025	Action Detection, Decoder	Unverified
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories	Apr 2, 2021	Action Detection, Action Recognition	Unverified
Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation	Apr 23, 2021	Action Detection, Activity Detection	Unverified
Binary Image Skeletonization Using 2-Stage U-Net	Dec 22, 2021	Action Detection, Activity Detection	Unverified
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024	Jun 24, 2024	Action Detection, Activity Detection	Unverified
Blind User Activity Detection for Grant-Free Random Access in Cell-Free mMIMO Networks	Aug 5, 2024	Action Detection, Activity Detection	Unverified
BLP -- Boundary Likelihood Pinpointing Networks for Accurate Temporal Action Localization	Nov 6, 2018	Action Detection, Action Localization	Unverified
Bodily Behaviors in Social Interaction: Novel Annotations and State-of-the-Art Evaluation	Jul 26, 2022	Action Detection, Descriptive	Unverified
Boundary Content Graph Neural Network for Temporal Action Proposal Generation	Aug 4, 2020	Action Detection, Action Understanding	Unverified
Boundary-Recovering Network for Temporal Action Detection	Aug 18, 2024	Action Detection, object-detection	Unverified
Bridging the gap between Human Action Recognition and Online Action Detection	Jan 21, 2021	Action Detection, Action Recognition	Unverified
Budget-Aware Activity Detection with A Recurrent Policy Network	Nov 30, 2017	Action Detection, Activity Detection	Unverified
Budget-Aware Deep Semantic Video Segmentation	Jul 1, 2017	Action Detection, Activity Detection	Unverified
Building Accurate Low Latency ASR for Streaming Voice Search	May 29, 2023	Action Detection, Activity Detection	Unverified
C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing	Mar 20, 2018	Action Detection, Activity Detection	Unverified
CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors	Mar 4, 2025	Action Detection, Activity Detection	Unverified
Cascaded Boundary Regression for Temporal Action Detection	May 2, 2017	Action Detection, regression	Unverified
CBF-AFA: Chunk-Based Multi-SSL Fusion for Automatic Fluency Assessment	Jun 25, 2025	Action Detection, Activity Detection	Unverified
CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization	Aug 19, 2020	Action Detection, Action Localization	Unverified
Channel-Combination Algorithms for Robust Distant Voice Activity and Overlapped Speech Detection	Feb 13, 2024	Action Detection, Activity Detection	Unverified
Self-supervised New Activity Detection in Sensor-based Smart Environments	Jan 17, 2024	Action Detection, Activity Detection	Unverified
Classification Matters: Improving Video Action Detection with Class-Specific Attention	Jul 29, 2024	Action Detection, Classification	Unverified
Class Semantics-based Attention for Action Detection	Sep 6, 2021	Action Detection, Action Localization	Unverified
CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection	Oct 18, 2024	Action Detection, Activity Detection	Unverified
COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis	Mar 7, 2019	Action Detection	Unverified
Combatting Human Trafficking in the Cyberspace: A Natural Language Processing-Based Methodology to Analyze the Language in Online Advertisements	Nov 22, 2023	Action Detection, Activity Detection	Unverified
Combination of Deep Speaker Embeddings for Diarisation	Oct 22, 2020	Action Detection, Activity Detection	Unverified
Comparative Analysis of Deep Learning Approaches for Harmful Brain Activity Detection Using EEG	Dec 10, 2024	Action Detection, Activity Detection	Unverified
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness	Jun 12, 2024	Action Detection, Activity Detection	Unverified
Compositional Structure Learning for Action Understanding	Oct 21, 2014	Action Detection, Action Understanding	Unverified
Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation	Mar 20, 2020	Action Detection	Unverified
Computational Graph Approach for Detection of Composite Human Activities	Dec 5, 2018	Action Detection, Activity Detection	Unverified
Computer-Aided Automated Detection of Gene-Controlled Social Actions of Drosophila	Sep 11, 2019	Action Detection, Classification	Unverified
Context-aware Proposal Network for Temporal Action Detection	Jun 18, 2022	Action Classification, Action Detection	Unverified
ContextDet: Temporal Action Detection with Adaptive Context Aggregation	Oct 20, 2024	Action Detection, Video Understanding	Unverified
Context-LSTM: a robust classifier for video detection on UCF101	Mar 13, 2022	Action Detection, Action Recognition	Unverified
Contextual Multi-Scale Region Convolutional 3D Network for Activity Detection	Jan 28, 2018	Action Detection, Activity Detection	Unverified
Context Understanding in Computer Vision: A Survey	Feb 10, 2023	Action Detection, image-classification	Unverified
Continual Low-Rank Scaled Dot-product Attention	Dec 4, 2024	Action Detection, Audio Classification	Unverified
Continuous Human Action Detection Based on Wearable Inertial Data	Dec 11, 2021	Action Detection, Gesture Recognition	Unverified
Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection	Jul 16, 2018	Action Detection, Object	Unverified
Cooperative Multi-Cell Massive Access with Temporally Correlated Activity	Apr 19, 2023	Action Detection, Activity Detection	Unverified
Cross-Channel Attention-Based Target Speaker Voice Activity Detection: Experimental Results for M2MeT Challenge	Feb 6, 2022	Action Detection, Activity Detection	Unverified
Cross-domain Voice Activity Detection with Self-Supervised Representations	Sep 22, 2022	Action Detection, Activity Detection	Unverified
Cross modal video representations for weakly supervised active speaker localization	Mar 9, 2020	Action Detection, Active Speaker Localization	Unverified


Datasets: UCF101-24, J-HMDB, Charades, Multi-THUMOS, UCF Sports, THUMOS'14, MultiSports, TSU, TTStroke-21 ME21, TTStroke-21 ME22, MultiTHUMOS

Benchmark Results

UCF101-24

#	Model	Metric	Claimed	Verified	Status
1	STAR/L	Frame-mAP 0.5	90.3	—	Unverified
2	SiA	Frame-mAP 0.5	88.5	—	Unverified
3	YOWO + LFB	Frame-mAP 0.5	87.3	—	Unverified
4	HIT	Frame-mAP 0.5	84.8	—	Unverified
5	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	82.3	—	Unverified
6	YOWO	Frame-mAP 0.5	80.4	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.2	78.48	—	Unverified
8	MOC	Frame-mAP 0.5	77.8	—	Unverified
9	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	76.3	—	Unverified
10	Two-in-one	Video-mAP 0.2	75.48	—	Unverified

J-HMDB

#	Model	Metric	Claimed	Verified	Status
1	SiA	Frame-mAP 0.5	88.5	—	Unverified
2	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	87.59	—	Unverified
3	HIT	Frame-mAP 0.5	83.8	—	Unverified
4	HISAN (VGG-16)	Frame-mAP 0.5	76.72	—	Unverified
5	DTS	Video-mAP 0.2	76.1	—	Unverified
6	YOWO + LFB	Frame-mAP 0.5	75.7	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.5	74.74	—	Unverified
8	YOWO	Frame-mAP 0.5	74.4	—	Unverified
9	MOC	Frame-mAP 0.5	74	—	Unverified
10	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	73.3	—	Unverified

Charades

#	Model	Metric	Claimed	Verified	Status
1	TTM	mAP	28.79	—	Unverified
2	CTRN	mAP	27.8	—	Unverified
3	Coarse-Fine Networks (w/ self-supervised detection pretraining)	mAP	26.95	—	Unverified
4	UniMD+Sync. (RGB+Flow)	mAP	26.53	—	Unverified
5	PDAN (RGB+Flow)	mAP	26.5	—	Unverified
6	PAT	mAP	26.5	—	Unverified
7	MS-TCT (RGB only)	mAP	25.4	—	Unverified
8	3D ResNet-50 + super-events pretrained on AViD	mAP	25.2	—	Unverified
9	Coarse-Fine Networks	mAP	25.1	—	Unverified
10	MLAD (RGB + Flow)	mAP	23.7	—	Unverified

Multi-THUMOS

#	Model	Metric	Claimed	Verified	Status
1	MLAD	mAP	51.5	—	Unverified
2	CTRN	mAP	51.2	—	Unverified
3	PDAN	mAP	47.6	—	Unverified
4	TGM	mAP	46.4	—	Unverified
5	MS-TCT (RGB only)	mAP	43.1	—	Unverified
6	I3D + our super-event	mAP	36.4	—	Unverified
7	Two-stream + LSTM	mAP	28.1	—	Unverified
8	Two-stream	mAP	27.6	—	Unverified

UCF Sports

#	Model	Metric	Claimed	Verified	Status
1	Two-in-one Two Stream	Video-mAP 0.5	96.52	—	Unverified
2	DTS	Video-mAP 0.2	94.3	—	Unverified
3	Two-in-one	Video-mAP 0.5	92.74	—	Unverified
4	T-CNN	Frame-mAP 0.5	86.7	—	Unverified
5	MR-TS R-CNN	Frame-mAP 0.5	84.52	—	Unverified
6	TS R-CNN	Frame-mAP 0.5	82.3	—	Unverified
7	Action Tubes	Frame-mAP 0.5	68.1	—	Unverified

THUMOS'14

#	Model	Metric	Claimed	Verified	Status
1	MAT (Ours) Trans	mAP	71.6	—	Unverified
2	TadML-two stream	mAP	59.7	—	Unverified
3	MAT (ours)	mAP	58.2	—	Unverified
4	TadML-rgb	mAP	53.46	—	Unverified

MultiSports

#	Model	Metric	Claimed	Verified	Status
1	HIT	Frame-mAP 0.5	33.3	—	Unverified
2	SiA	Frame-mAP 0.5	28.8	—	Unverified

TSU

#	Model	Metric	Claimed	Verified	Status
1	MS-TCT	Frame-mAP	33.7	—	Unverified
2	PDAN	Frame-mAP	32.7	—	Unverified

TTStroke-21 ME21

#	Model	Metric	Claimed	Verified	Status
1	STCNN	IoU	0.14	—	Unverified
2	Two Stream Network	IoU	0.07	—	Unverified

TTStroke-21 ME22

#	Model	Metric	Claimed	Verified	Status
1	STCNN-V2 (Vote decision)	IoU	0.52	—	Unverified
2	RGB and PRGB	IoU	0.35	—	Unverified

MultiTHUMOS

#	Model	Metric	Claimed	Verified	Status
1	PAT	mAP	44.6	—	Unverified