Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
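To make the idea of a tubelet concrete, here is a minimal Python sketch of one way to represent a tubelet and compare a prediction against ground truth. The `Tubelet` class and all function names are hypothetical, chosen for illustration only. The overlap measure (temporal IoU multiplied by the mean spatial IoU over shared frames) is one common convention for video-level evaluation, not necessarily the exact protocol behind any result listed on this page.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Tubelet:
    """Hypothetical container: one action instance as per-frame boxes."""
    label: str
    boxes: Dict[int, Box]  # frame index -> bounding box


def box_iou(a: Box, b: Box) -> float:
    """Spatial intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def temporal_iou(a: Tubelet, b: Tubelet) -> float:
    """Overlap of the two tubelets' frame sets (the 'when')."""
    fa, fb = set(a.boxes), set(b.boxes)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


def spatiotemporal_iou(a: Tubelet, b: Tubelet) -> float:
    """Temporal IoU times mean spatial IoU over the shared frames."""
    shared = set(a.boxes) & set(b.boxes)
    if not shared:
        return 0.0
    mean_spatial = sum(box_iou(a.boxes[f], b.boxes[f]) for f in shared) / len(shared)
    return temporal_iou(a, b) * mean_spatial


# Illustrative data: identical boxes, but the prediction starts 5 frames early.
pred = Tubelet("run", {f: (0.0, 0.0, 10.0, 10.0) for f in range(0, 10)})
gt = Tubelet("run", {f: (0.0, 0.0, 10.0, 10.0) for f in range(5, 15)})
score = spatiotemporal_iou(pred, gt)
```

Under this convention a detection would count as correct when its label matches and `score` reaches the chosen threshold, such as the 0.2 or 0.5 that appears in the Video-mAP metric names. Frame-level metrics instead compare boxes one frame at a time, using only the spatial IoU.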

Papers


Showing 151–175 of 817 papers

Title	Date	Tasks	Status
ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos	Apr 15, 2020	Action Detection, Action Spotting	Unverified
A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments	Oct 6, 2020	Action Detection, Activity Detection	Unverified
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation	Mar 30, 2021	Action Detection, Temporal Action Proposal Generation	Unverified
A Grammatical Compositional Model for Video Action Detection	Oct 4, 2023	Action Detection, Human Dynamics	Unverified
Attention Is Not Always the Answer: Optimizing Voice Activity Detection with Simple Feature Fusion	Jun 2, 2025	Action Detection, Activity Detection	Unverified
Aggressive actions and anger detection from multiple modalities using Kinect	Jul 5, 2016	Action Detection	Unverified
A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities	Sep 15, 2024	Action Detection, Activity Detection	Unverified
Cross-Channel Attention-Based Target Speaker Voice Activity Detection: Experimental Results for M2MeT Challenge	Feb 6, 2022	Action Detection, Activity Detection	Unverified
Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures	Jul 21, 2019	Action Detection, General Classification	Unverified
ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding	Apr 17, 2023	Action Detection, Action Recognition	Unverified
AFO-TAD: Anchor-free One-Stage Detector for Temporal Action Detection	Oct 18, 2019	Action Detection, object-detection	Unverified
A Time-Frequency based Suspicious Activity Detection for Anti-Money Laundering	Nov 17, 2020	Action Detection, Activity Detection	Unverified
A Temporal Simulator for Developing Turn-Taking Methods for Spoken Dialogue Systems	Jul 1, 2012	Action Detection, Speech Recognition	Unverified
Continuous Human Action Detection Based on Wearable Inertial Data	Dec 11, 2021	Action Detection, Gesture Recognition	Unverified
Asynchronous Random Access in Massive MIMO Systems Facilitated by the Delay-Angle Domain	Dec 6, 2024	Action Detection, Activity Detection	Unverified
A Flexible Framework for Grant-Free Random Access in Cell-Free Massive MIMO Systems	Nov 14, 2024	Action Detection, Activity Detection	Unverified
A Survey on Recent Advances of Computer Vision Algorithms for Egocentric Video	Jan 12, 2015	Action Detection, Activity Detection	Unverified
A Survey on Deep Learning-based Spatio-temporal Action Detection	Aug 3, 2023	Action Detection, Autonomous Driving	Unverified
A Circular Window-based Cascade Transformer for Online Action Detection	Aug 30, 2022	Action Detection, Action Segmentation	Unverified
Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection	Jul 16, 2018	Action Detection, Object	Unverified
Cross-domain Voice Activity Detection with Self-Supervised Representations	Sep 22, 2022	Action Detection, Activity Detection	Unverified
A Study on Action Detection in the Wild	Apr 29, 2019	Action Detection	Unverified
A Structured Model For Action Detection	Dec 9, 2018	Action Detection, model	Unverified
A Stronger Baseline for Ego-Centric Action Detection	Jun 13, 2021	Action Detection, Video Action Detection	Unverified
whu-nercms at trecvid2021:instance search task	Oct 30, 2021	Action Detection, Face Detection	Unverified


Datasets: UCF101-24, J-HMDB, Charades, Multi-THUMOS, UCF Sports, THUMOS'14, MultiSports, TSU, TTStroke-21 ME21, TTStroke-21 ME22, MultiTHUMOS

Benchmark Results

UCF101-24

#	Model	Metric	Claimed	Verified	Status
1	STAR/L	Frame-mAP 0.5	90.3	—	Unverified
2	SiA	Frame-mAP 0.5	88.5	—	Unverified
3	YOWO + LFB	Frame-mAP 0.5	87.3	—	Unverified
4	HIT	Frame-mAP 0.5	84.8	—	Unverified
5	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	82.3	—	Unverified
6	YOWO	Frame-mAP 0.5	80.4	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.2	78.48	—	Unverified
8	MOC	Frame-mAP 0.5	77.8	—	Unverified
9	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	76.3	—	Unverified
10	Two-in-one	Video-mAP 0.2	75.48	—	Unverified

J-HMDB

#	Model	Metric	Claimed	Verified	Status
1	SiA	Frame-mAP 0.5	88.5	—	Unverified
2	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	87.59	—	Unverified
3	HIT	Frame-mAP 0.5	83.8	—	Unverified
4	HISAN (VGG-16)	Frame-mAP 0.5	76.72	—	Unverified
5	DTS	Video-mAP 0.2	76.1	—	Unverified
6	YOWO + LFB	Frame-mAP 0.5	75.7	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.5	74.74	—	Unverified
8	YOWO	Frame-mAP 0.5	74.4	—	Unverified
9	MOC	Frame-mAP 0.5	74	—	Unverified
10	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	73.3	—	Unverified

Charades

#	Model	Metric	Claimed	Verified	Status
1	TTM	mAP	28.79	—	Unverified
2	CTRN	mAP	27.8	—	Unverified
3	Coarse-Fine Networks (w/ self-supervised detection pretraining)	mAP	26.95	—	Unverified
4	UniMD+Sync. (RGB+Flow)	mAP	26.53	—	Unverified
5	PDAN (RGB+Flow)	mAP	26.5	—	Unverified
6	PAT	mAP	26.5	—	Unverified
7	MS-TCT (RGB only)	mAP	25.4	—	Unverified
8	3D ResNet-50 + super-events pretrained on AViD	mAP	25.2	—	Unverified
9	Coarse-Fine Networks	mAP	25.1	—	Unverified
10	MLAD (RGB + Flow)	mAP	23.7	—	Unverified

Multi-THUMOS

#	Model	Metric	Claimed	Verified	Status
1	MLAD	mAP	51.5	—	Unverified
2	CTRN	mAP	51.2	—	Unverified
3	PDAN	mAP	47.6	—	Unverified
4	TGM	mAP	46.4	—	Unverified
5	MS-TCT (RGB only)	mAP	43.1	—	Unverified
6	I3D + our super-event	mAP	36.4	—	Unverified
7	Two-stream + LSTM	mAP	28.1	—	Unverified
8	Two-stream	mAP	27.6	—	Unverified

UCF Sports

#	Model	Metric	Claimed	Verified	Status
1	Two-in-one Two Stream	Video-mAP 0.5	96.52	—	Unverified
2	DTS	Video-mAP 0.2	94.3	—	Unverified
3	Two-in-one	Video-mAP 0.5	92.74	—	Unverified
4	T-CNN	Frame-mAP 0.5	86.7	—	Unverified
5	MR-TS R-CNN	Frame-mAP 0.5	84.52	—	Unverified
6	TS R-CNN	Frame-mAP 0.5	82.3	—	Unverified
7	Action Tubes	Frame-mAP 0.5	68.1	—	Unverified

THUMOS'14

#	Model	Metric	Claimed	Verified	Status
1	MAT (Ours) Trans	mAP	71.6	—	Unverified
2	TadML-two stream	mAP	59.7	—	Unverified
3	MAT (ours)	mAP	58.2	—	Unverified
4	TadML-rgb	mAP	53.46	—	Unverified

MultiSports

#	Model	Metric	Claimed	Verified	Status
1	HIT	Frame-mAP 0.5	33.3	—	Unverified
2	SiA	Frame-mAP 0.5	28.8	—	Unverified

TSU

#	Model	Metric	Claimed	Verified	Status
1	MS-TCT	Frame-mAP	33.7	—	Unverified
2	PDAN	Frame-mAP	32.7	—	Unverified

TTStroke-21 ME21

#	Model	Metric	Claimed	Verified	Status
1	STCNN	IoU	0.14	—	Unverified
2	Two Stream Network	IoU	0.07	—	Unverified

TTStroke-21 ME22

#	Model	Metric	Claimed	Verified	Status
1	STCNN-V2 (Vote decision)	IoU	0.52	—	Unverified
2	RGB and PRGB	IoU	0.35	—	Unverified

MultiTHUMOS

#	Model	Metric	Claimed	Verified	Status
1	PAT	mAP	44.6	—	Unverified