Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
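
To make the idea of a tubelet concrete, here is a minimal sketch of linking per-frame bounding boxes into tubelets by greedy frame-to-frame IoU matching. It is an illustration only, not the method of any paper listed on this page: the function names (`box_iou`, `link_tubelets`), the `(x1, y1, x2, y2)` box format, and the 0.5 linking threshold are all assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Tubelet = List[Tuple[int, Box]]          # list of (frame_index, box)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubelets(frames: List[List[Box]], iou_thresh: float = 0.5) -> List[Tubelet]:
    """Greedily extend each open tubelet with the best-overlapping box
    in the next frame; boxes left unmatched start new tubelets."""
    finished: List[Tubelet] = []
    active: List[Tubelet] = []
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        next_active: List[Tubelet] = []
        for tube in active:
            last_box = tube[-1][1]
            best = max(unmatched, key=lambda b: box_iou(last_box, b), default=None)
            if best is not None and box_iou(last_box, best) >= iou_thresh:
                tube.append((t, best))
                unmatched.remove(best)
                next_active.append(tube)
            else:
                # No sufficiently overlapping box: the tubelet ends here.
                finished.append(tube)
        next_active.extend([(t, b)] for b in unmatched)
        active = next_active
    return finished + active


frames = [
    [(0, 0, 10, 10)],    # frame 0
    [(1, 1, 11, 11)],    # frame 1: overlaps the box in frame 0
    [(50, 50, 60, 60)],  # frame 2: unrelated box, starts a new tubelet
]
tubes = link_tubelets(frames)
for tube in tubes:
    print("frames", tube[0][0], "to", tube[-1][0], "with", len(tube), "boxes")
```

The first and last frame index of each tubelet give the "when" of the action, and the boxes give the "where". Practical systems usually also use detection scores and class labels when linking, which this sketch leaves out.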

Papers

Showing 201–250 of 817 papers

Title	Date	Tasks	Status
Intelligent Video Recording Optimization using Activity Detection for Surveillance Systems	Nov 4, 2024	Action Detection, Activity Detection	Unverified
On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes	Oct 25, 2024	Action Detection, Data Augmentation	Code Available
ContextDet: Temporal Action Detection with Adaptive Context Aggregation	Oct 20, 2024	Action Detection, Video Understanding	Unverified
CLIP-VAD: Exploiting Vision-Language Models for Voice Activity Detection	Oct 18, 2024	Action Detection, Activity Detection	Unverified
A Framework for Adapting Human-Robot Interaction to Diverse User Groups	Oct 15, 2024	Action Detection, Activity Detection	Code Available
Investigation of Speaker Representation for Target-Speaker Speech Processing	Oct 15, 2024	Action Detection, Activity Detection	Unverified
Cefdet: Cognitive Effectiveness Network Based on Fuzzy Inference for Action Detection	Oct 8, 2024	Action Detection	Code Available
EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos with Procedural Texts	Oct 7, 2024	Action Detection, Mistake Detection	Unverified
Query matching for spatio-temporal action detection with query-based object detector	Sep 27, 2024	Action Detection, Object	Unverified
Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks	Sep 27, 2024	Action Detection, Action Segmentation	Unverified
Raising the Bar(ometer): Identifying a User's Stair and Lift Usage Through Wearable Sensor Data Analysis	Sep 18, 2024	Action Detection, Activity Detection	Unverified
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses	Sep 17, 2024	Action Detection, Activity Detection	Unverified
Uncertainty-Guided Appearance-Motion Association Network for Out-of-Distribution Action Detection	Sep 16, 2024	Action Detection, Object	Code Available
TCG CREST System Description for the Second DISPLACE Challenge	Sep 16, 2024	Action Detection, Activity Detection	Unverified
A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities	Sep 15, 2024	Action Detection, Activity Detection	Unverified
Evaluation of real-time transcriptions using end-to-end ASR models	Sep 9, 2024	Action Detection, Activity Detection	Unverified
NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge	Sep 9, 2024	Action Detection, Activity Detection	Unverified
Introducing Gating and Context into Temporal Action Detection	Sep 6, 2024	Action Detection	Unverified
Unfolding Videos Dynamics via Taylor Expansion	Sep 4, 2024	Action Detection, Action Recognition	Unverified
Prediction-Feedback DETR for Temporal Action Detection	Aug 29, 2024	Action Detection, Prediction	Unverified
Spatio-Temporal Context Prompting for Zero-Shot Action Detection	Aug 28, 2024	Action Detection, Zero-Shot Action Detection	Unverified
Temporal Divide-and-Conquer Anomaly Actions Localization in Semi-Supervised Videos with Hierarchical Transformer	Aug 24, 2024	Action Detection, Anomaly Detection	Unverified
Long-term Pre-training for Temporal Action Detection with Transformers	Aug 23, 2024	Action Detection	Unverified
Boundary-Recovering Network for Temporal Action Detection	Aug 18, 2024	Action Detection, object-detection	Unverified
JARViS: Detecting Actions in Video Using Unified Actor-Scene Context Relation Modeling	Aug 7, 2024	Action Detection, Relation	Unverified
Blind User Activity Detection for Grant-Free Random Access in Cell-Free mMIMO Networks	Aug 5, 2024	Action Detection, Activity Detection	Unverified
Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation	Aug 1, 2024	Action Detection, Activity Detection	Unverified
Classification Matters: Improving Video Action Detection with Class-Specific Attention	Jul 29, 2024	Action Detection, Classification	Unverified
MARINE: A Computer Vision Model for Detecting Rare Predator-Prey Interactions in Animal Videos	Jul 25, 2024	Action Detection, Action Recognition	Code Available
Preemptive Detection and Correction of Misaligned Actions in LLM Agents	Jul 16, 2024	Action Detection, Decision Making	Unverified
TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR	Jul 5, 2024	Action Detection, Activity Detection	Code Available
Micro-gesture Online Recognition using Learnable Query Points	Jul 5, 2024	Action Detection	Unverified
Automatic Speech Recognition for Hindi	Jun 26, 2024	Action Detection, Activity Detection	Unverified
Using joint angles based on the international biomechanical standards for human action recognition and related tasks	Jun 25, 2024	Action Detection, Action Recognition	Unverified
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation	Jun 25, 2024	Action Detection, Benchmarking	Code Available
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024	Jun 24, 2024	Action Detection, Activity Detection	Unverified
AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming	Jun 14, 2024	Action Detection, Activity Detection	Unverified
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness	Jun 12, 2024	Action Detection, Activity Detection	Unverified
Vessel Re-identification and Activity Detection in Thermal Domain for Maritime Surveillance	Jun 12, 2024	Action Detection, Activity Detection	Unverified
Deep Learning-Based Approach for User Activity Detection with Grant-Free Random Access in Cell-Free Massive MIMO	Jun 11, 2024	Action Detection, Activity Detection	Unverified
An Effective-Efficient Approach for Dense Multi-Label Action Detection	Jun 10, 2024	Action Detection	Unverified
Object Aware Egocentric Online Action Detection	Jun 3, 2024	Action Detection, Object	Unverified
Precise Analysis of Covariance Identifiability for Activity Detection in Grant-Free Random Access	Jun 3, 2024	Action Detection, Activity Detection	Unverified
Skeleton-OOD: An End-to-End Skeleton-Based Model for Robust Out-of-Distribution Human Action Detection	May 31, 2024	Action Detection, Action Recognition	Code Available
MALT: Multi-scale Action Learning Transformer for Online Action Detection	May 31, 2024	Action Detection, Decoder	Unverified
A Real-Time Voice Activity Detection Based On Lightweight Neural	May 27, 2024	Action Detection, Activity Detection	Unverified
Open-Vocabulary Spatio-Temporal Action Detection	May 17, 2024	Action Detection, Fine-Grained Action Detection	Unverified
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization	May 15, 2024	Action Detection, Activity Detection	Unverified
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection	May 13, 2024	Action Detection	Unverified
Whispy: Adapting STT Whisper Models to Real-Time Environments	May 6, 2024	Action Detection, Activity Detection	Unverified

Datasets (the benchmark tables below appear to follow this order, one table per dataset): UCF101-24, J-HMDB, Charades, Multi-THUMOS, UCF Sports, THUMOS'14, MultiSports, TSU, TTStroke-21 ME21, TTStroke-21 ME22, MultiTHUMOS

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	STAR/L	Frame-mAP 0.5	90.3	—	Unverified
2	SiA	Frame-mAP 0.5	88.5	—	Unverified
3	YOWO + LFB	Frame-mAP 0.5	87.3	—	Unverified
4	HIT	Frame-mAP 0.5	84.8	—	Unverified
5	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	82.3	—	Unverified
6	YOWO	Frame-mAP 0.5	80.4	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.2	78.48	—	Unverified
8	MOC	Frame-mAP 0.5	77.8	—	Unverified
9	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	76.3	—	Unverified
10	Two-in-one	Video-mAP 0.2	75.48	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SiA	Frame-mAP 0.5	88.5	—	Unverified
2	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	87.59	—	Unverified
3	HIT	Frame-mAP 0.5	83.8	—	Unverified
4	HISAN (VGG-16)	Frame-mAP 0.5	76.72	—	Unverified
5	DTS	Video-mAP 0.2	76.1	—	Unverified
6	YOWO + LFB	Frame-mAP 0.5	75.7	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.5	74.74	—	Unverified
8	YOWO	Frame-mAP 0.5	74.4	—	Unverified
9	MOC	Frame-mAP 0.5	74	—	Unverified
10	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	73.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TTM	mAP	28.79	—	Unverified
2	CTRN	mAP	27.8	—	Unverified
3	Coarse-Fine Networks (w/ self-supervised detection pretraining)	mAP	26.95	—	Unverified
4	UniMD+Sync. (RGB+Flow)	mAP	26.53	—	Unverified
5	PDAN (RGB+Flow)	mAP	26.5	—	Unverified
6	PAT	mAP	26.5	—	Unverified
7	MS-TCT (RGB only)	mAP	25.4	—	Unverified
8	3D ResNet-50 + super-events pretrained on AViD	mAP	25.2	—	Unverified
9	Coarse-Fine Networks	mAP	25.1	—	Unverified
10	MLAD (RGB + Flow)	mAP	23.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MLAD	mAP	51.5	—	Unverified
2	CTRN	mAP	51.2	—	Unverified
3	PDAN	mAP	47.6	—	Unverified
4	TGM	mAP	46.4	—	Unverified
5	MS-TCT (RGB only)	mAP	43.1	—	Unverified
6	I3D + our super-event	mAP	36.4	—	Unverified
7	Two-stream + LSTM	mAP	28.1	—	Unverified
8	Two-stream	mAP	27.6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Two-in-one Two Stream	Video-mAP 0.5	96.52	—	Unverified
2	DTS	Video-mAP 0.2	94.3	—	Unverified
3	Two-in-one	Video-mAP 0.5	92.74	—	Unverified
4	T-CNN	Frame-mAP 0.5	86.7	—	Unverified
5	MR-TS R-CNN	Frame-mAP 0.5	84.52	—	Unverified
6	TS R-CNN	Frame-mAP 0.5	82.3	—	Unverified
7	Action Tubes	Frame-mAP 0.5	68.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MAT (Ours) Trans	mAP	71.6	—	Unverified
2	TadML-two stream	mAP	59.7	—	Unverified
3	MAT (ours)	mAP	58.2	—	Unverified
4	TadML-rgb	mAP	53.46	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HIT	Frame-mAP 0.5	33.3	—	Unverified
2	SiA	Frame-mAP 0.5	28.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MS-TCT	Frame-mAP	33.7	—	Unverified
2	PDAN	Frame-mAP	32.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	STCNN	IoU	0.14	—	Unverified
2	Two Stream Network	IoU	0.07	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	STCNN-V2 (Vote decision)	IoU	0.52	—	Unverified
2	RGB and PRGB	IoU	0.35	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	PAT	mAP	44.6	—	Unverified
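
Several tables above report "Frame-mAP 0.5", meaning average precision over per-frame detections, where a detection counts as correct when its box overlaps a ground-truth box with IoU of at least 0.5. The sketch below shows one way such a score can be computed for a single class. It is a simplified illustration under stated assumptions, not the official evaluation code of any benchmark listed here: the names `frame_ap` and `box_iou`, the input layout, and the uninterpolated AP formula are choices made for this example, and real benchmark scripts may interpolate precision or handle duplicate detections differently.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Detection = Tuple[int, float, Box]       # (frame_index, score, box)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def frame_ap(detections: List[Detection],
             ground_truth: Dict[int, List[Box]],
             iou_thresh: float = 0.5) -> float:
    """Uninterpolated average precision for one action class."""
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0
    used = {f: [False] * len(boxes) for f, boxes in ground_truth.items()}
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    # Visit detections from the most to the least confident.
    for frame, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt_box in enumerate(ground_truth.get(frame, [])):
            iou = box_iou(box, gt_box)
            if not used[frame][j] and iou > best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0 and best_iou >= iou_thresh:
            used[frame][best_j] = True  # each ground-truth box matches once
            tp += 1
        else:
            fp += 1
        recall = tp / n_gt
        precision = tp / (tp + fp)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


gt = {0: [(0, 0, 10, 10)], 1: [(0, 0, 10, 10)]}
dets = [
    (0, 0.9, (0, 0, 10, 10)),    # exact match in frame 0
    (1, 0.8, (50, 50, 60, 60)),  # false positive in frame 1
    (1, 0.7, (1, 1, 11, 11)),    # IoU of about 0.68 with the frame 1 box
]
ap = frame_ap(dets, gt)
print(round(ap, 4))
```

The mean of this value over all action classes gives the mAP. Video-mAP follows the same ranking and matching procedure, but compares whole tubes using a spatio-temporal IoU and a threshold such as 0.2 or 0.5 instead of comparing boxes frame by frame.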