Action Detection

Action detection aims to find both where and when an action occurs in a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which only classifies which action is taking place and typically assumes a trimmed video.
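As a rough illustration of what "linked across time" can mean, the sketch below greedily links per-frame boxes into tubelets by spatial overlap. The names `box_iou` and `link_tubelets`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for this example; published detectors generally use more elaborate, score-aware linking.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_tubelets(frames: List[List[Box]], iou_thresh: float = 0.5) -> List[Dict[int, Box]]:
    """Greedily link per-frame boxes into tubelets (hypothetical helper).

    Each tubelet maps frame index -> box. A box extends the tubelet whose
    box in the previous frame overlaps it most (IoU >= iou_thresh);
    otherwise it starts a new tubelet.
    """
    tubelets: List[Dict[int, Box]] = []
    for t, boxes in enumerate(frames):
        # Tubelets that have a box in the previous frame and are not yet extended.
        open_ids = [i for i, tube in enumerate(tubelets) if (t - 1) in tube]
        for box in boxes:
            best_id, best_iou = -1, iou_thresh
            for i in open_ids:
                iou = box_iou(tubelets[i][t - 1], box)
                if iou >= best_iou:
                    best_id, best_iou = i, iou
            if best_id >= 0:
                tubelets[best_id][t] = box
                open_ids.remove(best_id)  # one box per tubelet per frame
            else:
                tubelets.append({t: box})
    return tubelets


# Two objects drifting slightly; the second one disappears after frame 1.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 1, 11, 11), (51, 50, 61, 60)],
    [(2, 2, 12, 12)],
]
tubes = link_tubelets(frames)
```

In this toy input the first object yields a tubelet spanning frames 0 to 2 and the second a tubelet spanning frames 0 to 1. A box with no sufficiently overlapping predecessor starts a new tubelet.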

Papers


Showing 301–350 of 817 papers

Title	Date	Tasks	Status
M³3D: Learning 3D priors using Multi-Modal Masked Autoencoders for 2D image and video understanding	Sep 26, 2023	2D Semantic Segmentation, Action Detection	Unverified
ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object Interactions in Industrial Scenarios	Sep 26, 2023	Action Detection, Human-Object Interaction Detection	Unverified
The Impact of Silence on Speech Anti-Spoofing	Sep 21, 2023	Action Detection, Activity Detection	Unverified
SkeleTR: Towrads Skeleton-based Action Recognition in the Wild	Sep 20, 2023	Action Classification, Action Detection	Unverified
JOADAA: joint online action detection and action anticipation	Sep 12, 2023	Action Anticipation, Action Detection	Unverified
Effective Abnormal Activity Detection on Multivariate Time Series Healthcare Data	Sep 11, 2023	Action Detection, Activity Detection	Unverified
In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms	Sep 5, 2023	Action Detection, Activity Detection	Unverified
Self-Feedback DETR for Temporal Action Detection	Aug 21, 2023	Action Detection, Decoder	Unverified
Progression-Guided Temporal Action Detection in Videos	Aug 18, 2023	Action Classification, Action Detection	Code Available
The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023	Aug 15, 2023	Action Detection, Activity Detection	Unverified
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations	Aug 14, 2023	Action Detection, Activity Detection	Code Available
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection	Aug 9, 2023	Action Detection, Event Detection	Unverified
A Survey on Deep Learning-based Spatio-temporal Action Detection	Aug 3, 2023	Action Detection, Autonomous Driving	Unverified
An enhanced system for the detection and active cancellation of snoring signals	Jul 31, 2023	Action Detection, Activity Detection	Unverified
Human-to-Human Interaction Detection	Jul 2, 2023	Action Detection, Human Interaction Recognition	Unverified
Long-term Conversation Analysis: Exploring Utility and Privacy	Jun 28, 2023	Action Detection, Activity Detection	Code Available
ShuttleSet: A Human-Annotated Stroke-Level Singles Dataset for Badminton Tactical Analysis	Jun 8, 2023	Action Detection, Sports Analytics	Code Available
Multi-microphone Automatic Speech Segmentation in Meetings Based on Circular Harmonics Features	Jun 7, 2023	Action Detection, Activity Detection	Unverified
Parallel Neurosymbolic Integration with Concordia	Jun 1, 2023	Action Detection, Activity Detection	Unverified
SVVAD: Personal Voice Activity Detection for Speaker Verification	May 31, 2023	Action Detection, Activity Detection	Unverified
A Multi-Modal Transformer Network for Action Detection	May 31, 2023	Action Detection, Optical Flow Estimation	Unverified
Building Accurate Low Latency ASR for Streaming Voice Search	May 29, 2023	Action Detection, Activity Detection	Unverified
Joint Activity-Delay Detection and Channel Estimation for Asynchronous Massive Random Access	May 21, 2023	Action Detection, Activity Detection	Unverified
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction	May 21, 2023	Action Detection, Activity Detection	Unverified
FunASR: A Fundamental End-to-End Speech Recognition Toolkit	May 18, 2023	Action Detection, Activity Detection	Unverified
Deep Learning for Asynchronous Massive Access with Data Frame Length Diversity	May 12, 2023	Action Detection, Activity Detection	Unverified
Joint Activity Detection and Channel Estimation for Clustered Massive Machine Type Communications	May 4, 2023	Action Detection, Activity Detection	Unverified
MRSN: Multi-Relation Support Network for Video Action Detection	Apr 24, 2023	Action Detection, Relation	Unverified
End-to-End Spatio-Temporal Action Localisation with Video Transformers	Apr 24, 2023	Action Detection, Action Recognition	Unverified
Cooperative Multi-Cell Massive Access with Temporally Correlated Activity	Apr 19, 2023	Action Detection, Activity Detection	Unverified
Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence	Apr 18, 2023	Action Detection, Activity Detection	Unverified
ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding	Apr 17, 2023	Action Detection, Action Recognition	Unverified
Grant-free Massive Random Access with Retransmission: Receiver Optimization and Performance Analysis	Apr 12, 2023	Action Detection, Activity Detection	Unverified
Boundary-Denoising for Video Activity Localization	Apr 6, 2023	Action Detection, Decoder	Code Available
Improve Temporal Action Proposals using Hierarchical Context	Apr 3, 2023	Action Detection, Temporal Action Localization	Unverified
DOAD: Decoupled One Stage Action Detection Network	Apr 1, 2023	Action Detection, Action Recognition	Unverified
Evaluation of Noise Reduction Methods for Sentence Recognition by Sinhala Speaking Listeners	Mar 31, 2023	Action Detection, Activity Detection	Code Available
Decomposed Cross-modal Distillation for RGB-based Temporal Action Detection	Mar 30, 2023	Action Detection, Action Localization	Unverified
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection	Mar 28, 2023	Action Detection, Action Recognition	Code Available
Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV	Mar 23, 2023	Action Detection, Activity Detection	Unverified
End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations	Mar 21, 2023	Action Detection, Activity Detection	Unverified
A processing framework to access large quantities of whispered speech found in ASMR	Mar 13, 2023	Action Detection, Activity Detection	Unverified
Multi-Task Sub-Band Network For Deep Residual Echo Suppression	Mar 11, 2023	Action Detection, Activity Detection	Unverified
Improving Transformer-based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads	Mar 2, 2023	Action Detection, Activity Detection	Unverified
Open Set Action Recognition via Multi-Label Evidential Learning	Feb 27, 2023	Action Detection, Action Recognition	Unverified
Learnable Frontends that do not Learn: Quantifying Sensitivity to Filterbank Initialisation	Feb 20, 2023	Action Detection, Activity Detection	Unverified
MINOTAUR: Multi-task Video Grounding From Multimodal Queries	Feb 16, 2023	Action Detection, Sentence	Code Available
Context Understanding in Computer Vision: A Survey	Feb 10, 2023	Action Detection, image-classification	Unverified
Understanding Policy and Technical Aspects of AI-Enabled Smart Video Surveillance to Address Public Safety	Feb 8, 2023	Action Detection, Anomaly Detection	Unverified
Fine-Grained Action Detection with RGB and Pose Information using Two Stream Convolutional Networks	Feb 6, 2023	Action Classification, Action Detection	Code Available



Datasets: UCF101-24, J-HMDB, Charades, Multi-THUMOS, UCF Sports, THUMOS'14, MultiSports, TSU, TTStroke-21 ME21, TTStroke-21 ME22, MultiTHUMOS

Benchmark Results

Dataset: UCF101-24

#	Model	Metric	Claimed	Verified	Status
1	STAR/L	Frame-mAP 0.5	90.3	—	Unverified
2	SiA	Frame-mAP 0.5	88.5	—	Unverified
3	YOWO + LFB	Frame-mAP 0.5	87.3	—	Unverified
4	HIT	Frame-mAP 0.5	84.8	—	Unverified
5	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	82.3	—	Unverified
6	YOWO	Frame-mAP 0.5	80.4	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.2	78.48	—	Unverified
8	MOC	Frame-mAP 0.5	77.8	—	Unverified
9	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	76.3	—	Unverified
10	Two-in-one	Video-mAP 0.2	75.48	—	Unverified

Dataset: J-HMDB

#	Model	Metric	Claimed	Verified	Status
1	SiA	Frame-mAP 0.5	88.5	—	Unverified
2	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	87.59	—	Unverified
3	HIT	Frame-mAP 0.5	83.8	—	Unverified
4	HISAN (VGG-16)	Frame-mAP 0.5	76.72	—	Unverified
5	DTS	Video-mAP 0.2	76.1	—	Unverified
6	YOWO + LFB	Frame-mAP 0.5	75.7	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.5	74.74	—	Unverified
8	YOWO	Frame-mAP 0.5	74.4	—	Unverified
9	MOC	Frame-mAP 0.5	74	—	Unverified
10	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	73.3	—	Unverified

Dataset: Charades

#	Model	Metric	Claimed	Verified	Status
1	TTM	mAP	28.79	—	Unverified
2	CTRN	mAP	27.8	—	Unverified
3	Coarse-Fine Networks (w/ self-supervised detection pretraining)	mAP	26.95	—	Unverified
4	UniMD+Sync. (RGB+Flow)	mAP	26.53	—	Unverified
5	PDAN (RGB+Flow)	mAP	26.5	—	Unverified
6	PAT	mAP	26.5	—	Unverified
7	MS-TCT (RGB only)	mAP	25.4	—	Unverified
8	3D ResNet-50 + super-events pretrained on AViD	mAP	25.2	—	Unverified
9	Coarse-Fine Networks	mAP	25.1	—	Unverified
10	I3D + biGRU + VS-ST-MPNN	mAP	23.7	—	Unverified

Dataset: Multi-THUMOS

#	Model	Metric	Claimed	Verified	Status
1	MLAD	mAP	51.5	—	Unverified
2	CTRN	mAP	51.2	—	Unverified
3	PDAN	mAP	47.6	—	Unverified
4	TGM	mAP	46.4	—	Unverified
5	MS-TCT (RGB only)	mAP	43.1	—	Unverified
6	I3D + our super-event	mAP	36.4	—	Unverified
7	Two-stream + LSTM	mAP	28.1	—	Unverified
8	Two-stream	mAP	27.6	—	Unverified

Dataset: UCF Sports

#	Model	Metric	Claimed	Verified	Status
1	Two-in-one Two Stream	Video-mAP 0.5	96.52	—	Unverified
2	DTS	Video-mAP 0.2	94.3	—	Unverified
3	Two-in-one	Video-mAP 0.5	92.74	—	Unverified
4	T-CNN	Frame-mAP 0.5	86.7	—	Unverified
5	MR-TS R-CNN	Frame-mAP 0.5	84.52	—	Unverified
6	TS R-CNN	Frame-mAP 0.5	82.3	—	Unverified
7	Action Tubes	Frame-mAP 0.5	68.1	—	Unverified

Dataset: THUMOS'14

#	Model	Metric	Claimed	Verified	Status
1	MAT (Ours) Trans	mAP	71.6	—	Unverified
2	TadML-two stream	mAP	59.7	—	Unverified
3	MAT (ours)	mAP	58.2	—	Unverified
4	TadML-rgb	mAP	53.46	—	Unverified

Dataset: MultiSports

#	Model	Metric	Claimed	Verified	Status
1	HIT	Frame-mAP 0.5	33.3	—	Unverified
2	SiA	Frame-mAP 0.5	28.8	—	Unverified

Dataset: TSU

#	Model	Metric	Claimed	Verified	Status
1	MS-TCT	Frame-mAP	33.7	—	Unverified
2	PDAN	Frame-mAP	32.7	—	Unverified

Dataset: TTStroke-21 ME21

#	Model	Metric	Claimed	Verified	Status
1	STCNN	IoU	0.14	—	Unverified
2	Two Stream Network	IoU	0.07	—	Unverified

Dataset: TTStroke-21 ME22

#	Model	Metric	Claimed	Verified	Status
1	STCNN-V2 (Vote decision)	IoU	0.52	—	Unverified
2	RGB and PRGB	IoU	0.35	—	Unverified

Dataset: MultiTHUMOS

#	Model	Metric	Claimed	Verified	Status
1	PAT	mAP	44.6	—	Unverified
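
Several of the tables above report Frame-mAP or Video-mAP at an IoU threshold such as 0.5 or 0.2. The sketch below illustrates only the matching criterion, not a full mAP evaluation: frame-level metrics compare boxes frame by frame, while video-level metrics compare whole tubes. It assumes one commonly used tube overlap, temporal IoU multiplied by the mean spatial IoU over shared frames; individual benchmarks may define it differently. The name `tube_iou` and the dictionary-based tube format are illustrative choices, not part of any benchmark toolkit.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format
Tube = Dict[int, Box]                    # frame index -> box, assumed format


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def tube_iou(pred: Tube, gt: Tube) -> float:
    """Spatio-temporal IoU (assumed definition): temporal IoU over frame
    sets times the mean spatial IoU on the frames both tubes cover."""
    shared = pred.keys() & gt.keys()
    if not shared:
        return 0.0
    temporal_iou = len(shared) / len(pred.keys() | gt.keys())
    spatial_iou = sum(box_iou(pred[t], gt[t]) for t in shared) / len(shared)
    return temporal_iou * spatial_iou


# A prediction with perfect boxes that starts two frames late.
box = (0.0, 0.0, 10.0, 10.0)
gt_tube = {t: box for t in range(0, 4)}    # frames 0..3
pred_tube = {t: box for t in range(2, 6)}  # frames 2..5
overlap = tube_iou(pred_tube, gt_tube)     # 2 shared frames out of 6 in total
is_match_at_02 = overlap >= 0.2
is_match_at_05 = overlap >= 0.5
```

Under this assumed definition the example tube would count as a match at a 0.2 threshold but not at 0.5, even though every box on the shared frames is exact. This is one reason Video-mAP figures at different thresholds are not directly comparable.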