Action Detection

Action detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
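To make the notion of an action tubelet concrete, the following is a minimal sketch of one way per-frame detections could be linked across time. It is an illustration only, not the method of any paper listed here: the `Tubelet` class, the `link_detections` function, and the greedy IoU-based matching rule with a `min_iou` threshold are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    """Hypothetical container: one action label plus a box per frame index."""
    label: str
    boxes: Dict[int, Box] = field(default_factory=dict)

    @property
    def start(self) -> int:
        return min(self.boxes)

    @property
    def end(self) -> int:
        return max(self.boxes)


def link_detections(frames: List[List[Tuple[str, Box]]],
                    min_iou: float = 0.5) -> List[Tubelet]:
    """Greedily extend tubelets frame by frame (assumed linking rule).

    A detection joins the same-label tubelet from the previous frame with
    the highest IoU, provided that IoU is at least `min_iou`; otherwise it
    starts a new tubelet.
    """
    tubelets: List[Tubelet] = []
    active: List[Tubelet] = []  # tubelets that received a box in frame t - 1
    for t, detections in enumerate(frames):
        next_active: List[Tubelet] = []
        used = set()  # ids of tubelets already extended in this frame
        for label, box in detections:
            best, best_iou = None, min_iou
            for tube in active:
                if id(tube) in used or tube.label != label:
                    continue
                iou = box_iou(tube.boxes[t - 1], box)
                if iou >= best_iou:
                    best, best_iou = tube, iou
            if best is None:
                best = Tubelet(label)
                tubelets.append(best)
            else:
                used.add(id(best))
            best.boxes[t] = box
            next_active.append(best)
        active = next_active
    return tubelets
```

In this sketch the `start` and `end` properties give the temporal extent (the "when"), the per-frame boxes give the spatial extent (the "where"), and `label` gives the class (the "what"). A frame with no matching detection ends the tubelet, which is a deliberate simplification.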

Papers

Showing 251–300 of 817 papers

Title	Date	Tasks	Status	Hype
The Newsbridge -Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description	Jan 17, 2023	Action Detection, Activity Detection	Unverified	0
Deep learning-based approaches for human motion decoding in smart walkers for rehabilitation	Jan 13, 2023	Action Detection, Action Recognition	Unverified	0
KIDS: kinematics-based (in)activity detection and segmentation in a sleep case study	Jan 4, 2023	Action Detection, Activity Detection	Unverified	0
Ego-Only: Egocentric Action Detection without Exocentric Transferring	Jan 3, 2023	Action Detection, Action Localization	Unverified	0
MiniROAD: Minimal RNN Framework for Online Action Detection	Jan 1, 2023	Action Detection, Online Action Detection	Code Available	1
Movement Enhancement toward Multi-Scale Video Feature Representation for Temporal Action Detection	Jan 1, 2023	Action Detection	Unverified	0
SkeleTR: Towards Skeleton-based Action Recognition in the Wild	Jan 1, 2023	Action Classification, Action Detection	Unverified	0
Hybrid Active Learning via Deep Clustering for Video Action Detection	Jan 1, 2023	Action Detection, Active Learning	Unverified	0
Activity Detection for Grant-Free NOMA in Massive IoT Networks	Dec 23, 2022	Action Detection, Activity Detection	Unverified	0
Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features	Dec 20, 2022	Action Detection, Optical Flow Estimation	Unverified	0
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks	Dec 14, 2022	Action Detection, Activity Detection	Unverified	0
Trajectory-User Linking Is Easier Than You Think	Dec 14, 2022	Action Detection, Activity Detection	Unverified	0
Contextual Explainable Video Representation: Human Perception-based Understanding	Dec 12, 2022	Action Detection, Action Recognition	Code Available	0
BC-VAD: A Robust Bone Conduction Voice Activity Detection	Dec 6, 2022	Action Detection, Activity Detection	Unverified	0
Proximal Gradient-Based Unfolding for Massive Random Access in IoT Networks	Dec 4, 2022	Action Detection, Activity Detection	Unverified	0
Joint Estimation of Clustered User Activity and Correlated Channels with Unknown Covariance in mMTC	Nov 30, 2022	Action Detection, Activity Detection	Unverified	0
Post-Processing Temporal Action Detection	Nov 27, 2022	Action Classification, Action Detection	Code Available	1
Multi-Modal Few-Shot Temporal Action Detection	Nov 27, 2022	Action Detection, Few-Shot Object Detection	Code Available	1
Multi-timescale Event Detection in Nonintrusive Load Monitoring based on MDL Principle	Nov 19, 2022	Action Detection, Activity Detection	Unverified	0
On using the UA-Speech and TORGO databases to validate automatic dysarthric speech classification approaches	Nov 16, 2022	Action Detection, Activity Detection	Unverified	0
Token Turing Machines	Nov 16, 2022	Action Detection, Activity Detection	Unverified	0
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization	Nov 12, 2022	Action Detection, Activity Detection	Code Available	1
Two-stream Multi-dimensional Convolutional Network for Real-time Violence Detection	Nov 8, 2022	Action Detection, Activity Detection	Unverified	0
OFDM-Based Massive Connectivity for LEO Satellite Internet of Things	Oct 31, 2022	Action Detection, Activity Detection	Unverified	0
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction	Oct 28, 2022	Action Detection, Activity Detection	Unverified	0
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition	Oct 28, 2022	Action Detection, Activity Detection	Unverified	0
SG-VAD: Stochastic Gates Based Speech Activity Detection	Oct 28, 2022	Action Detection, Activity Detection	Code Available	1
Handwashing Action Detection System for an Autonomous Social Robot	Oct 27, 2022	Action Detection, Action Recognition	Code Available	0
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0	Oct 26, 2022	Action Detection, Activity Detection	Code Available	1
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge	Oct 26, 2022	Action Detection, Activity Detection	Unverified	0
Refining Action Boundaries for One-stage Detection	Oct 25, 2022	Action Detection	Code Available	0
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation	Oct 24, 2022	Action Detection, Activity Detection	Code Available	1
Holistic Interaction Transformer Network for Action Detection	Oct 23, 2022	Action Detection, Action Recognition	Code Available	1
PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points	Oct 20, 2022	Action Detection, Temporal Action Localization	Code Available	1
YOWO-Plus: An Incremental Improvement	Oct 20, 2022	Action Detection, GPU	Code Available	1
mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors	Oct 15, 2022	3D Human Pose Estimation, Action Detection	Unverified	0
Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization	Oct 14, 2022	Action Detection, Active Speaker Detection	Unverified	0
Application-Driven AI Paradigm for Hand-Held Action Detection	Oct 13, 2022	Action Detection, Object	Unverified	0
AOE-Net: Entities Interactions Modeling with Adaptive Attention Mechanism for Temporal Action Proposals Generation	Oct 5, 2022	Action Detection, Temporal Action Proposal Generation	Code Available	1
The DKU-DukeECE Diarization System for the VoxCeleb Speaker Recognition Challenge 2022	Oct 4, 2022	Action Detection, Activity Detection	Unverified	0
Learnable Acoustic Frontends in Bird Activity Detection	Oct 3, 2022	Action Detection, Activity Detection	Unverified	0
Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection	Sep 28, 2022	Action Detection, Domain Adaptation	Code Available	1
Signed Latent Factors for Spamming Activity Detection	Sep 28, 2022	Action Detection, Activity Detection	Unverified	0
RALACs: Action Recognition in Autonomous Vehicles using Interaction Encoding and Optical Flow	Sep 28, 2022	Action Classification, Action Detection	Code Available	0
Joint Speech Activity and Overlap Detection with Multi-Exit Architecture	Sep 24, 2022	Action Detection, Activity Detection	Unverified	0
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022	Sep 23, 2022	Action Detection, Activity Detection	Unverified	0
Cross-domain Voice Activity Detection with Self-Supervised Representations	Sep 22, 2022	Action Detection, Activity Detection	Unverified	0
GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge	Sep 21, 2022	Action Detection, Activity Detection	Unverified	0
Exploring Modulated Detection Transformer as a Tool for Action Recognition in Videos	Sep 21, 2022	Action Detection, Action Recognition	Code Available	0
Real-time Online Video Detection with Temporal Smoothing Transformers	Sep 19, 2022	Action Anticipation, Action Detection	Code Available	1

Datasets: UCF101-24, J-HMDB, Charades, Multi-THUMOS, UCF Sports, THUMOS'14, MultiSports, TSU, TTStroke-21 ME21, TTStroke-21 ME22, MultiTHUMOS

Benchmark Results

UCF101-24

#	Model	Metric	Claimed	Verified	Status
1	STAR/L	Frame-mAP 0.5	90.3	—	Unverified
2	SiA	Frame-mAP 0.5	88.5	—	Unverified
3	YOWO + LFB	Frame-mAP 0.5	87.3	—	Unverified
4	HIT	Frame-mAP 0.5	84.8	—	Unverified
5	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	82.3	—	Unverified
6	YOWO	Frame-mAP 0.5	80.4	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.2	78.48	—	Unverified
8	MOC	Frame-mAP 0.5	77.8	—	Unverified
9	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	76.3	—	Unverified
10	Two-in-one	Video-mAP 0.2	75.48	—	Unverified

J-HMDB

#	Model	Metric	Claimed	Verified	Status
1	SiA	Frame-mAP 0.5	88.5	—	Unverified
2	HISAN (ResNet-101 + FPN)	Video-mAP 0.2	87.59	—	Unverified
3	HIT	Frame-mAP 0.5	83.8	—	Unverified
4	HISAN (VGG-16)	Frame-mAP 0.5	76.72	—	Unverified
5	DTS	Video-mAP 0.2	76.1	—	Unverified
6	YOWO + LFB	Frame-mAP 0.5	75.7	—	Unverified
7	Two-in-one Two Stream	Video-mAP 0.5	74.74	—	Unverified
8	YOWO	Frame-mAP 0.5	74.4	—	Unverified
9	MOC	Frame-mAP 0.5	74	—	Unverified
10	Faster-RCNN + two-stream I3D conv	Frame-mAP 0.5	73.3	—	Unverified

Charades

#	Model	Metric	Claimed	Verified	Status
1	TTM	mAP	28.79	—	Unverified
2	CTRN	mAP	27.8	—	Unverified
3	Coarse-Fine Networks (w/ self-supervised detection pretraining)	mAP	26.95	—	Unverified
4	UniMD+Sync. (RGB+Flow)	mAP	26.53	—	Unverified
5	PDAN (RGB+Flow)	mAP	26.5	—	Unverified
6	PAT	mAP	26.5	—	Unverified
7	MS-TCT (RGB only)	mAP	25.4	—	Unverified
8	3D ResNet-50 + super-events pretrained on AViD	mAP	25.2	—	Unverified
9	Coarse-Fine Networks	mAP	25.1	—	Unverified
10	I3D + biGRU + VS-ST-MPNN	mAP	23.7	—	Unverified

Multi-THUMOS

#	Model	Metric	Claimed	Verified	Status
1	MLAD	mAP	51.5	—	Unverified
2	CTRN	mAP	51.2	—	Unverified
3	PDAN	mAP	47.6	—	Unverified
4	TGM	mAP	46.4	—	Unverified
5	MS-TCT (RGB only)	mAP	43.1	—	Unverified
6	I3D + our super-event	mAP	36.4	—	Unverified
7	Two-stream + LSTM	mAP	28.1	—	Unverified
8	Two-stream	mAP	27.6	—	Unverified

UCF Sports

#	Model	Metric	Claimed	Verified	Status
1	Two-in-one Two Stream	Video-mAP 0.5	96.52	—	Unverified
2	DTS	Video-mAP 0.2	94.3	—	Unverified
3	Two-in-one	Video-mAP 0.5	92.74	—	Unverified
4	T-CNN	Frame-mAP 0.5	86.7	—	Unverified
5	MR-TS R-CNN	Frame-mAP 0.5	84.52	—	Unverified
6	TS R-CNN	Frame-mAP 0.5	82.3	—	Unverified
7	Action Tubes	Frame-mAP 0.5	68.1	—	Unverified

THUMOS'14

#	Model	Metric	Claimed	Verified	Status
1	MAT (Ours) Trans	mAP	71.6	—	Unverified
2	TadML-two stream	mAP	59.7	—	Unverified
3	MAT (ours)	mAP	58.2	—	Unverified
4	TadML-rgb	mAP	53.46	—	Unverified

MultiSports

#	Model	Metric	Claimed	Verified	Status
1	HIT	Frame-mAP 0.5	33.3	—	Unverified
2	SiA	Frame-mAP 0.5	28.8	—	Unverified

TSU

#	Model	Metric	Claimed	Verified	Status
1	MS-TCT	Frame-mAP	33.7	—	Unverified
2	PDAN	Frame-mAP	32.7	—	Unverified

TTStroke-21 ME21

#	Model	Metric	Claimed	Verified	Status
1	STCNN	IoU	0.14	—	Unverified
2	Two Stream Network	IoU	0.07	—	Unverified

TTStroke-21 ME22

#	Model	Metric	Claimed	Verified	Status
1	STCNN-V2 (Vote decision)	IoU	0.52	—	Unverified
2	RGB and PRGB	IoU	0.35	—	Unverified

MultiTHUMOS

#	Model	Metric	Claimed	Verified	Status
1	PAT	mAP	44.6	—	Unverified
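
Most of the tables above report mAP, the mean over action classes of a per-class average precision (AP). As a rough sketch of how such a number is assembled, the code below computes a non-interpolated AP from ranked detections and averages it over classes. The function names and the input format are assumptions made for this example, and the individual benchmarks may use a different interpolation scheme or matching rule, so it should not be read as the official evaluation protocol of any dataset listed here.

```python
from typing import Dict, List, Tuple


def average_precision(scored_hits: List[Tuple[float, bool]], num_gt: int) -> float:
    """Non-interpolated AP for one class.

    `scored_hits` holds (confidence, is_true_positive) per detection, where
    the true-positive flag is assumed to have been decided beforehand (for
    example by an IoU threshold such as 0.5). `num_gt` is the number of
    ground-truth instances of the class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda item: item[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        if is_hit:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this recall step
    return precision_sum / num_gt


def mean_average_precision(per_class: Dict[str, Tuple[List[Tuple[float, bool]], int]]) -> float:
    """Mean of the per-class APs; `per_class` maps label -> (scored_hits, num_gt)."""
    if not per_class:
        return 0.0
    aps = [average_precision(hits, num_gt) for hits, num_gt in per_class.values()]
    return sum(aps) / len(aps)
```

For instance, a class with two ground-truth instances and detections ranked hit, miss, hit has precision 1/1 at the first hit and 2/3 at the second, giving an AP of about 0.83 under this definition.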