SOTAVerified

Action Localization

Action localization is the task of finding the spatial and temporal coordinates of an action in a video. An action localization model identifies the frames in which an action starts and ends, and returns the x, y coordinates of the action. These coordinates change from frame to frame as the object performing the action moves.
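As a rough illustration of the output described above, the sketch below stores one localized action as a mapping from frame index to bounding box, so the start frame, end frame, and per-frame coordinates can all be read from it. The `ActionTube` class, the `temporal_iou` helper, and the box layout `(x1, y1, x2, y2)` are hypothetical names and conventions chosen for this example; they are not taken from any paper or library listed here.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumed box convention for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class ActionTube:
    """Hypothetical container for one localized action instance."""
    label: str
    # frame index -> box; the box may differ per frame as the actor moves
    boxes: Dict[int, Box] = field(default_factory=dict)

    @property
    def start_frame(self) -> int:
        return min(self.boxes)

    @property
    def end_frame(self) -> int:
        return max(self.boxes)


def temporal_iou(a: ActionTube, b: ActionTube) -> float:
    """Intersection over union of two inclusive frame intervals."""
    inter = min(a.end_frame, b.end_frame) - max(a.start_frame, b.start_frame) + 1
    inter = max(inter, 0)  # no overlap gives zero, not a negative length
    len_a = a.end_frame - a.start_frame + 1
    len_b = b.end_frame - b.start_frame + 1
    return inter / (len_a + len_b - inter)


# A predicted action spanning frames 10..19 whose box shifts right by
# 2 pixels per frame, and a reference annotation spanning frames 15..24.
pred = ActionTube("jump", {f: (10.0 + 2 * f, 20.0, 50.0 + 2 * f, 100.0)
                           for f in range(10, 20)})
gt = ActionTube("jump", {f: (40.0, 20.0, 80.0, 100.0) for f in range(15, 25)})

print(pred.start_frame, pred.end_frame)  # 10 19
print(pred.boxes[10], pred.boxes[19])    # the box moves between frames
print(round(temporal_iou(pred, gt), 3))  # 0.333
```

Temporal overlap of this kind is one common way to compare a predicted interval with an annotated one; a spatio-temporal comparison would additionally compare the boxes on the shared frames.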

Papers

Showing 51-100 of 369 papers

Title | Status | Hype
Temporal Action Proposal Generation with Background Constraint | Code | 1
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval | Code | 1
Background-Click Supervision for Temporal Action Localization | Code | 1
Towards Active Vision for Action Localization with Reactive Control and Predictive Learning | Code | 1
Few-Shot Temporal Action Localization with Query Adaptive Transformer | Code | 1
Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization | Code | 1
Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization | Code | 1
Video Contrastive Learning with Global Context | Code | 1
Enriching Local and Global Contexts for Temporal Action Localization | Code | 1
Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization | Code | 1
Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization | Code | 1
Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization | Code | 1
BABEL: Bodies, Action and Behavior with English Labels | Code | 1
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization | Code | 1
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions | Code | 1
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos | Code | 1
ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization | Code | 1
TubeR: Tubelet Transformer for Video Action Detection | Code | 1
CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning | Code | 1
Temporal Context Aggregation Network for Temporal Action Proposal Refinement | Code | 1
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization | Code | 1
The Blessings of Unlabeled Background in Untrimmed Videos | Code | 1
Modeling Multi-Label Action Dependencies for Temporal Action Localization | Code | 1
PDAN: Pyramid Dilated Attention Network for Action Detection | Code | 1
A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization | Code | 1
Multi-shot Temporal Event Localization: a Benchmark | Code | 1
VideoMix: Rethinking Data Augmentation for Video Classification | Code | 1
Video Self-Stitching Graph Network for Temporal Action Localization | Code | 1
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | Code | 1
BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation | Code | 1
Learning to Localize Actions from Moments | Code | 1
Revisiting Anchor Mechanisms for Temporal Action Localization | Code | 1
Localizing the Common Action Among a Few Videos | Code | 1
Recognition of Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets | Code | 1
1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020 | Code | 1
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | Code | 1
CBR-Net: Cascade Boundary Refinement Network for Action Detection: Submission to ActivityNet Challenge 2020 (Task 1) | Code | 1
Weakly-supervised Temporal Action Localization by Uncertainty Modeling | Code | 1
Weakly-Supervised Action Localization by Generative Attention Modeling | Code | 1
SF-Net: Single-Frame Supervision for Temporal Action Localization | Code | 1
Bottom-Up Temporal Action Localization with Mutual Regularization | Code | 1
Weakly Supervised Temporal Action Localization Using Deep Metric Learning | Code | 1
End-to-End Learning of Visual Representations from Uncurated Instructional Videos | Code | 1
Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization | Code | 1
Background Suppression Network for Weakly-supervised Temporal Action Localization | Code | 1
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips | Code | 1
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions | Code | 1
LLM-powered Query Expansion for Enhancing Boundary Prediction in Language-driven Action Localization | - | 0
CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization | - | 0
DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition | Code | 0

No leaderboard results yet.