SOTAVerified

Action Localization

Action localization is the task of finding the spatial and temporal coordinates of an action in a video. An action localization model identifies the frames in which an action starts and ends, and returns the x, y coordinates of the action. These coordinates change from frame to frame as the object performing the action moves.
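As a rough illustration of the output described above, the sketch below stores one localized action as a label plus a box per frame. The names `ActionTube`, `Box`, and `center_at`, the box layout (x_min, y_min, x_max, y_max), and the sample values are all assumptions made for this example; they are not taken from any paper or library listed on this page.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class ActionTube:
    """Hypothetical container for one localized action instance."""
    label: str
    boxes: Dict[int, Box]  # frame index -> box around the actor in that frame

    @property
    def start_frame(self) -> int:
        # Temporal start: the first frame that has a box.
        return min(self.boxes)

    @property
    def end_frame(self) -> int:
        # Temporal end: the last frame that has a box.
        return max(self.boxes)

    def center_at(self, frame: int) -> Tuple[float, float]:
        # Spatial location: the (x, y) center of the box in the given frame.
        x0, y0, x1, y1 = self.boxes[frame]
        return ((x0 + x1) / 2, (y0 + y1) / 2)


# Made-up example: an actor drifting right by 4 pixels per frame.
tube = ActionTube(
    label="wave",
    boxes={
        10: (100, 50, 140, 130),
        11: (104, 50, 144, 130),
        12: (108, 50, 148, 130),
    },
)
```

Here `tube.start_frame` and `tube.end_frame` give the temporal extent, while `tube.center_at(frame)` shows how the x, y position shifts as the actor is displaced.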

Papers

Showing 51 to 75 of 369 papers

Title | Status | Hype
Bottom-Up Temporal Action Localization with Mutual Regularization | Code | 1
OpenTAL: Towards Open Set Temporal Action Localization | Code | 1
PDAN: Pyramid Dilated Attention Network for Action Detection | Code | 1
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization | Code | 1
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | Code | 1
Recognition of Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets | Code | 1
Boosting Weakly-Supervised Temporal Action Localization with Text Information | Code | 1
Revisiting Anchor Mechanisms for Temporal Action Localization | Code | 1
SFMViT: SlowFast Meet ViT in Chaotic World | Code | 1
SF-Net: Single-Frame Supervision for Temporal Action Localization | Code | 1
BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation | Code | 1
ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization | Code | 1
DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization | Code | 1
End-to-End Learning of Visual Representations from Uncurated Instructional Videos | Code | 1
E^2TAD: An Energy-Efficient Tracking-based Action Detector | Code | 1
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Code | 1
CBR-Net: Cascade Boundary Refinement Network for Action Detection: Submission to ActivityNet Challenge 2020 (Task 1) | Code | 1
Actionness Inconsistency-guided Contrastive Learning for Weakly-supervised Temporal Action Localization | Code | 1
A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization | Code | 1
CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning | Code | 1
Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval | Code | 1
Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events | Code | 1
Few-Shot Temporal Action Localization with Query Adaptive Transformer | Code | 1
Multi-Granularity Hand Action Detection | Code | 1
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval | Code | 1

No leaderboard results yet.