SOTAVerified

Action Localization

Action localization is the task of finding the spatial and temporal coordinates of an action in a video. An action localization model identifies the frames in which an action starts and ends, and returns the x, y coordinates of the action in those frames. The coordinates change from frame to frame as the object performing the action moves.
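As a rough illustration of the description above, the output of such a model can be pictured as a label plus one bounding box per frame. The sketch below is not taken from any paper listed here; the `ActionTube` class, the `(x_min, y_min, x_max, y_max)` box convention, and the sample numbers are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class ActionTube:
    """One localized action: a label plus a box for every frame it spans."""

    label: str
    boxes: Dict[int, Box]  # frame index -> box around the actor

    @property
    def start_frame(self) -> int:
        # Temporal start: the earliest frame that has a box.
        return min(self.boxes)

    @property
    def end_frame(self) -> int:
        # Temporal end: the latest frame that has a box.
        return max(self.boxes)

    def center(self, frame: int) -> Tuple[float, float]:
        # Spatial location of the action in one frame (box midpoint).
        x1, y1, x2, y2 = self.boxes[frame]
        return ((x1 + x2) / 2, (y1 + y2) / 2)


# Made-up prediction: the actor drifts right and upward over three frames.
pred = ActionTube(
    label="jump",
    boxes={
        10: (100, 50, 140, 130),
        11: (104, 46, 144, 126),
        12: (108, 44, 148, 124),
    },
)

print(pred.label, pred.start_frame, pred.end_frame)
for frame in sorted(pred.boxes):
    print(frame, pred.center(frame))
```

The temporal extent falls out of the frame indices, while the per-frame boxes capture how the spatial coordinates shift as the actor moves.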

Papers

Showing 1-50 of 369 papers

Title | Status | Hype
Zero-Shot Temporal Interaction Localization for Egocentric Videos | Code | 1
LLM-powered Query Expansion for Enhancing Boundary Prediction in Language-driven Action Localization | - | 0
CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization | - | 0
DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition | Code | 0
ProTAL: A Drag-and-Link Video Programming Framework for Temporal Action Localization | - | 0
Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges | - | 0
Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer | - | 0
Talk is Not Always Cheap: Promoting Wireless Sensing Models with Text Prompts | Code | 0
Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization | - | 0
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos | Code | 1
Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues | - | 0
XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses | Code | 1
Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction | - | 0
A Multimodal Dataset for Enhancing Industrial Task Monitoring and Engagement Prediction | Code | 0
Boosting Point-Supervised Temporal Action Localization through Integrating Query Reformation and Optimal Transport | - | 0
Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models | - | 0
DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments | - | 0
Generalized Uncertainty-Based Evidential Fusion with Hybrid Multi-Head Attention for Weak-Supervised Temporal Action Localization | Code | 0
Stitch Contrast and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos | - | 0
Temporal Action Localization with Cross Layer Task Decoupling and Refinement | Code | 1
Multilevel semantic and adaptive actionness learning for weakly supervised temporal action localization | Code | 0
Rethinking Top Probability from Multi-view for Distracted Driver Behaviour Localization | - | 0
IMUVIE: Pickup Timeline Action Localization via Motion Movies | - | 0
Can MLLMs Guide Weakly-Supervised Temporal Action Localization Tasks? | - | 0
Zero-shot Action Localization via the Confidence of Large Vision-Language Models | - | 0
Transformer with Controlled Attention for Synchronous Motion Captioning | Code | 0
Unified Framework with Consistency across Modalities for Human Activity Recognition | Code | 0
Open-Vocabulary Action Localization with Iterative Visual Prompting | Code | 1
FMI-TAL: Few-shot Multiple Instances Temporal Action Localization by Probability Distribution Learning and Interval Cluster Refinement | Code | 0
Towards Completeness: A Generalizable Action Proposal Generator for Zero-Shot Temporal Action Localization | Code | 1
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization | Code | 1
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization | Code | 1
Online Temporal Action Localization with Memory-Augmented Transformer | - | 0
Semi-Supervised Pipe Video Temporal Defect Interval Localization | - | 0
Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism | Code | 1
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos | Code | 1
Full-Stage Pseudo Label Quality Enhancement for Weakly-supervised Temporal Action Localization | Code | 0
Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization | - | 0
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization | Code | 1
Referring Atomic Video Action Recognition | Code | 1
The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Code | 2
Open-Vocabulary Temporal Action Localization using Multimodal Guidance | - | 0
Self-supervised Multi-actor Social Activity Understanding in Streaming Videos | - | 0
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Code | 1
ViTALS: Vision Transformer for Action Localization in Surgical Nephrectomy | - | 0
SFMViT: SlowFast Meet ViT in Chaotic World | Code | 1
STAT: Towards Generalizable Temporal Action Localization | - | 0
DeepLocalization: Using change point detection for Temporal Action Localization | - | 0
Weakly supervised temporal action localization with actionness-guided false positive suppression | Code | 0
Localizing Moments of Actions in Untrimmed Videos of Infants with Autism Spectrum Disorder | - | 0

No leaderboard results yet.