SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 626–650 of 1149 papers

Title | Status | Hype
Contrastive Language-Action Pre-training for Temporal Localization | | 0
Contrastive Language Video Time Pre-training | | 0
CoS: Chain-of-Shot Prompting for Long Video Understanding | | 0
CRCL: Causal Representation Consistency Learning for Anomaly Detection in Surveillance Videos | | 0
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization | | 0
Cross-Class Relevance Learning for Temporal Concept Localization | | 0
CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding | | 0
CTM: Collaborative Temporal Modeling for Action Recognition | | 0
Cultivating DNN Diversity for Large Scale Video Labelling | | 0
Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data | | 0
Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model | | 0
Cycle-Contrast for Self-Supervised Video Representation Learning | | 0
DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description | | 0
Deep learning for action spotting in association football videos | | 0
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation | | 0
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding | | 0
DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding | | 0
Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection | | 0
Development of a MultiModal Annotation Framework and Dataset for Deep Video Understanding | | 0
Discerning Generic Event Boundaries in Long-Form Wild Videos | | 0
Discrete neural representations for explainable anomaly detection | | 0
Disentangle and denoise: Tackling context misalignment for video moment retrieval | | 0
Distantly Supervised Semantic Text Detection and Recognition for Broadcast Sports Videos Understanding | | 0
DLM-VMTL: A Double Layer Mapper for heterogeneous data video Multi-task prompt learning | | 0
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition | | 0

No leaderboard results yet.