SOTAVerified

Video Understanding

A crucial task in Video Understanding is to recognise and localise, in space and time, the different actions or events that appear in a video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 851-860 of 1149 papers

Title | Status | Hype
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection | Code | 1
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search | - | 0
Prompting Visual-Language Models for Efficient Video Understanding | Code | 1
Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning | Code | 0
Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips | - | 0
TokenLearner: Adaptive Space-Time Tokenization for Videos | Code | 1
LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering | - | 0
UBoCo : Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection | - | 0
End-to-End Referring Video Object Segmentation with Multimodal Transformers | Code | 1
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning | Code | 1
