SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 526-550 of 1149 papers

Title | Status | Hype
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision | Code | 0
LLaVA-OneVision: Easy Visual Task Transfer | Code | 0
METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding | Code | 0
MINOTAUR: Multi-task Video Grounding From Multimodal Queries | Code | 0
Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022 | Code | 0
A Challenge to Build Neuro-Symbolic Video Agents | Code | 0
Representation Flow for Action Recognition | Code | 0
Learning to Visually Connect Actions and their Effects |  | 0
Learning to Focus on the Foreground for Temporal Sentence Grounding |  | 0
Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey |  | 0
Learning text-to-video retrieval from image captioning |  | 0
Learning Space-Time Semantic Correspondences |  | 0
An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform |  | 0
Learning reusable concepts across different egocentric video understanding tasks |  | 0
EAGLE: Egocentric AGgregated Language-video Engine |  | 0
Learning Object State Changes in Videos: An Open-World Perspective |  | 0
Learning Higher-order Object Interactions for Keypoint-based Video Understanding |  | 0
Learning from Multiple Sources for Video Summarisation |  | 0
DynTok: Dynamic Compression of Visual Tokens for Efficient and Effective Video Understanding |  | 0
BioVL-QR: Egocentric Biochemical Vision-and-Language Dataset Using Micro QR Codes |  | 0
An Attempt towards Interpretable Audio-Visual Video Captioning |  | 0
AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction |  | 0
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment |  | 0
Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking |  | 0
DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding |  | 0

No leaderboard results yet.