SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 611–620 of 1149 papers

Title | Status | Hype
CinePile: A Long Video Question Answering Dataset and Benchmark | | 0
Clapper: Compact Learning and Video Representation in VLMs | | 0
ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation | | 0
CLIP4Caption: CLIP for Video Caption | | 0
Co-attentional Transformers for Story-Based Video Understanding | | 0
COEF-VQ: Cost-Efficient Video Quality Understanding through a Cascaded Multimodal LLM Framework | | 0
CogME: A Cognition-Inspired Multi-Dimensional Evaluation Metric for Story Understanding | | 0
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization | | 0
How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs | | 0
Comprehensive Video Understanding: Video summarization with content-based video recommender design | | 0

No leaderboard results yet.