SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 611–620 of 1149 papers

Title	Date	Tasks	Status	Hype
CinePile: A Long Video Question Answering Dataset and Benchmark	May 14, 2024	FormHuman-Object Interaction Detection	—Unverified	0
Clapper: Compact Learning and Video Representation in VLMs	May 21, 2025	Video Understanding	—Unverified	0
ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation	Mar 19, 2021	ObjectReferring Expression Segmentation	—Unverified	0
CLIP4Caption: CLIP for Video Caption	Oct 13, 2021	DecoderSentence	—Unverified	0
Co-attentional Transformers for Story-Based Video Understanding	Oct 27, 2020	Question AnsweringVideo Question Answering	—Unverified	0
COEF-VQ: Cost-Efficient Video Quality Understanding through a Cascaded Multimodal LLM Framework	Dec 11, 2024	GPULanguage Modeling	—Unverified	0
CogME: A Cognition-Inspired Multi-Dimensional Evaluation Metric for Story Understanding	Jul 21, 2021	Question AnsweringSentence	—Unverified	0
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization	Mar 22, 2025	Saliency DetectionSentence	—Unverified	0
How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs	May 6, 2024	Autonomous VehiclesVideo Understanding	—Unverified	0
Comprehensive Video Understanding: Video summarization with content-based video recommender design	Oct 30, 2019	Action RecognitionData Augmentation	—Unverified	0

Show:10 25 50

← PrevPage 62 of 115Next →

No leaderboard results yet.