Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 51–60 of 1149 papers

Title	Date	Tasks	Status	Hype
Threading Keyframe with Narratives: MLLMs as Strong Long Video Comprehenders	May 30, 2025	Video Understanding	Unverified	0
SiLVR: A Simple Language-based Video Reasoning Framework	May 30, 2025	Math, MME	Code Available	1
DisTime: Distribution-based Time Representation for Video Large Language Models	May 30, 2025	Temporal Localization, Video Understanding	Code Available	1
VUDG: A Dataset for Video Understanding Domain Generalization	May 30, 2025	Domain Generalization, Multiple-choice	Unverified	0
VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software	May 30, 2025	Question Answering, Spatial Reasoning	Code Available	1
Time Blindness: Why Video-Language Models Can't See What Humans Can?	May 30, 2025	Temporal Sequences, Video Understanding	Unverified	0
Multi-RAG: A Multimodal Retrieval-Augmented Generation System for Adaptive Video Understanding	May 29, 2025	RAG, Retrieval-augmented Generation	Unverified	0
MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection	May 29, 2025	Image Classification	Unverified	0
ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding	May 29, 2025	Avg, Video Understanding	Code Available	0
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models	May 29, 2025	Self-Supervised Learning, Video Generation	Code Available	2


No leaderboard results yet.