SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 131–140 of 1149 papers

Title	Date	Tasks	Status	Hype
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding	Jul 31, 2023	Multiple-choiceQuestion Answering	CodeCode Available	2
Neptune: The Long Orbit to Benchmarking Long Video Understanding	Dec 12, 2024	BenchmarkingMultimodal Reasoning	CodeCode Available	2
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding	Jul 22, 2024	Multiple-choiceQuestion Answering	CodeCode Available	2
LongVLM: Efficient Long Video Understanding via Large Language Models	Apr 4, 2024	Question AnsweringVideo Question Answering	CodeCode Available	2
LVBench: An Extreme Long Video Understanding Benchmark	Jun 12, 2024	Decision MakingVideo Understanding	CodeCode Available	2
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark	May 30, 2024	DeepFake DetectionMamba	CodeCode Available	2
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs	Jun 27, 2025	Question AnsweringVideo Question Answering	CodeCode Available	2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Jan 21, 2025	Video Understanding	CodeCode Available	2
Leveraging Temporal Contextualization for Video Action Recognition	Apr 15, 2024	Action RecognitionTemporal Action Localization	CodeCode Available	2
LinVT: Empower Your Image-level Large Language Model to Understand Videos	Dec 6, 2024	Language ModelingLanguage Modelling	CodeCode Available	2

Show:10 25 50

← PrevPage 14 of 115Next →

No leaderboard results yet.