
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 121-130 of 1149 papers

Title | Status | Hype
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding | Code | 2
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | Code | 2
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | Code | 2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding | Code | 2
Adaptive Keyframe Sampling for Long Video Understanding | Code | 2
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model | Code | 2
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs | Code | 2
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | Code | 2
LVBench: An Extreme Long Video Understanding Benchmark | Code | 2
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding | Code | 2

No leaderboard results yet.