SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 131-140 of 1149 papers

Title | Status | Hype
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | Code | 2
Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding | Code | 2
LongVLM: Efficient Long Video Understanding via Large Language Models | Code | 2
LVBench: An Extreme Long Video Understanding Benchmark | Code | 2
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | Code | 2
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs | Code | 2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding | Code | 2
Leveraging Temporal Contextualization for Video Action Recognition | Code | 2
LinVT: Empower Your Image-level Large Language Model to Understand Videos | Code | 2

No leaderboard results yet.