SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 81-90 of 1149 papers

Title | Status | Hype
A Challenge to Build Neuro-Symbolic Video Agents | Code | 0
Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models | Code | 2
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts | Code | 1
Domain Adaptation of VLM for Soccer Video Understanding | - | 0
Breaking Down Video LLM Benchmarks: Knowledge, Spatial Perception, or True Temporal Understanding? | - | 0
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation | Code | 4
Temporal-Oriented Recipe for Transferring Large Vision-Language Model to Video Understanding | Code | 0
From Shots to Stories: LLM-Assisted Video Editing with Unified Language Representations | - | 0
SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation | - | 0
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models | Code | 0
Page 9 of 115

No leaderboard results yet.