SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 511-520 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| PVChat: Personalized Video Chat with One-Shot Learning | | 0 |
| Temporal Action Detection Model Compression by Progressive Block Drop | | 0 |
| DocVideoQA: Towards Comprehensive Understanding of Document-Centric Videos through Question Answering | | 0 |
| What can Off-the-Shelves Large Multi-Modal Models do for Dynamic Scene Graph Generation? | | 0 |
| MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations | | 0 |
| FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding | | 0 |
| SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability | | 0 |
| Improving LLM Video Understanding with 16 Frames Per Second | | 0 |
| Impossible Videos | | 0 |
| Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition | | 0 |

No leaderboard results yet.