SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 711–720 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding | | 0 |
| Towards Holistic Language-video Representation: the language model-enhanced MSR-Video to Text Dataset | | 0 |
| GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement | | 0 |
| DrVideo: Document Retrieval Based Long Video Understanding | | 0 |
| Hallucination Mitigation Prompts Long-term Video Understanding | Code | 0 |
| VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment | | 0 |
| Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model | Code | 0 |
| GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | | 0 |
| Localizing Events in Videos with Multimodal Queries | | 0 |
| LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living | | 0 |

No leaderboard results yet.