SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 711–720 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| Causal Reasoning Meets Visual Representation Learning: A Prospective Study | | 0 |
| CAVALRY-V: A Large-Scale Generator Framework for Adversarial Attacks on Video MLLMs | | 0 |
| CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding | | 0 |
| Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis | | 0 |
| Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos | | 0 |
| ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System | | 0 |
| Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI | | 0 |
| CinePile: A Long Video Question Answering Dataset and Benchmark | | 0 |
| Clapper: Compact Learning and Video Representation in VLMs | | 0 |
| ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation | | 0 |

No leaderboard results yet.