SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 801-810 of 1149 papers

Title | Status | Hype
Vamos: Versatile Action Models for Video Understanding | Code | 0
SPOT! Revisiting Video-Language Models for Event Understanding |  | 0
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab |  | 0
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection |  | 0
Beyond still images: Temporal features and input variance resilience |  | 0
Videoprompter: an ensemble of foundational models for zero-shot video understanding |  | 0
Query-aware Long Video Localization and Relation Discrimination for Deep Video Understanding |  | 0
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks |  | 0
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model |  | 0
Telling Stories for Common Sense Zero-Shot Action Recognition | Code | 0

No leaderboard results yet.