SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1061-1070 of 1149 papers

Title | Status | Hype
ReWind: Understanding Long Videos with Instructed Learnable Memory | | 0
SA-NET.v2: Real-time vehicle detection from oblique UAV images with use of uncertainty estimation in deep meta-learning | | 0
SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context | | 0
Scene-centric Joint Parsing of Cross-view Videos | | 0
Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis | | 0
SceneRAG: Scene-level Retrieval-Augmented Generation for Video Understanding | | 0
MM-SEAL: A Large-scale Video Dataset of Multi-person Multi-grained Spatio-temporally Action Localization | | 0
SEAL: Semantic Attention Learning for Long Video Representation | | 0
Search-Map-Search: A Frame Selection Paradigm for Action Recognition | | 0
Seed1.5-VL Technical Report | | 0

No leaderboard results yet.