SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 501–510 of 1149 papers

Title | Status | Hype
Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding | Code | 0
Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs | Code | 0
Multi-attention Networks for Temporal Localization of Video-level Labels | Code | 0
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding | Code | 0
Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos | Code | 0
Multimodal Dialogue State Tracking | Code | 0
EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization | Code | 0
CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction | Code | 0
Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognition | Code | 0
Are current long-term video understanding datasets long-term? | Code | 0

No leaderboard results yet.