SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 961-970 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos | | 0 |
| Video Domain Incremental Learning for Human Action Recognition in Home Environments | | 0 |
| Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models | | 0 |
| VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding | | 0 |
| VideoGLUE: Video General Understanding Evaluation of Foundation Models | | 0 |
| Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding | | 0 |
| VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding | | 0 |
| VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models | | 0 |
| VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding | | 0 |
| Video Language Model Pretraining with Spatio-temporal Masking | | 0 |

No leaderboard results yet.