Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 961–970 of 1149 papers

Title	Date	Tasks	Status	Hype
Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos	Jul 22, 2020	Action Recognition, Temporal Action Localization	Unverified	0
Video Domain Incremental Learning for Human Action Recognition in Home Environments	Dec 22, 2024	Action Recognition, class-incremental learning	Unverified	0
Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models	Jul 8, 2025	Future prediction, Large Language Model	Unverified	0
VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding	Apr 10, 2025	Instruction Following, Video Understanding	Unverified	0
VideoGLUE: Video General Understanding Evaluation of Foundation Models	Jul 6, 2023	Action Recognition, Temporal Localization	Unverified	0
Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding	Dec 31, 2023	Spatio-Temporal Video Grounding, Video Grounding	Unverified	0
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding	Jan 1, 2024	Spatio-Temporal Video Grounding, Video Grounding	Unverified	0
VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models	Jun 24, 2024	Hallucination, Video Understanding	Unverified	0
VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding	Jul 17, 2025	Video Grounding, Video Understanding	Unverified	0
Video Language Model Pretraining with Spatio-temporal Masking	Jan 1, 2025	Decoder, Language Modeling	Unverified	0

No leaderboard results yet.