SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 961–970 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Measure Twice, Cut Once: Grasping Video Structures and Event Semantics with LLMs for Video Temporal Localization	Mar 12, 2025	Temporal LocalizationVideo Understanding	—Unverified	0	0
Memory Consolidation Enables Long-Context Video Understanding	Feb 8, 2024	EgoSchemaVideo Understanding	—Unverified	0	0
Memory-enhanced Retrieval Augmentation for Long Video Understanding	Mar 12, 2025	RAGRetrieval	—Unverified	0	0
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding	Jan 3, 2022	SentenceTemporal Sentence Grounding	—Unverified	0	0
MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD	Jun 11, 2024	Video RecognitionVideo Understanding	—Unverified	0	0
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound	Jan 7, 2022	Action ClassificationNavigate	—Unverified	0	0
Mid-level Representation for Visual Recognition	Dec 23, 2015	object-detectionObject Detection	—Unverified	0	0
Mimic The Raw Domain: Accelerating Action Recognition in the Compressed Domain	Nov 19, 2019	Action RecognitionVideo Recognition	—Unverified	0	0
M-LLM Based Video Frame Selection for Efficient Video Understanding	Feb 27, 2025	EgoSchemaLanguage Modeling	—Unverified	0	0
MLVTG: Mamba-Based Feature Alignment and LLM-Driven Purification for Multi-Modal Video Temporal Grounding	Jun 10, 2025	Language ModelingLanguage Modelling	—Unverified	0	0

Show:10 25 50

← PrevPage 97 of 115Next →

No leaderboard results yet.