Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 741–750 of 1149 papers

Title	Date	Tasks	Status	Hype
Grounded Objects and Interactions for Video Captioning	Nov 16, 2017	Object, Scene Understanding	Unverified	0
Grounded Video Situation Recognition	Oct 19, 2022	Descriptive, Structured Prediction	Unverified	0
Grounding Action Descriptions in Videos	Jan 1, 2013	Semantic Textual Similarity, Video Understanding	Unverified	0
Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection	Apr 20, 2025	Action Detection, Decoder	Unverified	0
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning	Jun 19, 2025	Multimodal Reasoning, reinforcement-learning	Unverified	0
GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement	Jun 19, 2024	Video Understanding	Unverified	0
H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding	Mar 31, 2025	Video Understanding	Unverified	0
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models	Feb 28, 2025	Action Understanding, Text-to-Video Generation	Unverified	0
Harnessing Object and Scene Semantics for Large-Scale Video Understanding	Jun 1, 2016	Action Recognition, Clustering	Unverified	0
HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions	Sep 16, 2024	Dimensionality Reduction, Video Understanding	Unverified	0

No leaderboard results yet.