
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 726–750 of 1149 papers

Title	Date	Tasks	Status
Future semantic segmentation of time-lapsed videos with large temporal displacement	Dec 27, 2018	Segmentation, Semantic Segmentation	Unverified
Gameplay Highlights Generation	May 12, 2025	Event Detection, Highlight Detection	Unverified
Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention	Apr 10, 2024	Action Anticipation, Graph Neural Network	Unverified
Generating the Future With Adversarial Transformers	Jul 1, 2017	Video Understanding	Unverified
Generating Videos with Scene Dynamics	Sep 8, 2016	Action Classification, Future prediction	Unverified
Generative Frame Sampler for Long Video Understanding	Mar 12, 2025	Video Understanding	Unverified
Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning	Jun 1, 2018	Action Recognition, Representation Learning	Unverified
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-grained Video-language Learning	Dec 10, 2024	cross-modal alignment, Video Understanding	Unverified
Global Motion Understanding in Large-Scale Video Object Segmentation	May 11, 2024	Instance Segmentation, Optical Flow Estimation	Unverified
Global Self-Attention Networks	Jan 1, 2021	Video Understanding	Unverified
Global Self-Attention Networks for Image Recognition	Oct 6, 2020	Video Understanding	Unverified
GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding	Jun 14, 2024	Activity Recognition, MMR total	Unverified
GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation	Nov 25, 2023	Instruction Following, Language Modeling	Unverified
Gradient Frequency Modulation for Visually Explaining Video Understanding Models	Nov 1, 2021	Action Recognition, Temporal Action Localization	Unverified
GraphVid: It Only Takes a Few Nodes to Understand a Video	Jul 4, 2022	Superpixels, Video Understanding	Unverified
Grounded Objects and Interactions for Video Captioning	Nov 16, 2017	Object, Scene Understanding	Unverified
Grounded Video Situation Recognition	Oct 19, 2022	Descriptive, Structured Prediction	Unverified
Grounding Action Descriptions in Videos	Jan 1, 2013	Semantic Textual Similarity, Video Understanding	Unverified
Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection	Apr 20, 2025	Action Detection, Decoder	Unverified
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning	Jun 19, 2025	Multimodal Reasoning, reinforcement-learning	Unverified
GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement	Jun 19, 2024	Video Understanding	Unverified
H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding	Mar 31, 2025	Video Understanding	Unverified
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models	Feb 28, 2025	Action Understanding, Text-to-Video Generation	Unverified
Harnessing Object and Scene Semantics for Large-Scale Video Understanding	Jun 1, 2016	Action Recognition, Clustering	Unverified
HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions	Sep 16, 2024	Dimensionality Reduction, Video Understanding	Unverified

