SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 726–750 of 1149 papers

Title | Status | Hype
Future semantic segmentation of time-lapsed videos with large temporal displacement | | 0
Gameplay Highlights Generation | | 0
Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention | | 0
Generating the Future With Adversarial Transformers | | 0
Generating Videos with Scene Dynamics | | 0
Generative Frame Sampler for Long Video Understanding | | 0
Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning | | 0
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-grained Video-language Learning | | 0
Global Motion Understanding in Large-Scale Video Object Segmentation | | 0
Global Self-Attention Networks | | 0
Global Self-Attention Networks for Image Recognition | | 0
GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | | 0
GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation | | 0
Gradient Frequency Modulation for Visually Explaining Video Understanding Models | | 0
GraphVid: It Only Takes a Few Nodes to Understand a Video | | 0
Grounded Objects and Interactions for Video Captioning | | 0
Grounded Video Situation Recognition | | 0
Grounding Action Descriptions in Videos | | 0
Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection | | 0
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning | | 0
GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement | | 0
H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding | | 0
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models | | 0
Harnessing Object and Scene Semantics for Large-Scale Video Understanding | | 0
HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions | | 0

No leaderboard results yet.