SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 801–825 of 1149 papers

Title	Date	Tasks	Status
Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges	Jul 2, 2025	Video Understanding	—Unverified
Large-Scale Video Classification with Feature Space Augmentation coupled with Learned Label Relations and Ensembling	Sep 21, 2018	General ClassificationVideo Classification	—Unverified
Large Scale Video Representation Learning via Relational Graph Clustering	Jun 1, 2020	ClusteringGraph Clustering	—Unverified
Large-Scale YouTube-8M Video Understanding with Deep Neural Networks	Jun 14, 2017	ClassificationGeneral Classification	—Unverified
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision	Apr 15, 2023	Language ModelingLanguage Modelling	—Unverified
Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection	Aug 8, 2021	Action DetectionKnowledge Distillation	—Unverified
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval	Apr 3, 2025	Information RetrievalRepresentation Learning	—Unverified
Learning Dynamic MRI Reconstruction with Convolutional Network Assisted Reconstruction Swin Transformer	Sep 19, 2023	AnatomyComputational Efficiency	—Unverified
Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking	Jun 7, 2021	Graph Neural NetworkMulti-Person Pose Estimation	—Unverified
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment	Jun 8, 2023	Video Understanding	—Unverified
Learning from Multiple Sources for Video Summarisation	Jan 13, 2015	ClusteringVideo Understanding	—Unverified
Learning Higher-order Object Interactions for Keypoint-based Video Understanding	May 16, 2023	Action LocalizationAction Recognition	—Unverified
Learning Object State Changes in Videos: An Open-World Perspective	Dec 19, 2023	Video Understanding	—Unverified
Learning reusable concepts across different egocentric video understanding tasks	May 30, 2025	Video Understanding	—Unverified
Learning Space-Time Semantic Correspondences	Jun 16, 2023	Imitation LearningSemantic correspondence	—Unverified
Learning text-to-video retrieval from image captioning	Apr 26, 2024	Image CaptioningImage Retrieval	—Unverified
Learning to Focus on the Foreground for Temporal Sentence Grounding	Oct 1, 2022	SentenceTemporal Sentence Grounding	—Unverified
Learning to Visually Connect Actions and their Effects	Jan 19, 2024	Object TrackingTask Planning	—Unverified
Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition	Jun 14, 2017	Action RecognitionOptical Flow Estimation	—Unverified
Less than Few: Self-Shot Video Instance Segmentation	Apr 19, 2022	Few-Shot LearningInstance Segmentation	—Unverified
Leveraging Foundation Models for Multimodal Graph-Based Action Recognition	May 21, 2025	Action RecognitionGraph Attention	—Unverified
Leveraging Local Temporal Information for Multimodal Scene Classification	Oct 26, 2021	ClassificationScene Classification	—Unverified
LIGAR: Lightweight General-purpose Action Recognition	Aug 30, 2021	Action RecognitionGesture Recognition	—Unverified
LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval	May 21, 2025	Autonomous DrivingQuestion Answering	—Unverified
LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering	Nov 29, 2021	DiversityQuestion Answering	—Unverified

Show:10 25 50

← PrevPage 33 of 46Next →

No leaderboard results yet.