
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 626–650 of 1149 papers

Title	Date	Tasks	Status
Contrastive Language-Action Pre-training for Temporal Localization	Apr 26, 2022	Action Localization, Contrastive Learning	Unverified
Contrastive Language Video Time Pre-training	Jun 4, 2024	Action Recognition, Contrastive Learning	Unverified
CoS: Chain-of-Shot Prompting for Long Video Understanding	Feb 10, 2025	Video Understanding	Unverified
CRCL: Causal Representation Consistency Learning for Anomaly Detection in Surveillance Videos	Mar 24, 2025	Anomaly Detection, Anomaly Detection In Surveillance Videos	Unverified
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization	Jan 1, 2021	Action Localization, Video Understanding	Unverified
Cross-Class Relevance Learning for Temporal Concept Localization	Nov 19, 2019	Feature Engineering, Video Understanding	Unverified
CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding	Jan 17, 2024	Contrastive Learning, point cloud video understanding	Unverified
CTM: Collaborative Temporal Modeling for Action Recognition	Feb 8, 2020	Action Recognition, Video Understanding	Unverified
Cultivating DNN Diversity for Large Scale Video Labelling	Jul 13, 2017	Diversity, Video Understanding	Unverified
Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data	Jan 17, 2020	Graph Learning, Video Understanding	Unverified
Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model	Jan 29, 2024	Action Detection, Action Localization	Unverified
Cycle-Contrast for Self-Supervised Video Representation Learning	Oct 28, 2020	Action Recognition, Contrastive Learning	Unverified
DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description	Mar 31, 2025	Video Description, Video Understanding	Unverified
Deep learning for action spotting in association football videos	Oct 2, 2024	Action Spotting, Benchmarking	Unverified
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation	Jul 3, 2018	Instance Segmentation, Semantic Segmentation	Unverified
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding	May 23, 2025	Form, Question Answering	Unverified
DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding	May 19, 2018	Action Recognition In Videos, Gesture Recognition	Unverified
Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection	Jul 29, 2020	object-detection, Object Detection	Unverified
Development of a MultiModal Annotation Framework and Dataset for Deep Video Understanding	Jun 1, 2022	Knowledge Graphs, Video Understanding	Unverified
Discerning Generic Event Boundaries in Long-Form Wild Videos	Jun 18, 2021	Boundary Detection, Form	Unverified
Discrete neural representations for explainable anomaly detection	Dec 10, 2021	Anomaly Detection, Object	Unverified
Disentangle and denoise: Tackling context misalignment for video moment retrieval	Aug 14, 2024	Denoising, Disentanglement	Unverified
Distantly Supervised Semantic Text Detection and Recognition for Broadcast Sports Videos Understanding	Oct 31, 2021	Action Recognition, Text Detection	Unverified
DLM-VMTL: A Double Layer Mapper for heterogeneous data video Multi-task prompt learning	Aug 29, 2024	Multi-Task Learning, Prompt Learning	Unverified
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition	Jan 11, 2019	Action Classification, Action Recognition	Unverified

