
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 531–540 of 1149 papers

Title	Date	Tasks	Status	Hype
Highlight Timestamp Detection Model for Comedy Videos via Multimodal Sentiment Analysis	May 28, 2021	Multimodal Sentiment Analysis, Object Recognition	Unverified	0
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding	May 23, 2025	Form, Question Answering	Unverified	0
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search	Dec 9, 2021	Neural Architecture Search, Video Recognition	Unverified	0
MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD	Jun 11, 2024	Video Recognition, Video Understanding	Unverified	0
HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding	Dec 5, 2023	Diversity, Graph Generation	Unverified	0
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation	Jul 3, 2018	Instance Segmentation, Semantic Segmentation	Unverified	0
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding	Jan 1, 2025	Question Answering, Video Understanding	Unverified	0
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training	Jul 5, 2020	Decoder, Question Answering	Unverified	0
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark	Oct 4, 2024	Image Captioning, Video Understanding	Unverified	0
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions	May 28, 2024	Action Recognition, Video Recognition	Unverified	0


No leaderboard results yet.