SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 681–690 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search	Dec 9, 2021	Neural Architecture SearchVideo Recognition	—Unverified	0	0
AVD2: Accident Video Diffusion for Accident Video Description	Feb 20, 2025	Autonomous DrivingScene Understanding	—Unverified	0	0
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding	Mar 24, 2024	Dense Video CaptioningTemporal Localization	—Unverified	0	0
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs	Jun 5, 2025	BenchmarkingVideo Understanding	—Unverified	0	0
AVT: Audio-Video Transformer for Multimodal Action Recognition	Sep 22, 2022	Action RecognitionAudio Classification	—Unverified	0	0
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation	Aug 1, 2022	ObjectOptical Flow Estimation	—Unverified	0	0
BEARCUBS: A benchmark for computer-using web agents	Mar 10, 2025	Video Understanding	—Unverified	0	0
BERT for Large-scale Video Segment Classification with Test-time Augmentation	Dec 2, 2019	General ClassificationVideo Understanding	—Unverified	0	0
Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation	Jul 8, 2025	Depth EstimationDepth Prediction	—Unverified	0	0
Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection	Dec 6, 2024	GPUMulti-Object Tracking	—Unverified	0	0

Show:10 25 50

← PrevPage 69 of 115Next →

No leaderboard results yet.