SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 391–400 of 1149 papers

Title	Date	Tasks	Status	Hype
Is Appearance Free Action Recognition Possible?	Jul 13, 2022	Action RecognitionOptical Flow Estimation	CodeCode Available	1
Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach	May 10, 2023	Autonomous VehiclesMonocular Visual Odometry	CodeCode Available	1
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choiceVideo Understanding	CodeCode Available	1
Isolated Sign Recognition from RGB Video using Pose Flow and Self-Attention	Jun 11, 2021	Action RecognitionSign Language Recognition	CodeCode Available	1
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization	Aug 12, 2024	Action LocalizationTemporal Action Localization	CodeCode Available	1
CyberV: Cybernetics for Test-time Scaling in Video Understanding	Jun 9, 2025	Video Understanding	CodeCode Available	1
TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes	Feb 4, 2025	Autonomous DrivingMultiple-choice	CodeCode Available	1
Learning Self-Similarity in Space and Time as a Generalized Motion for Action Recognition	Jan 1, 2021	Action RecognitionVideo Understanding	CodeCode Available	1
MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing	Nov 24, 2021	audio-visual event localizationVideo Understanding	CodeCode Available	1
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models	Oct 30, 2024	Video Understanding	CodeCode Available	1

Show:10 25 50

← PrevPage 40 of 115Next →

No leaderboard results yet.