Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1131–1140 of 1149 papers

Title	Date	Tasks	Status	Hype
Audio Caption in a Car Setting with a Sentence-Level Loss	May 31, 2019	Audio captioning, Decoder	Code Available	0
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models	May 13, 2025	Form, Multiple-choice	Code Available	0
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains	Dec 8, 2019	Action Recognition, Data Augmentation	Code Available	0
Detect-and-Track: Efficient Pose Estimation in Videos	Dec 26, 2017	Human Detection, Keypoint Estimation	Code Available	0
MINOTAUR: Multi-task Video Grounding From Multimodal Queries	Feb 16, 2023	Action Detection, Sentence	Code Available	0
AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding	Jun 16, 2025	Optical Character Recognition (OCR), RAG	Code Available	0
Deep Learning Methods for Efficient Large Scale Video Labeling	Jun 14, 2017	Deep Learning, Video Understanding	Code Available	0
Creative Flow+ Dataset	Jun 1, 2019	3D Character Animation From A Single Photo, Depth Estimation	Code Available	0
Contextual Explainable Video Representation: Human Perception-based Understanding	Dec 12, 2022	Action Detection, Action Recognition	Code Available	0
A Challenge to Build Neuro-Symbolic Video Agents	May 20, 2025	Scene Classification, Video Retrieval	Code Available	0


No leaderboard results yet.