SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 151–160 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Grounded Question-Answering in Long Egocentric Videos	Dec 11, 2023	Video GroundingVideo Question Answering	CodeCode Available	1	5
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization	Aug 12, 2024	Action LocalizationTemporal Action Localization	CodeCode Available	1	5
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding	Sep 27, 2024	Video UnderstandingVisual Reasoning	CodeCode Available	1	5
A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions	Apr 21, 2022	Action DetectionVideo Understanding	CodeCode Available	1	5
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts	May 20, 2025	Caption GenerationRetrieval	CodeCode Available	1	5
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization	Jun 14, 2020	Action DetectionAction Localization	CodeCode Available	1	5
From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering	May 30, 2022	counterfactualDescriptive	CodeCode Available	1	5
A Multigrid Method for Efficiently Training Video Models	Dec 2, 2019	Action DetectionAction Recognition	CodeCode Available	1	5
From My View to Yours: Ego-Augmented Learning in Large Vision Language Models for Understanding Exocentric Daily Living Activities	Jan 10, 2025	Human-Object Interaction DetectionKnowledge Distillation	CodeCode Available	1	5
Long Movie Clip Classification with State-Space Video Models	Apr 4, 2022	ClassificationDecoder	CodeCode Available	1	5

Show:10 25 50

← PrevPage 16 of 115Next →

No leaderboard results yet.