SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 201–210 of 1149 papers

Title	Date	Tasks	Status	Hype
BEARCUBS: A benchmark for computer-using web agents	Mar 10, 2025	Video Understanding	—Unverified	0
ALLVB: All-in-One Long Video Understanding Benchmark	Mar 10, 2025	AllVideo Understanding	—Unverified	0
Towards Fine-Grained Video Question Answering	Mar 10, 2025	Language ModelingLanguage Modelling	—Unverified	0
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos	Mar 9, 2025	Action LocalizationBoundary Detection	CodeCode Available	1
Unified Reward Model for Multimodal Understanding and Generation	Mar 7, 2025	Image Generationmodel	CodeCode Available	4
EgoLife: Towards Egocentric Life Assistant	Mar 5, 2025	Question AnsweringVideo Understanding	CodeCode Available	3
Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection	Mar 5, 2025	Anomaly DetectionObject	—Unverified	0
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning	Mar 2, 2025	Large Language ModelMulti-Instance Retrieval	CodeCode Available	1
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models	Feb 28, 2025	Action UnderstandingText-to-Video Generation	—Unverified	0
PreMind: Multi-Agent Video Understanding for Advanced Indexing of Presentation-style Videos	Feb 28, 2025	Question AnsweringVideo Understanding	—Unverified	0

Show:10 25 50

← PrevPage 21 of 115Next →

No leaderboard results yet.