SOTAVerified

Video Understanding

A core task in Video Understanding is to recognise and localise, in space and time, the different actions or events appearing in a video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 271-280 of 1149 papers

Title | Status | Hype
Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives | Code | 1
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark | Code | 1
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models | Code | 1
QuerYD: A video dataset with high-quality text and audio narrations | Code | 1
REVECA -- Rich Encoder-decoder framework for Video Event CAptioner | Code | 1
BehAVE: Behaviour Alignment of Video Game Encodings | Code | 1
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization | Code | 1
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection | Code | 1
Prompting Visual-Language Models for Efficient Video Understanding | Code | 1
A Simple LLM Framework for Long-Range Video Question-Answering | Code | 1
Page 28 of 115

No leaderboard results yet.