Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 391–400 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Learning Temporally Causal Latent Processes from General Temporal Data	Oct 11, 2021	Causal Discovery, Representation Learning	Code Available	1	5
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models	Mar 20, 2025	Multiple-choice, Video Understanding	Code Available	1	5
Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives	Feb 4, 2025	Video Understanding	Code Available	1	5
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer	Apr 29, 2023	Decoder, Highlight Detection	Code Available	1	5
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos	Mar 9, 2025	Action Localization, Boundary Detection	Code Available	1	5
CyberV: Cybernetics for Test-time Scaling in Video Understanding	Jun 9, 2025	Video Understanding	Code Available	1	5
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning	Nov 27, 2023	Action Classification, Action Recognition	Code Available	1	5
TokenLearner: Adaptive Space-Time Tokenization for Videos	Dec 1, 2021	Representation Learning, Video Recognition	Code Available	1	5
Towards Long-Form Video Understanding	Jun 21, 2021	Action Recognition, Form	Code Available	1	5
VideoMamba: Spatio-Temporal Selective State Space Model	Jul 11, 2024	Mamba, model	Code Available	1	5

