SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 241–250 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer	Apr 29, 2023	DecoderHighlight Detection	CodeCode Available	1	5
MECD+: Unlocking Event-Level Causal Graph Discovery for Video Reasoning	Jan 13, 2025	Causal DiscoveryCausal Inference	CodeCode Available	1	5
A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector	Jun 7, 2022	Action ClassificationAction Detection	CodeCode Available	1	5
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning	Jun 27, 2022	Action ClassificationAction Recognition	CodeCode Available	1	5
MMAD: Multi-label Micro-Action Detection in Videos	Jul 7, 2024	Action AnalysisAction Detection	CodeCode Available	1	5
Grounded Question-Answering in Long Egocentric Videos	Dec 11, 2023	Video GroundingVideo Question Answering	CodeCode Available	1	5
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choiceVideo Understanding	CodeCode Available	1	5
Panoptic Video Scene Graph Generation	Nov 28, 2023	Graph GenerationPanoptic Scene Graph Generation	CodeCode Available	1	5
PAVE: Patching and Adapting Video Large Language Models	Mar 25, 2025	Audio-visual Question AnsweringMulti-Task Learning	CodeCode Available	1	5
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding	Jul 30, 2022	point cloud video understandingVideo Understanding	CodeCode Available	1	5

Show:10 25 50

← PrevPage 25 of 115Next →

No leaderboard results yet.