SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 241-250 of 1149 papers

Title | Status | Hype
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer | Code | 1
MECD+: Unlocking Event-Level Causal Graph Discovery for Video Reasoning | Code | 1
A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector | Code | 1
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning | Code | 1
MMAD: Multi-label Micro-Action Detection in Videos | Code | 1
Grounded Question-Answering in Long Egocentric Videos | Code | 1
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding | Code | 1
Panoptic Video Scene Graph Generation | Code | 1
PAVE: Patching and Adapting Video Large Language Models | Code | 1
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding | Code | 1

No leaderboard results yet.