SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 391–400 of 1149 papers

Title | Status | Hype
Learning Temporally Causal Latent Processes from General Temporal Data | Code | 1
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models | Code | 1
Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives | Code | 1
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer | Code | 1
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos | Code | 1
CyberV: Cybernetics for Test-time Scaling in Video Understanding | Code | 1
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning | Code | 1
TokenLearner: Adaptive Space-Time Tokenization for Videos | Code | 1
Towards Long-Form Video Understanding | Code | 1
VideoMamba: Spatio-Temporal Selective State Space Model | Code | 1

No leaderboard results yet.