SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 521-530 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding | | 0 |
| FE-Adapter: Adapting Image-based Emotion Classifiers to Videos | | 0 |
| MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding | | 0 |
| DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding | | 0 |
| AVD2: Accident Video Diffusion for Accident Video Description | | 0 |
| How Well Can General Vision-Language Models Learn Medicine By Watching Public Educational Videos? | | 0 |
| How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos | | 0 |
| Hierarchical Video Frame Sequence Representation with Deep Convolutional Graph Network | | 0 |
| Memory Consolidation Enables Long-Context Video Understanding | | 0 |
| Memory-Guided Semantic Learning Network for Temporal Sentence Grounding | | 0 |

No leaderboard results yet.