SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 876–900 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| Impossible Videos | | 0 |
| Improving LLM Video Understanding with 16 Frames Per Second | | 0 |
| Improving Video Model Transfer With Dynamic Representation Learning | | 0 |
| Inductive Attention for Video Action Anticipation | | 0 |
| InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding | | 0 |
| InstructionBench: An Instructional Video Understanding Benchmark | | 0 |
| Instrument-tissue Interaction Detection Framework for Surgical Video Understanding | | 0 |
| Integrated Object Detection and Tracking with Tracklet-Conditioned Detection | | 0 |
| InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output | | 0 |
| InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model | | 0 |
| InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | | 0 |
| InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | | 0 |
| InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model | | 0 |
| Interpretable Action Recognition on Hard to Classify Actions | | 0 |
| InterRVOS: Interaction-aware Referring Video Object Segmentation | | 0 |
| In-the-Wild Video Question Answering | | 0 |
| Inverse Compositional Learning for Weakly-supervised Relation Grounding | | 0 |
| IPAD: Industrial Process Anomaly Detection Dataset | | 0 |
| IPFormer-VideoLLM: Enhancing Multi-modal Video Understanding for Multi-shot Scenes | | 0 |
| IQViC: In-context, Question Adaptive Vision Compressor for Long-term Video Understanding LMMs | | 0 |
| Is Temporal Prompting All We Need For Limited Labeled Action Recognition? | | 0 |
| Joint Engagement Classification using Video Augmentation Techniques for Multi-person Human-robot Interaction | | 0 |
| Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals | | 0 |
| Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input | | 0 |
| KeyVideoLLM: Towards Large-scale Video Keyframe Selection | | 0 |

No leaderboard results yet.