SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 876–900 of 1149 papers

Title	Date	Tasks	Status
Impossible Videos	Mar 18, 2025	counterfactualVideo Generation	—Unverified
Improving LLM Video Understanding with 16 Frames Per Second	Mar 18, 2025	MMEVideo MME	—Unverified
Improving Video Model Transfer With Dynamic Representation Learning	Jan 1, 2022	Action ClassificationKnowledge Distillation	—Unverified
Inductive Attention for Video Action Anticipation	Dec 17, 2022	Action AnticipationAction Recognition	—Unverified
InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding	Jun 18, 2025	GPUStreaming video understanding	—Unverified
InstructionBench: An Instructional Video Understanding Benchmark	Apr 7, 2025	Common Sense ReasoningMultiple-choice	—Unverified
Instrument-tissue Interaction Detection Framework for Surgical Video Understanding	Mar 30, 2024	Video Understanding	—Unverified
Integrated Object Detection and Tracking with Tracklet-Conditioned Detection	Nov 27, 2018	Objectobject-detection	—Unverified
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output	Jul 3, 2024	ArticlesImage Comprehension	—Unverified
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model	Jan 21, 2025	Instruction FollowingMathematical Reasoning	—Unverified
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation	Jul 13, 2023	Action RecognitionContrastive Learning	—Unverified
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling	Jan 21, 2025	Object TrackingReferring Expression Segmentation	—Unverified
InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model	Feb 26, 2025	Video Quality AssessmentVideo Understanding	—Unverified
Interpretable Action Recognition on Hard to Classify Actions	Sep 19, 2024	Action RecognitionDepth Estimation	—Unverified
InterRVOS: Interaction-aware Referring Video Object Segmentation	Jun 3, 2025	8kObject	—Unverified
In-the-Wild Video Question Answering	Oct 1, 2022	Evidence SelectionQuestion Answering	—Unverified
Inverse Compositional Learning for Weakly-supervised Relation Grounding	Jan 1, 2023	RelationVideo Understanding	—Unverified
IPAD: Industrial Process Anomaly Detection Dataset	Apr 23, 2024	Anomaly DetectionVideo Anomaly Detection	—Unverified
IPFormer-VideoLLM: Enhancing Multi-modal Video Understanding for Multi-shot Scenes	Jun 26, 2025	AttributeQuestion Answering	—Unverified
IQViC: In-context, Question Adaptive Vision Compressor for Long-term Video Understanding LMMs	Dec 13, 2024	Question AnsweringVideo Question Answering	—Unverified
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?	Apr 2, 2025	Action RecognitionAll	—Unverified
Joint Engagement Classification using Video Augmentation Techniques for Multi-person Human-robot Interaction	Dec 28, 2022	Data AugmentationFace Swapping	—Unverified
Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals	Jul 1, 2017	Video Understanding	—Unverified
Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input	Aug 28, 2024	Language ModelingLanguage Modelling	—Unverified
KeyVideoLLM: Towards Large-scale Video Keyframe Selection	Jul 3, 2024	Data CompressionManagement	—Unverified

Show:10 25 50

← PrevPage 36 of 46Next →

No leaderboard results yet.