
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 76–100 of 1149 papers

Title	Date	Tasks	Status	Hype
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1	Mar 31, 2025	Logical Reasoning, Multiple-choice	Code Available	2
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model	Mar 27, 2025	EgoSchema, Language Modeling	Code Available	2
ViSpeak: Visual Instruction Feedback in Streaming Videos	Mar 17, 2025	Streaming video understanding, Video Understanding	Code Available	2
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding	Mar 16, 2025	Video Understanding	Code Available	2
QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension	Mar 11, 2025	AutoML, Decoder	Code Available	2
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding	Feb 15, 2025	Question Answering, Streaming video understanding	Code Available	2
AIN: The Arabic INclusive Large Multimodal Model	Jan 31, 2025	document understanding, model	Code Available	2
TinyLLaVA-Video: A Simple Framework of Small-scale Large Multimodal Models for Video Understanding	Jan 26, 2025	Video Understanding	Code Available	2
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge	Jan 23, 2025	Scheduling, Streaming video understanding	Code Available	2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Jan 21, 2025	Video Understanding	Code Available	2
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?	Jan 9, 2025	Benchmarking, Video Understanding	Code Available	2
Adaptive Keyframe Sampling for Long Video Understanding	Jan 1, 2025	Video Understanding	Code Available	2
Online Video Understanding: OVBench and VideoChat-Online	Dec 31, 2024	Autonomous Driving, Question Answering	Code Available	2
FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models	Dec 30, 2024	Question Answering, Token Reduction	Code Available	2
PruneVid: Visual Token Pruning for Efficient Video Large Language Models	Dec 20, 2024	Video Understanding	Code Available	2
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models	Dec 18, 2024	Reasoning Segmentation, Segmentation	Code Available	2
Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition	Dec 15, 2024	Computational Efficiency, Video Recognition	Code Available	2
Neptune: The Long Orbit to Benchmarking Long Video Understanding	Dec 12, 2024	Benchmarking, Multimodal Reasoning	Code Available	2
LinVT: Empower Your Image-level Large Language Model to Understand Videos	Dec 6, 2024	Language Modeling, Language Modelling	Code Available	2
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning	Dec 4, 2024	Video Understanding	Code Available	2
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos	Nov 29, 2024	Boundary Detection, Dense Video Captioning	Code Available	2
TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability	Nov 27, 2024	Temporal Localization, Video Understanding	Code Available	2
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding	Nov 6, 2024	Image Comprehension, Streaming video understanding	Code Available	2
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance	Nov 4, 2024	Caption Generation, Multiple-choice	Code Available	2
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning	Oct 25, 2024	EgoSchema, Hallucination	Code Available	2

