Video MME

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–26 of 26 papers

Title	Date	Tasks	Status	Hype
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation	May 20, 2025	MMEMultiple-choice	CodeCode Available	4
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models	Apr 21, 2025	MMEVideo MME	CodeCode Available	4
Long Context Transfer from Language to Vision	Jun 24, 2024	Language ModelingLanguage Modelling	CodeCode Available	4
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams	Jun 30, 2025	cross-modal alignmentEgoSchema	CodeCode Available	3
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos	Apr 24, 2025	MMEVideo MME	CodeCode Available	3
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition	Dec 12, 2024	EgoSchema	CodeCode Available	3
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension	Nov 20, 2024	GPUMME	CodeCode Available	3
VideoDeepResearch: Long Video Understanding With Agentic Tool Using	Jun 12, 2025	MMEVideo MME	CodeCode Available	2
SpaceR: Reinforcing MLLMs in Video Spatial Reasoning	Apr 2, 2025	MMESpatial Reasoning	CodeCode Available	2
QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension	Mar 11, 2025	AutoMLDecoder	CodeCode Available	2
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos	May 29, 2024	EgoSchemaMME	CodeCode Available	2
SiLVR: A Simple Language-based Video Reasoning Framework	May 30, 2025	MathMME	CodeCode Available	1
FRAG: Frame Selection Augmented Generation for Long Video and Long Document Understanding	Apr 24, 2025	document understandingMME	CodeCode Available	1
BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding	Mar 27, 2025	FormLanguage Modeling	CodeCode Available	1
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis	May 31, 2024	MMEVideo MME	CodeCode Available	1
Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs	Jun 27, 2025	MMEVideo MME	—Unverified	0
DynTok: Dynamic Compression of Visual Tokens for Efficient and Effective Video Understanding	Jun 4, 2025	MMEVideo MME	—Unverified	0
An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes	Apr 21, 2025	MMEVideo MME	—Unverified	0
Improving LLM Video Understanding with 16 Frames Per Second	Mar 18, 2025	MMEVideo MME	—Unverified	0
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding	Mar 17, 2025	AttributeMME	—Unverified	0
Temporal Preference Optimization for Long-Form Video Understanding	Jan 23, 2025	FormMME	—Unverified	0
Apollo: An Exploration of Video Understanding in Large Multimodal Models	Dec 13, 2024	MMEVideo MME	—Unverified	0
SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context	Nov 25, 2024	Large Language ModelMME	—Unverified	0
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification	Oct 11, 2024	MMEQuantization	—Unverified	0
Temporal Reasoning Transfer from Text to Video	Oct 8, 2024	DiagnosticMME	—Unverified	0
DrVideo: Document Retrieval Based Long Video Understanding	Jun 18, 2024	document understandingEgoSchema	—Unverified	0

Show:10 25 50

No leaderboard results yet.