MME

MME is a comprehensive evaluation benchmark for multimodal large language models. It measures perception and cognition abilities across 14 subtasks: existence, count, position, color, poster, celebrity, scene, landmark, artwork, OCR, commonsense reasoning, numerical calculation, text translation, and code reasoning.

Papers

Showing 51–60 of 95 papers

Title	Date	Tasks	Status
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding	Mar 17, 2025	Attribute, MME	Unverified
Re-Imagining Multimodal Instruction Tuning: A Representation View	Mar 2, 2025	Instruction Following, MME	Code Available
Ultra-High-Frequency Harmony: mmWave Radar and Event Camera Orchestrate Accurate Drone Landing	Feb 20, 2025	MME, Sensor Fusion	Unverified
AIDE: Agentically Improve Visual Language Model with Domain Experts	Feb 13, 2025	Knowledge Distillation, Language Modeling	Unverified
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency	Feb 13, 2025	Benchmarking, Math	Unverified
Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment	Feb 7, 2025	Diversity, Human-Object Interaction Detection	Unverified
Mitigating Hallucinations in Large Vision-Language Models with Internal Fact-based Contrastive Decoding	Feb 3, 2025	Attribute, MME	Unverified
MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark	Jan 28, 2025	MME, Model Optimization	Unverified
Temporal Preference Optimization for Long-Form Video Understanding	Jan 23, 2025	Form, MME	Unverified
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules	Dec 24, 2024	MME, Sensitivity	Code Available

No leaderboard results yet.