MME

MME is a comprehensive evaluation benchmark for multimodal large language models. It measures both perception and cognition abilities on a total of 14 subtasks, including existence, count, position, color, poster, celebrity, scene, landmark, artwork, OCR, commonsense reasoning, numerical calculation, text translation, and code reasoning.
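The page above does not spell out how subtask scores are computed. As a hedged illustration only: MME-style subtasks are commonly scored with two metrics, plain accuracy over yes/no questions and a stricter "accuracy+" that credits an image only when both of its paired questions are answered correctly. The function name and the `(image_id, question_idx) -> "yes"/"no"` data layout below are illustrative assumptions, not the benchmark's official tooling:

```python
# Hypothetical sketch of MME-style scoring (assumed data layout,
# not the benchmark's official evaluation code).
# Each image carries two yes/no questions; `preds` and `golds` map
# (image_id, question_idx) -> "yes" / "no".

def mme_subtask_score(preds, golds):
    """Return (acc, acc+) in percent for one subtask."""
    correct = {k: preds[k] == v for k, v in golds.items()}
    acc = 100.0 * sum(correct.values()) / len(correct)
    images = {img for img, _ in golds}
    # acc+ credits an image only if BOTH of its questions are correct
    acc_plus = 100.0 * sum(
        correct[(img, 0)] and correct[(img, 1)] for img in images
    ) / len(images)
    return acc, acc_plus
```

Under this assumed scheme, a model that answers three of four questions correctly, spread over two images, scores 75.0 acc but only 50.0 acc+.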

Papers

Showing 91–95 of 95 papers

Title | Status | Hype
Scalable K-Medoids via True Error Bound and Familywise Bandits | | 0
Silkie: Preference Distillation for Large Visual Language Models | | 0
Temporal Preference Optimization for Long-Form Video Understanding | | 0
Temporal Reasoning Transfer from Text to Video | | 0
The Use of Symmetry for Models with Variable-size Variables | | 0
Page 10 of 10

No leaderboard results yet.