MME

MME is a comprehensive evaluation benchmark for multimodal large language models. It measures both perception and cognition abilities across 14 subtasks: existence, count, position, color, poster, celebrity, scene, landmark, artwork, OCR, commonsense reasoning, numerical calculation, text translation, and code reasoning.
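As a rough illustration, the 14 subtasks can be grouped by the ability they probe. The description above does not say which subtask belongs to which ability; the split below (ten perception subtasks, four cognition subtasks) is an assumption based on how the MME paper organizes them, and the names `MME_SUBTASKS` and `category_of` are hypothetical helpers, not part of any official toolkit.

```python
# Assumed grouping of the 14 MME subtasks (split taken from the MME paper,
# not from the description above). Names here are illustrative only.
MME_SUBTASKS = {
    "perception": [
        "existence", "count", "position", "color", "poster",
        "celebrity", "scene", "landmark", "artwork", "OCR",
    ],
    "cognition": [
        "commonsense reasoning", "numerical calculation",
        "text translation", "code reasoning",
    ],
}


def category_of(subtask: str) -> str:
    """Return the ability ('perception' or 'cognition') a subtask falls under."""
    for category, names in MME_SUBTASKS.items():
        if subtask in names:
            return category
    raise KeyError(f"unknown MME subtask: {subtask}")


# Sanity check: the grouping covers exactly the 14 subtasks listed above.
total_subtasks = sum(len(names) for names in MME_SUBTASKS.values())
```

A lookup such as `category_of("OCR")` can then be used to aggregate per-subtask results into one figure per ability.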

Papers

Title	Date	Tasks	Status
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding	Mar 17, 2025	Attribute, MME	Unverified
Re-Imagining Multimodal Instruction Tuning: A Representation View	Mar 2, 2025	Instruction Following, MME	Code Available
Ultra-High-Frequency Harmony: mmWave Radar and Event Camera Orchestrate Accurate Drone Landing	Feb 20, 2025	MME, Sensor Fusion	Unverified
AIDE: Agentically Improve Visual Language Model with Domain Experts	Feb 13, 2025	Knowledge Distillation, Language Modeling	Unverified
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency	Feb 13, 2025	Benchmarking, Math	Unverified
Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment	Feb 7, 2025	Diversity, Human-Object Interaction Detection	Unverified
Mitigating Hallucinations in Large Vision-Language Models with Internal Fact-based Contrastive Decoding	Feb 3, 2025	Attribute, MME	Unverified
MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark	Jan 28, 2025	MME, Model Optimization	Unverified
Temporal Preference Optimization for Long-Form Video Understanding	Jan 23, 2025	Form, MME	Unverified
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules	Dec 24, 2024	MME, Sensitivity	Code Available
GFormer: Accelerating Large Language Models with Optimized Transformers on Gaudi Processors	Dec 19, 2024	MME	Unverified
Apollo: An Exploration of Video Understanding in Large Multimodal Models	Dec 13, 2024	MME, Video MME	Unverified
EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation	Dec 6, 2024	MME, Question Answering	Unverified
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads	Nov 28, 2024	GPU, Language Modeling	Unverified
SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context	Nov 25, 2024	Large Language Model, MME	Unverified
Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy	Nov 23, 2024	Instruction Following, MME	Unverified
The economic value of empowering older patients transitioning from hospital to home: Evidence from the 'Your Care Needs You' intervention	Nov 7, 2024	MME, Sensitivity	Unverified
MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning	Nov 5, 2024	MME, Question Answering	Unverified
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification	Oct 11, 2024	MME, Quantization	Unverified
Temporal Reasoning Transfer from Text to Video	Oct 8, 2024	Diagnostic, MME	Unverified
DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination	Oct 6, 2024	Attribute, Decoder	Unverified
TUBench: Benchmarking Large Vision-Language Models on Trustworthiness with Unanswerable Questions	Oct 5, 2024	Benchmarking, Hallucination	Code Available
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?	Aug 23, 2024	MME	Unverified
Decoding Multilingual Moral Preferences: Unveiling LLM's Biases Through the Moral Machine Experiment	Jul 21, 2024	MME	Code Available
DrVideo: Document Retrieval Based Long Video Understanding	Jun 18, 2024	document understanding, EgoSchema	Unverified
RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language Models	May 28, 2024	Hallucination, MME	Unverified
Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models	May 28, 2024	MME, Object	Unverified
Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models	Apr 6, 2024	MME, Object	Code Available
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise	Dec 19, 2023	MME, Visual Reasoning	Unverified
Silkie: Preference Distillation for Large Visual Language Models	Dec 17, 2023	Hallucination, MME	Unverified
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions	Nov 21, 2023	Descriptive, MME	Code Available
The Use of Symmetry for Models with Variable-size Variables	Nov 15, 2023	MME	Unverified
Enhancing the Spatial Awareness Capability of Multi-Modal Large Language Model	Oct 31, 2023	Autonomous Driving, Language Modeling	Unverified
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors	Sep 29, 2023	Benchmarking, Computational Efficiency	Unverified
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition	Sep 26, 2023	Articles, Image Comprehension	Code Available
Domain Adaptation via Minimax Entropy for Real/Bogus Classification of Astronomical Alerts	Aug 15, 2023	Astronomy, Domain Adaptation	Unverified
Multi-Modal Evaluation Approach for Medical Image Segmentation	Feb 8, 2023	Image Segmentation, Medical Image Segmentation	Unverified
MAAL: Multimodality-Aware Autoencoder-Based Affordance Learning for 3D Articulated Objects	Jan 1, 2023	MME, Object	Code Available
MM-GNN: Mix-Moment Graph Neural Network towards Modeling Neighborhood Feature Distribution	Aug 15, 2022	Graph Neural Network, Graph Representation Learning	Code Available
MME-CRS: Multi-Metric Evaluation Based on Correlation Re-Scaling for Evaluating Open-Domain Dialogue	Jun 19, 2022	Dialogue Evaluation, MME	Unverified
Machine Learning Methods for Inferring the Number of UAV Emitters via Massive MIMO Receive Array	Mar 2, 2022	Classification, MME	Unverified
Online Meta-Learning for Multi-Source and Semi-Supervised Domain Adaptation	Apr 9, 2020	Domain Adaptation, Meta-Learning	Unverified
Learning Multilingual Meta-Embeddings for Code-Switching Named Entity Recognition	Aug 1, 2019	Language Identification, MME	Unverified
Deep Learning for Hybrid 5G Services in Mobile Edge Computing Systems: Learn from a Digital Twin	Jun 30, 2019	Edge-computing, Management	Unverified
Scalable K-Medoids via True Error Bound and Familywise Bandits	May 27, 2019	Clustering, MME	Unverified