SOTAVerified

MM-Vet

Papers

Showing 1–19 of 19 papers

Title | Status | Hype
Mitigating Object Hallucinations via Sentence-Level Early Intervention | Code | 1
MR. Judge: Multimodal Reasoner as a Judge |  | 0
EfficientLLaVA:Generalizable Auto-Pruning for Large Vision-language Models |  | 0
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models |  | 0
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition | Code | 3
Attention Prompting on Image for Large Vision-Language Models | Code | 2
CogVLM2: Visual Language Models for Image and Video Understanding | Code | 9
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Code | 3
Self-Supervised Visual Preference Alignment | Code | 2
OmniFusion Technical Report | Code | 0
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | Code | 3
Multi-modal Preference Alignment Remedies Degradation of Visual Instruction Tuning on Language Models | Code | 1
DIEM: Decomposition-Integration Enhancing Multimodal Insights |  | 0
CogAgent: A Visual Language Model for GUI Agents | Code | 5
Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels? | Code | 1
Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision | Code | 1
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning | Code | 2
Enhancing the Spatial Awareness Capability of Multi-Modal Large Language Model |  | 0
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Code | 2

No leaderboard results yet.