| Mitigating Object Hallucinations via Sentence-Level Early Intervention | Jul 16, 2025 | Hallucination, MM-Vet | Code Available | 1 |
| MR. Judge: Multimodal Reasoner as a Judge | May 19, 2025 | MM-Vet, Multiple-choice | Unverified | 0 |
| EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models | Mar 19, 2025 | MM-Vet, Multimodal Reasoning | Unverified | 0 |
| EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models | Jan 1, 2025 | MM-Vet, Multimodal Reasoning | Unverified | 0 |
| Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition | Dec 12, 2024 | EgoSchema | Code Available | 3 |
| Attention Prompting on Image for Large Vision-Language Models | Sep 25, 2024 | MM-Vet, Visual Prompting | Code Available | 2 |
| CogVLM2: Visual Language Models for Image and Video Understanding | Aug 29, 2024 | MM-Vet, MVBench | Code Available | 9 |
| MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Aug 1, 2024 | Math, MM-Vet | Code Available | 3 |
| Self-Supervised Visual Preference Alignment | Apr 16, 2024 | 8k, MM-Vet | Code Available | 2 |
| OmniFusion Technical Report | Apr 9, 2024 | MM-Vet, TextVQA | Code Available | 0 |