| FullAnno: A Data Engine for Enhancing Image Comprehension of MLLMs | Sep 20, 2024 | Image Captioning, Image Comprehension | Unverified | 0 |
| IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web | Sep 14, 2024 | Image Comprehension | Unverified | 0 |
| MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models | Aug 5, 2024 | Image Comprehension, Multiple-choice | Code Available | 2 |
| Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation | Aug 1, 2024 | Hallucination, Image Comprehension | Unverified | 0 |
| Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs | Jul 31, 2024 | Hallucination, Image Comprehension | Code Available | 1 |
| InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output | Jul 3, 2024 | Articles, Image Comprehension | Code Available | 0 |
| Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP | Jun 30, 2024 | Hallucination, Image Comprehension | Unverified | 0 |
| VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning | Jun 20, 2024 | Image Comprehension, Question Answering | Code Available | 0 |
| Multiplane Prior Guided Few-Shot Aerial Scene Rendering | Jun 7, 2024 | Image Comprehension, NeRF | Unverified | 0 |
| Enhancing Large Vision Language Models with Self-Training on Image Comprehension | May 30, 2024 | Image Comprehension, Visual Question Answering | Code Available | 2 |