| Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization | Nov 15, 2024 | Hallucination, Hallucination Evaluation | —Unverified | 0 | 0 |
| MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning | Sep 9, 2024 | Federated Learning, Image Captioning | —Unverified | 0 | 0 |
| MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation | Mar 23, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval | May 26, 2025 | Image Retrieval, Large Language Model | —Unverified | 0 | 0 |
| MLLMReID: Multimodal Large Language Model-based Person Re-identification | Jan 24, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| MMMModal -- Multi-Images Multi-Audio Multi-turn Multi-Modal | Feb 17, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation | Feb 17, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| MobileFlow: A Multimodal LLM For Mobile GUI Agent | Jul 5, 2024 | Action Analysis, Language Modelling | —Unverified | 0 | 0 |
| MoChat: Joints-Grouped Spatio-Temporal Grounding LLM for Multi-Turn Motion Comprehension and Description | Oct 15, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills | May 9, 2025 | Image Retouching, Large Language Model | —Unverified | 0 | 0 |