| MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning | Sep 9, 2024 | Federated Learning, Image Captioning | Unverified | 0 |
| TextToucher: Fine-Grained Text-to-Touch Generation | Sep 9, 2024 | Language Modelling, Large Language Model | Code Available | 1 |
| A Medical Multimodal Large Language Model for Pediatric Pneumonia | Sep 4, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| DPDEdit: Detail-Preserved Diffusion Models for Multimodal Fashion Image Editing | Sep 2, 2024 | Image Generation, Language Modelling | Unverified | 0 |
| Balancing Performance and Efficiency: A Multimodal Large Language Model Pruning Method based Image Text Interaction | Sep 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Multimodal Multi-turn Conversation Stance Detection: A Challenge Dataset and Effective Model | Sep 1, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| OrthoDoc: Multimodal Large Language Model for Assisting Diagnosis in Computed Tomography | Aug 30, 2024 | Computed Tomography (CT), Diagnostic | Unverified | 0 |
| MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Aug 30, 2024 | Image Captioning, Language Modeling | Code Available | 1 |
| AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding | Aug 30, 2024 | Language Modelling, Large Language Model | Code Available | 0 |
| MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model | Aug 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |