| CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation | Sep 24, 2024 | Contrastive Learning, Language Modeling | Unverified | 0 |
| Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference | Sep 18, 2024 | Image Captioning, Large Language Model | Unverified | 0 |
| Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles | Sep 10, 2024 | Autonomous Vehicles, Language Modeling | Unverified | 0 |
| MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding | Sep 10, 2024 | Benchmarking, Language Modeling | Code Available | 0 |
| MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning | Sep 9, 2024 | Federated Learning, Image Captioning | Unverified | 0 |
| A Medical Multimodal Large Language Model for Pediatric Pneumonia | Sep 4, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| DPDEdit: Detail-Preserved Diffusion Models for Multimodal Fashion Image Editing | Sep 2, 2024 | Image Generation, Language Modelling | Unverified | 0 |
| Balancing Performance and Efficiency: A Multimodal Large Language Model Pruning Method based Image Text Interaction | Sep 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Multimodal Multi-turn Conversation Stance Detection: A Challenge Dataset and Effective Model | Sep 1, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| OrthoDoc: Multimodal Large Language Model for Assisting Diagnosis in Computed Tomography | Aug 30, 2024 | Computed Tomography (CT), Diagnostic | Unverified | 0 |