| Audio-Visual LLM for Video Understanding | Dec 11, 2023 | AudioCaps, Language Modeling | Unverified | 0 | 0 |
| Automated radiotherapy treatment planning guided by GPT-4Vision | Jun 21, 2024 | In-Context Learning, Language Modelling | Unverified | 0 | 0 |
| Balancing Performance and Efficiency: A Multimodal Large Language Model Pruning Method based Image Text Interaction | Sep 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Beyond Retrieval: Joint Supervision and Multimodal Document Ranking for Textbook Question Answering | May 17, 2025 | Document Ranking, Large Language Model | Unverified | 0 | 0 |
| Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform | Jan 1, 2025 | Code Generation, Image Generation | Unverified | 0 | 0 |
| BlueLM-2.5-3B Technical Report | Jul 8, 2025 | Large Language Model, Multimodal Large Language Model | Unverified | 0 | 0 |
| CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches | Sep 26, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring | May 20, 2025 | Automated Essay Scoring, Diversity | Unverified | 0 | 0 |
| Can Multimodal Large Language Model Think Analogically? | Nov 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models | Nov 11, 2024 | 2D Pose Estimation, Category-Agnostic Pose Estimation | Unverified | 0 | 0 |