| WSI-LLaVA: A Multimodal Large Language Model for Whole Slide Image | Dec 3, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models | Dec 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model | Dec 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Realistic Corner Case Generation for Autonomous Vehicles with Multimodal Large Language Model | Nov 29, 2024 | Autonomous Vehicles, Language Modeling | Unverified | 0 |
| OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection | Nov 26, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Multimodal large language model for wheat breeding: a new exploration of smart breeding | Nov 20, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model | Nov 19, 2024 | Decision Making, Language Modeling | Unverified | 0 |
| Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model | Nov 19, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| CUE-M: Contextual Understanding and Enhanced Search with Multimodal Large Language Model | Nov 19, 2024 | Information Retrieval, Language Modeling | Unverified | 0 |
| Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning | Nov 18, 2024 | Attribute, Compositional Zero-Shot Learning | Code Available | 1 |