| LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations | Dec 9, 2024 | Language Modeling | Code Available | 1 | 5 |
| MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Aug 30, 2024 | Image Captioning, Language Modeling | Code Available | 1 | 5 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Apr 1, 2024 | Decision Making, Language Modeling | Code Available | 1 | 5 |
| Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs | Apr 10, 2025 | Multimodal Large Language Model, Time Series | Code Available | 1 | 5 |
| Hallucination Augmented Contrastive Learning for Multimodal Large Language Model | Dec 12, 2023 | Contrastive Learning, Hallucination | Code Available | 1 | 5 |
| Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation | Aug 19, 2024 | Large Language Model, Multimodal Large Language Model | Code Available | 1 | 5 |
| EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery | Jan 20, 2025 | Language Modeling | Code Available | 1 | 5 |
| Hespi: A pipeline for automatically detecting information from hebarium specimen sheets | Oct 11, 2024 | Handwritten Text Recognition, HTR | Code Available | 1 | 5 |
| Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning | Nov 18, 2024 | Attribute, Compositional Zero-Shot Learning | Code Available | 1 | 5 |
| Unifying Segment Anything in Microscopy with Multimodal Large Language Model | May 16, 2025 | Language Modeling | Code Available | 1 | 5 |