| BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation | May 19, 2025 | Binary ClassificationDeepFake Detection | CodeCode Available | 1 | 5 |
| SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Apr 17, 2025 | Image GenerationLarge Language Model | CodeCode Available | 1 | 5 |
| MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model | Jun 17, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| A Refer-and-Ground Multimodal Large Language Model for Biomedicine | Jun 26, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Classification and Detection | Dec 20, 2024 | Cancer ClassificationChatbot | CodeCode Available | 1 | 5 |
| Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model | May 30, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences | Jan 19, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation | Oct 17, 2024 | Decision MakingLanguage Modeling | CodeCode Available | 1 | 5 |
| 3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding | Jan 14, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis | Jun 23, 2025 | DiagnosticLarge Language Model | CodeCode Available | 1 | 5 |
| ProteinGPT: Multimodal LLM for Protein Property Prediction and Structure Understanding | Aug 21, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution | May 27, 2025 | 8kAvg | CodeCode Available | 1 | 5 |
| From Text to Pixel: Advancing Long-Context Understanding in MLLMs | May 23, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| LMEye: An Interactive Perception Network for Large Language Models | May 5, 2023 | Language ModellingLarge Language Model | CodeCode Available | 1 | 5 |
| LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors | Jun 20, 2024 | 16kInstruction Following | CodeCode Available | 1 | 5 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Apr 1, 2024 | Decision MakingLanguage Modeling | CodeCode Available | 1 | 5 |
| LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations | Dec 9, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| FinVis-GPT: A Multimodal Large Language Model for Financial Chart Analysis | Jul 31, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection | Apr 16, 2025 | Anomaly DetectionLarge Language Model | CodeCode Available | 1 | 5 |
| FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant | Aug 19, 2024 | DescriptiveFace Swapping | CodeCode Available | 1 | 5 |
| ChemMLLM: Chemical Multimodal Large Language Model | May 22, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| LION : Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Nov 20, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Meaning Typed Prompting: A Technique for Efficient, Reliable Structured Output Generation | Oct 22, 2024 | Large Language ModelMultimodal Large Language Model | CodeCode Available | 1 | 5 |
| Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning | Nov 18, 2024 | AttributeCompositional Zero-Shot Learning | CodeCode Available | 1 | 5 |
| Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space | Mar 14, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |