| Towards Human-Level Understanding of Complex Process Engineering Schematics: A Pedagogical, Introspective Multi-Agent Framework for Open-Domain Question Answering | Aug 24, 2024 | knowledge editing, Open-Domain Question Answering | Unverified | 0 |
| Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption | Aug 23, 2024 | Instruction Following, Knowledge Distillation | Unverified | 0 |
| MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model | Aug 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs | Aug 21, 2024 | Contrastive Learning, Language Modeling | Unverified | 0 |
| CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering | Aug 21, 2024 | Continual Learning, Question Answering | Code Available | 0 |
| Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework | Aug 21, 2024 | geo-localization, Language Modeling | Unverified | 0 |
| TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition | Aug 19, 2024 | GPU, Multi-Task Learning | Code Available | 0 |
| FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection | Aug 17, 2024 | Federated Learning, Medical Visual Question Answering | Code Available | 0 |
| Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm | Aug 16, 2024 | Decision Making, Medical Visual Question Answering | Code Available | 0 |
| Beyond the Hype: A dispassionate look at vision-language models in medical scenario | Aug 16, 2024 | Question Answering, Spatial Reasoning | Unverified | 0 |